(C) PLOS One [1]. This unaltered content originally appeared in journals.plosone.org.

Licensed under Creative Commons Attribution (CC BY) license.
url:https://journals.plos.org/plosone/s/licenses-and-copyright

------------

Systematic review of predictive models of microbial water quality at freshwater recreational beaches

['Cole Heasley', 'School Of Occupational', 'Public Health', 'Ryerson University', 'Toronto', 'Ontario', 'J. Johanna Sanchez', 'Jordan Tustin', 'Ian Young']

Date: 2021-09

Abstract Monitoring of fecal indicator bacteria at recreational waters is an important public health measure to minimize water-borne disease; however, traditional culture methods for quantifying bacteria can take 18–24 hours to obtain a result. To support real-time notifications of water quality, models using environmental variables have been created to predict indicator bacteria levels on the day of sampling. We conducted a systematic review of predictive models of fecal indicator bacteria at freshwater recreational sites in temperate climates to identify and describe the existing approaches, trends, and their performance to inform beach water management policies. We used a comprehensive search strategy, including five databases and grey literature, screened abstracts for relevance, and extracted data using structured forms. Data were descriptively summarized. A total of 53 relevant studies were identified. Most studies (n = 44, 83%) were conducted in the United States and evaluated water quality using E. coli as the fecal indicator bacteria (n = 46, 87%). Studies were primarily conducted in lakes (n = 40, 75%) compared to rivers (n = 13, 25%). The most commonly reported predictive model-building method was multiple linear regression (n = 37, 70%). Frequently used predictors in best-fitting models included rainfall (n = 39, 74%), turbidity (n = 31, 58%), wave height (n = 24, 45%), and wind speed and direction (n = 25, 47%, and n = 23, 43%, respectively). Of the 19 (36%) studies that measured accuracy, predictive models averaged 81.0% accuracy, and all but one were more accurate than traditional methods. Limitations identified by risk-of-bias assessment included not validating models (n = 21, 40%), limited reporting of whether modelling assumptions were met (n = 40, 75%), and lack of reporting on handling of missing data (n = 37, 70%). Additional research is warranted on the utility and accuracy of more advanced predictive modelling methods, such as Bayesian networks and artificial neural networks, which were investigated in comparatively fewer studies, and on the creation of risk-of-bias tools for non-medical predictive modelling.

Citation: Heasley C, Sanchez JJ, Tustin J, Young I (2021) Systematic review of predictive models of microbial water quality at freshwater recreational beaches. PLoS ONE 16(8): e0256785. https://doi.org/10.1371/journal.pone.0256785
Editor: Zaher Mundher Yaseen, Ton Duc Thang University, VIET NAM
Received: May 4, 2021; Accepted: August 14, 2021; Published: August 26, 2021
Copyright: © 2021 Heasley et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its S1–S9 Tables and S1 Protocol files.
Funding: IY and JT received funding for the project by the Public Health Agency of Canada: https://www.canada.ca/en/public-health.html. Grant number: 2021-HQ-000017. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
List of abbreviations: AUC, Area under the curve; AUROC, Area under the receiver operator curve; CHARMS, CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies; FIB, Fecal indicator bacteria; Fn, Fourier transform; LASSO, Least absolute shrinkage and selection operator; NSE, Nash-Sutcliffe efficiency; PBIAS, Percent bias; PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses; qPCR, quantitative polymerase chain reaction; RMSE, Root mean squared error; U.S., United States

Introduction Between 2000 and 2014, 140 outbreaks were reported in 35 states and a territory in the United States (U.S.) in untreated recreational water sources, leading to 4958 cases of waterborne disease, with 84% of the outbreaks associated with a lake, pond, or reservoir [1]. However, when accounting for non-outbreak linked cases, underreporting, and missing state data, the estimate for total water-borne illness from recreational surface waters in the U.S. is around 90 million cases annually, costing $2.2-$3.7 billion USD in healthcare services [2].

Routine monitoring for water-borne pathogens is infeasible at recreational beaches; therefore, fecal indicator bacteria (FIB) are sampled as a marker of potential pathogen concentrations and risk of infection to bathers. Many pathogens that can cause recreational water illness are spread via recreational water use, including enteric viruses (e.g. norovirus, adenovirus) and bacterial and protozoal pathogens (e.g. Campylobacter, Salmonella, Cryptosporidium) [3, 4]. E. coli is often used as the indicator for the presence of these pathogens in freshwater beaches [5]. Enterococcus is occasionally used as an indicator in addition to or in place of E. coli, most commonly in marine waters [6–8]. E. coli is often a preferred indicator in freshwater sources due to its strong association with the risk of gastrointestinal illness in bathers [5, 9].

Decisions on whether to close or post beaches as potentially unsafe for swimming due to water quality concerns are made by public health officials or other beach managers. Traditionally, these decisions are based on evaluating whether FIB levels in beach waters exceed health-action threshold values. This approach has been termed the “persistence model” of beach management, because it typically relies on culture-based laboratory assessments of FIB counts which require 18–24 hours to obtain a result, leading beach managers to make water quality decisions using the previous day’s measurements. More modern genetic techniques, such as qPCR, can achieve results in 3–4 hours, but are costly for beach management and laboratories to run daily [10].

Some beach managers have moved to forecasting FIB levels using predictive models. These models typically use environmental inputs such as temperature, precipitation, and turbidity to predict FIB levels at beaches on a given day, which can then be validated and assessed with the subsequent FIB lab results [11, 12]. A wide variety of predictive modelling methods have been used at recreational beaches, including multiple linear regression [13, 14], artificial neural networks [15], and Bayesian networks [16]. These models use local weather and environmental data, collected from various sources, that are associated with FIB concentrations in the water [6, 17].

Given the variety of predictive modelling approaches and applications published to date, there is a need to identify and describe existing approaches, trends, and their accuracy to inform beach water management policies. The purpose of this systematic review was to identify and summarize modelling methods used, where they have been applied, and their performance in correctly predicting beach water quality to support management decisions (e.g., posting a beach as unsuitable for swimming due to poor water quality). The review was conducted as part of a larger study to examine environmental influences on freshwater beach quality in Canada. Therefore, we have focused the scope on models developed for freshwater, recreational sites located in a temperate climate. To our knowledge, no systematic review exists on predictive models of fecal indicator bacteria at freshwater recreational sites in temperate climates.
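To make the “persistence model” described above concrete, the following is a minimal sketch, not taken from any reviewed study. It assumes one culture result per day and a single illustrative action threshold; the threshold value, function names, and sample counts are all hypothetical.

```python
from typing import List, Optional, Sequence

# Illustrative action threshold (CFU/100 mL). Assumption: real thresholds
# vary by jurisdiction and indicator organism.
THRESHOLD = 235.0


def persistence_decisions(daily_fib: Sequence[float],
                          threshold: float = THRESHOLD) -> List[Optional[bool]]:
    """Posting decision for each day, based on the PREVIOUS day's count.

    True means "post the beach as unsafe". The first day has no prior
    result, so its decision is None.
    """
    decisions: List[Optional[bool]] = [None]
    for previous_count in daily_fib[:-1]:
        decisions.append(previous_count > threshold)
    return decisions


def true_exceedances(daily_fib: Sequence[float],
                     threshold: float = THRESHOLD) -> List[bool]:
    """Whether each day's own count exceeded the threshold (known a day late)."""
    return [count > threshold for count in daily_fib]


# Made-up counts for five consecutive days.
counts = [50.0, 400.0, 120.0, 90.0, 600.0]
posted = persistence_decisions(counts)
actual = true_exceedances(counts)
for day, (p, a) in enumerate(zip(posted, actual), start=1):
    print(f"day {day}: posted={p} actual_exceedance={a}")
```

In this made-up series the beach is left open on both days that exceed the threshold and posted on a day that does not, which is the lag problem that same-day predictive models are meant to address.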

Methods Review question and eligibility criteria The protocol for this review was created in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Protocol 2015 checklist [18]. The remainder of this review was written using the PRISMA 2020 statement [19]; a PRISMA checklist is located in S1 Table. A review protocol was developed a priori following Cochrane Collaboration review guidelines (see S1 Protocol) [20]. However, the protocol was not registered with any databases. The research questions were: 1) what types of predictive models were created for predicting FIB concentrations based on environmental variables for freshwater beach management decisions? 2) which predictors were included in these models? 3) how accurate are the models in determining if recreational water quality exceeds guideline recommendations?

Our eligibility criteria followed the PECO approach: Population, Exposure, Comparison, and Outcome. Our population of interest included freshwater beaches in temperate climates that are used for recreational purposes. Therefore, we excluded models focusing on coastal and estuarine waters, and waters not used for recreation (e.g., drinking water sources). Our exposure of interest included environmental data that can be collected in real time to support beach water monitoring, such as weather parameters and water conditions. We included models that compared accuracy to their original dataset, to persistence models, and that used other validation methods (e.g., bootstrapping). Our outcome of interest was FIB levels. Models predicting algal blooms were excluded. We included publications reporting on the development and/or evaluation of predictive models, reported in journal articles, conference proceedings, theses and dissertations, and government reports. Reviews and commentary articles were excluded.

Search strategy We designed a comprehensive search strategy in collaboration with a research librarian. The following databases were used to search for relevant articles: Medline via OVID, SciTech Premium, Scopus, Web of Science, and ProQuest Dissertations and Theses Global. The search terms used in each database are provided in S2 Table. As an example of the search terms used, the search in Scopus was: (Escherichia coli OR enterococc* OR fecal indicator bacteria) AND (regression analysis OR predict* OR nowcast* OR forecast* OR model*) AND ("fresh water" OR recreational water OR beach* OR lake OR river) AND (weather OR monitor* OR rain* OR environmental). All articles published until the search date, December 15, 2020, were included with no publication date restrictions. A grey literature search was also conducted and involved searching nine targeted government websites from December 10–14, 2020. A list of websites searched is available in S3 Table. To ensure all relevant publications were captured, reference lists of relevant articles were hand-searched for additional potentially relevant articles.

Relevance screening Citations identified by the searches were stored in a Mendeley database (Elsevier, Amsterdam, Netherlands), deduplicated, and then uploaded into DistillerSR (Evidence Partners, Ottawa, Canada). All articles were screened independently by two reviewers (CH and JS) at two levels of screening: title and abstract screening (Level 1) and full article screening (Level 2). Level 1 screening involved the question: Is this reference potentially relevant to our review? (Yes/No). Level 2 involved three screening questions: Is this article about microbial water quality? (e.g., measuring E. coli, Enterococcus). Is this article about freshwater, recreational beaches in a temperate region? Does this article report on a predictive model for beach water quality using environmental data? (Yes/No for all). Beaches were defined as any site intended for primary water contact activities (e.g., swimming, wading, water sports) to capture all recreational water sites.
All screening forms were created prior to any screening and pre-tested by two reviewers screening 50 articles and discussing discrepancies. Pre-testing of Level 1 screening resulted in a kappa score of 0.76, after which the reviewers discussed their conflicts and agreed to proceed with independent reviewing after improving clarity on how to apply the eligibility criteria. Questions for Level 2 were discussed prior to screening and tested on five articles by both reviewers to ensure consistent interpretation and clarity of the questions.

Data characterization and extraction Articles passing the screening process were obtained as full texts and data were extracted using a pre-specified and pre-tested form. Data were extracted by CH into a form in DistillerSR, which can be found in S5 Table. The form included 20 questions that collected information such as location details of beaches, length of study, type of predictive model, variables explored in making the model, performance metrics of the model, and risk-of-bias. Data extraction results were independently validated by JS.

Risk-of-bias assessment and data analysis Risk of bias of each relevant article was assessed using the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS) [21]. We adapted the checklist from human health predictive models to environmental modelling. We considered “participants” to be beach days, and questions relating to human health were removed (e.g., details of treatments, blinding outcomes). Of the 21 CHARMS questions, 10 were included in the data extraction form. Questions included sources of data, blinding predictors from outcomes, number and handling of missing data, predictor selection method, predictor transformations, and model validation methods and performance measures. Due to a priori knowledge that many studies collect data from government sources, predictor measurement methods were not included. CHARMS does not score studies based on bias; therefore, we did not determine an overall risk-of-bias score or rating for each study. Data from DistillerSR were downloaded in Excel (Microsoft, Redmond, United States) for analysis, which consisted of descriptive summary tabulations. Data visualizations were also created in Excel. While we report on performance metrics, we do not draw conclusions on validity nor compare models to each other. Meta-analysis was not deemed appropriate for this review given that predictive modelling approaches and performance metrics varied widely across studies.
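The kappa score reported for the Level 1 screening pre-test is a chance-corrected measure of agreement between two reviewers. The sketch below shows how Cohen's kappa can be computed for two sets of Yes/No screening decisions. The review does not report the underlying agreement table, so the counts here are hypothetical and were chosen only so that the result lands near the reported value.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same, non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical pre-test of 50 abstracts (NOT the review's actual data):
# 20 both "yes", 24 both "no", 3 yes/no disagreements, 3 no/yes disagreements.
reviewer_1 = ["yes"] * 20 + ["no"] * 24 + ["yes"] * 3 + ["no"] * 3
reviewer_2 = ["yes"] * 20 + ["no"] * 24 + ["no"] * 3 + ["yes"] * 3
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

With these invented counts the reviewers agree on 88% of abstracts, but about half of that agreement would be expected by chance, which is why kappa is lower than raw agreement.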

Discussion This review compiles results of the literature reporting on predictive models of FIB at fresh, recreational waters using environmental predictors. It provides novel insight on key variables of interest, modelling techniques, and considerations of modelling for those looking to create predictive models at other waters. Our review is the first to provide a systematic approach to reviewing the literature in this area. It focuses exclusively on fresh, recreational waters, and further explores the role of various environmental predictors, which is novel to the literature of this type of modelling. de Brauwere et al. reviewed regression and hydrodynamic models predicting FIB in all surface waters in 2014, and provided an in-depth summary of important processes for hydrodynamic models [72]. We similarly found that most relevant studies in this area were conducted in the U.S., despite wider search parameters. Additionally, this review reports on the validation techniques and amount of data used during model building and validation of reviewed studies.

As the geology, pollution sources, and climate of beaches differ geographically, building beach-specific models is important for accuracy [13, 65, 72]. Even in the same region, different bodies of water behave differently. For example, Hatfield [43] created an effective model for FIB in Lake Erie, but a similar model for a nearby artificial lake performed poorly. However, geographically similar beaches within a specific region may be amenable to similar models, which would reduce the resources required to build them [54]. Different beach models may require different modelling approaches and environmental variables, so it is important to explore these elements in new contexts before generalizing models to other beaches.

Predictive modelling has the ability to overcome several issues in recreational water monitoring. Firstly, it addresses the reliance on persistence models, where the accuracy of posting beaches as suitable or unsuitable for swimming and other water activities depends on FIB concentrations remaining consistent across the 24-hour lab-response time. It also does not require the large resource and capacity investment of upgrading to qPCR for rapid testing, as most beach managers collect FIB data and government weather and water stations are already set up at or near many recreational waterways, resulting in less investment to collect data to develop and implement models. However, these techniques can still be integrated. The city of Chicago has adopted a hybrid model for determining beach water quality [73]. The five beaches (out of 20) that produce 56% of poor water quality days are tested with qPCR every day. The remaining beaches are grouped into clusters; one beach per cluster is tested with qPCR and the rest are predicted with models. This hybrid approach identifies poor water quality days three times more accurately than the previous predictive models alone. Rapid testing provides accuracy while predictive models reduce costs, so the combination may offset the shortcomings of each method.

The efficacy of predictive models depends on the quality and accuracy of information put into them. Thirty-seven studies collected at least some of their environmental data from governmental sources, which are likely to be reliable in quality. While they might reflect slightly different weather conditions from beaches, due to being located elsewhere, such small changes are not likely to be a limitation in modelling.

Rainfall is an important environmental factor as it washes microbial contamination from urban surfaces and agricultural sources into larger bodies of water, and increases sewer and river discharge [35, 47]. As a result, elevated E. coli levels are often associated with extreme rainfall events [69].
A wide range of timeframes for antecedent rainfall were explored, from a few hours prior to sampling to several days before. For easier interpretation, this review categorized these times as <24 hours, 24 hours, 48 hours, and 72 or more hours. Of the studies that explored times across this range, the most commonly used time in final models was 72+ hours [48, 61, 64]. Some studies also evaluated weighted rainfall variables that emphasized more recent rainfall across a 3-day period. Regardless, when explored in a study, every rainfall variable was included in at least one final model more than 50% of the time, indicating the value of examining and comparing a variety of ways of expressing rainfall.

After rainfall, turbidity was the most frequently included variable in at least one final model. Its importance relates to the association of bacteria with sediments and particulate suspended solids [74]. As UV radiation can kill E. coli, higher turbidity can protect the bacteria by absorbing or scattering solar radiation [75]. The importance of sand-associated FIB was shown at a beach in Lake Huron, where erosion of sand was the main source of E. coli from the foreshore to surface water, mediated by wave height [76]. Larger waves may also be responsible for washing bird fecal matter from the beach into the water [54]. Wind direction and speed are important explanatory variables as they are associated with driving FIB from sediments or point sources towards the beach [77, 78]. Winds, waves, and turbidity are often correlated parameters, as winds and waves churn sediments, which increases turbidity [43, 78].

While explored less often, temporal variables were consistently included in final models: 100% of the time for day of year, day of week, and time of sampling, and 75% of the time for sub-season/month. FIB may accumulate in water bodies over the summer and, on average, increase over time during the bathing season [34].
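The rainfall windows, weighted 3-day rainfall, and temporal variables discussed above can be derived from a daily rainfall record and the sampling date. The following is a minimal sketch under stated assumptions: rainfall is available only as daily totals (so the 24-hour window is approximated by the previous day), the 3-2-1 weights are hypothetical rather than taken from any reviewed study, and all function names are illustrative.

```python
from datetime import date
from typing import Dict, Sequence


def antecedent_rain(daily_rain_mm: Sequence[float], day_index: int, n_days: int) -> float:
    """Total rainfall over the n_days BEFORE day_index (sampling day excluded)."""
    start = max(0, day_index - n_days)
    return sum(daily_rain_mm[start:day_index])


def weighted_rain_3day(daily_rain_mm: Sequence[float], day_index: int,
                       weights: Sequence[float] = (3.0, 2.0, 1.0)) -> float:
    """Weighted mean of the previous three days; weights[0] is the most recent day.

    The default weights are an assumption chosen to emphasize recent rain.
    """
    total = 0.0
    for lag, weight in enumerate(weights, start=1):
        i = day_index - lag
        if i >= 0:
            total += weight * daily_rain_mm[i]
    return total / sum(weights)


def temporal_features(sample_date: date) -> Dict[str, int]:
    """Day of year, day of week (Monday = 0), and month for a sampling date."""
    return {
        "day_of_year": sample_date.timetuple().tm_yday,
        "day_of_week": sample_date.weekday(),
        "month": sample_date.month,
    }


# Made-up daily rainfall (mm); the sample is taken on the last day (index 4).
rain = [0.0, 12.0, 0.0, 5.0, 0.0]
features = {
    "rain_24h": antecedent_rain(rain, 4, 1),
    "rain_48h": antecedent_rain(rain, 4, 2),
    "rain_72h": antecedent_rain(rain, 4, 3),
    "rain_weighted_3d": weighted_rain_3day(rain, 4),
}
features.update(temporal_features(date(2020, 7, 15)))
print(features)
```

Computing several windows side by side mirrors the practice, noted above, of comparing multiple ways of expressing rainfall before choosing predictors for a final model.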
Depending on site characteristics, FIB concentrations may increase as the day progresses [66] or decrease [65] due to solar inactivation. This result is also dependent on enumeration method, as Telech et al. found that time of day was an important predictor of Enterococcus cell counts, but not qPCR results [65]. Pollution sources, such as waterfowl, other bathers, and discharge into the body of water were similarly explored less often but were nonetheless important considerations.

Numerous modelling techniques and predictor selection methods were utilized in the reviewed studies. Multiple linear regression methods were the most popular and were shown to produce accurate predictions. However, other methods may produce more accurate predictions. Comparing models built at different locations with different variables and rates of FIB exceedances would not yield accurate comparisons; however, four studies included in this review compared modelling techniques using the same data and were thus able to compare techniques. The best performing models in these four studies were artificial neural networks [50], Bayesian networks [23], a gradient boosting machine (a tree-based ensemble method) [30], and a model stacking algorithm that combines two or more models into one prediction [67]. All outperformed regression methods such as ordinary, partial, and sparse partial least squares methods for multiple linear regression, and were more consistent across years and locations. Further research is warranted on these approaches and their utility for implementation in routine beach water quality monitoring.

Predictor selection was also varied, but no comparisons of methods were conducted. However, seven studies (13%) used the Virtual Beach tool, created by the U.S. Environmental Protection Agency, which is intended to aid researchers and beach managers in creating predictive models [79].
The tool allows users to upload data, explore relationships among variables, transform variables, use different regression-based modelling techniques (including a recent addition of a gradient boosting machine), and evaluate models based on several model fit characteristics. The tool is free and designed to be user-friendly to support implementation of modelling at more beaches. Although a gradient boosting machine was added, the tool still relies on regression techniques. Models created by the tool outperformed persistence models in some studies [27] but not others [37].

A few key limitations in the literature were identified in the risk-of-bias assessment. For instance, 22 studies validated their models by refitting the model to the original dataset that built the model without internal validation (bootstrapping or cross-validation), which increases the risk of overfitting [21]. Furthermore, only 13 studies (25%) specified whether or not modelling assumptions were met, which could impact model accuracy and reliability. Lastly, 37 studies (70%) did not provide any information about how missing data were dealt with, which raises additional concerns about reliability of the models. The risk of bias checklist, CHARMS, required several modifications for this review compared to its intended context of human health outcomes. A checklist intended for systematic reviews of non-health related predictive models would benefit future reviews and improve reporting of risk of bias information when creating predictive models in this research area.

The goal of predictive models is to produce more accurate results than persistence models, which use the previous day’s FIB measurement for current day decisions. Most models included in this review outperformed persistence models to varying degrees, in terms of sensitivity, specificity, and/or accuracy, supporting the use of predictive models in management decisions [27, 35, 64, 70, 80].
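Sensitivity, specificity, and accuracy, the measures used above to compare predictive and persistence models, can be computed from paired predicted and observed exceedances. The sketch below uses invented exceedance series for eight beach days; it illustrates the definitions only and does not reproduce results from any reviewed study.

```python
from typing import Dict, Sequence


def classification_metrics(predicted: Sequence[bool],
                           observed: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy for exceedance predictions."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p and o)          # correctly posted
    tn = sum(1 for p, o in pairs if not p and not o)  # correctly left open
    fp = sum(1 for p, o in pairs if p and not o)      # posted unnecessarily
    fn = sum(1 for p, o in pairs if not p and o)      # missed exceedance
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "accuracy": (tp + tn) / len(pairs),
    }


# Invented data: True means the FIB threshold was (or was predicted to be) exceeded.
observed = [False, True, False, False, True, False, False, True]
model_pred = [False, True, False, True, True, False, False, False]
# Persistence prediction: yesterday's observed status (first day assumed False).
persistence_pred = [False] + observed[:-1]

model_metrics = classification_metrics(model_pred, observed)
persistence_metrics = classification_metrics(persistence_pred, observed)
print("model:      ", model_metrics)
print("persistence:", persistence_metrics)
```

Reporting all three measures matters because exceedances are usually rare: a rule that never posts a beach can score high accuracy and specificity while having zero sensitivity.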
However, even if models are used for management decisions, routine water sampling for FIB should still be conducted to ensure models remain valid, and are updated and refined as appropriate, across seasons. To ensure models are up to date, the U.S. Geological Survey suggests that beach managers update their predictive models before every new bathing season [27, 70], which may not always occur in practice [81].

Once an accurate model is created, its use by beach management or the public to make decisions regarding recreational activities requires a user-friendly interface. The U.S. Geological Survey Great Lakes NowCast provides real-time estimates of beach water quality along Lake Erie and Lake Ontario to the public [81]. The tool was built from the Ohio NowCast system, and several studies in this review contributed to its development [35, 36, 38]. The predictive models created for the Cuyahoga River were also added into the Ohio NowCast [27, 28]. The website allows users to examine current and past conditions, and also explains factors in the model. The Philly Rivercast [82] provides nowcasts for the Schuylkill River and its development was outlined by Maimone et al. [49]. These platforms are used by beach managers and the public, which allows authorities to make real-time water quality decisions easily, and the public to learn about beach postings prior to arrival and make decisions about whether or not to swim or engage in other recreational activities at the beach. Additionally, as seen with the Great Lakes NowCast, these platforms can be modified and scaled to include new beaches as appropriate.

There were several limitations to this study. Firstly, while grey literature was included, only selected government websites were searched. Therefore, we could have missed some relevant studies. However, our search verification strategy helped to mitigate this potential bias.
Lastly, our review was geographically limited to fresh, recreational waters in temperate regions, excluding models created for marine, tropical and subtropical waters. Predictive models in those settings may have different environmental predictors and performance.
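The Discussion contrasts refitting a model to its own training data with internal validation such as cross-validation. As a minimal sketch of that difference, the code below fits a multiple linear regression by ordinary least squares and compares the resubstitution error with a k-fold cross-validated error. Everything here is an assumption for illustration: the data are synthetic, the two predictors (rainfall and turbidity) and their coefficients are invented, and the helper names are hypothetical. It uses only the Python standard library.

```python
import random
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_mlr(xs: Sequence[Sequence[float]], ys: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(row) for row in xs]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], row: Sequence[float]) -> float:
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def rmse(pred: Sequence[float], obs: Sequence[float]) -> float:
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5


def kfold_rmse(xs, ys, k: int = 5, seed: int = 0) -> float:
    """RMSE of out-of-fold predictions: each point is predicted by a model that never saw it."""
    idx = list(range(len(ys)))
    random.Random(seed).shuffle(idx)
    preds, obs = [], []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        coefs = fit_mlr([xs[i] for i in train], [ys[i] for i in train])
        preds += [predict(coefs, xs[i]) for i in fold]
        obs += [ys[i] for i in fold]
    return rmse(preds, obs)


# Synthetic beach days: [48-h rainfall (mm), turbidity (NTU)] -> log10 E. coli.
rng = random.Random(1)
xs = [[rng.uniform(0, 30), rng.uniform(1, 50)] for _ in range(60)]
ys = [1.5 + 0.03 * rain + 0.01 * turb + rng.gauss(0, 0.2) for rain, turb in xs]

coefs = fit_mlr(xs, ys)
resub_rmse = rmse([predict(coefs, row) for row in xs], ys)  # refit to original data
cv_rmse = kfold_rmse(xs, ys, k=5)                           # internal validation
print(f"resubstitution RMSE = {resub_rmse:.3f}, 5-fold CV RMSE = {cv_rmse:.3f}")
```

For least squares, the cross-validated error cannot be lower than the resubstitution error, so reporting only the latter gives an optimistic picture of how a model would perform on new beach days.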

Conclusions This review is the first to systematically examine literature on predictive models for FIB levels in fresh, recreational waters. The review reports on 53 relevant articles extracted from five databases. We have highlighted commonly explored and frequently used environmental variables and modelling techniques that can inform future predictive modelling projects and options for beach managers. Rainfall, turbidity, wind, and wave height were most commonly incorporated into final models, and most models used linear regression. Evidence supports use of real-time models of FIB levels as an indicator of water quality rather than or in addition to using persistence models. At locations with consistent monitoring of FIB, predictive models can improve the effectiveness and response times of risk communication with beachgoers about recreational water quality risks, which can help to potentially reduce water-borne illness. A risk of bias checklist was adapted for this review and identified common limitations in the literature. Future research may benefit from a risk of bias checklist intended for non-medical predictive models. This review provides insight for researchers and beach managers interested in creating their own predictive models in terms of key variables, modelling approaches, and bias-reduction techniques to consider. More research should be conducted to evaluate the effectiveness and utility of more advanced predictive modelling approaches such as artificial neural networks, Bayesian approaches, and other machine learning methods.

Acknowledgments We would like to thank Cecile Farnum, a research librarian at Ryerson University, for assistance with the search strategy.


[1] Url: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0256785

(C) Plos One. "Accelerating the publication of peer-reviewed science."
Licensed under Creative Commons Attribution (CC BY 4.0)
URL: https://creativecommons.org/licenses/by/4.0/
