(C) PLOS One

This story was originally published by PLOS One and is unaltered.

Comparing and linking machine learning and semi-mechanistic models for the predictability of endemic measles dynamics [1]

Authors: Max S. Y. Lau (Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, United States of America); Alex Becker (Department of Ecology and Evolutionary Biology)

Date: 2022-09

Abstract Measles is one of the best-documented and most mechanistically studied non-linear infectious disease dynamical systems. However, systematic investigations into the comparative performance of traditional mechanistic models and machine learning approaches in forecasting the transmission dynamics of this pathogen are still rare. Here, we compare one of the most widely used semi-mechanistic models for measles (TSIR) with a commonly used machine learning approach (LASSO), examining performance and limits in predicting short- to long-term outbreak trajectories and seasonality for both regular and less regular measles outbreaks in England and Wales (E&W) and the United States. First, our results indicate that the proposed LASSO model can efficiently use data from multiple major cities and achieve short-to-medium term forecasting performance similar to that of semi-mechanistic models for E&W epidemics. Second, interestingly, the LASSO model also captures the annual to biennial bifurcation of measles epidemics in E&W caused by the susceptible response to the late 1940s baby boom. LASSO may also outperform TSIR for predicting less-regular dynamics such as those observed in major cities in the US between 1932–45. Although both approaches capture short-term forecasts, accuracy suffers for both methods as we attempt longer-term predictions in highly irregular, post-vaccination outbreaks in E&W. Finally, we illustrate that the LASSO model can both qualitatively and quantitatively reconstruct mechanistic assumptions, notably susceptible dynamics, in the TSIR model. Our results characterize the limits of predictability of infectious disease dynamics for strongly immunizing pathogens with both mechanistic and machine learning models, and identify connections between these two approaches.

Author summary Machine learning techniques in infectious disease modeling have grown in popularity in recent years. However, systematic investigations into the comparative performance of these approaches and traditional mechanistic models are still rare. In this paper, we compare one of the most widely used semi-mechanistic models for measles (TSIR) with a commonly used machine learning approach (LASSO), examining performance and limits in predicting short- to long-term outbreaks of measles, one of the best-documented and most mechanistically studied non-linear infectious disease dynamical systems. Our results show that, in general, the LASSO outperforms the TSIR for predicting less-regular dynamics, and that it can achieve performance similar to the TSIR in other scenarios. The LASSO also has the advantage of not requiring explicit demographic data for model training. Finally, we identify connections between these two approaches and show that the LASSO model can both qualitatively and quantitatively reconstruct mechanistic assumptions, notably susceptible dynamics, in the TSIR model.

Citation: Lau MSY, Becker A, Madden W, Waller LA, Metcalf CJE, Grenfell BT (2022) Comparing and linking machine learning and semi-mechanistic models for the predictability of endemic measles dynamics. PLoS Comput Biol 18(9): e1010251. https://doi.org/10.1371/journal.pcbi.1010251
Editor: T. Alex Perkins, University of Notre Dame, UNITED STATES
Received: May 26, 2022; Accepted: August 2, 2022; Published: September 8, 2022
Copyright: © 2022 Lau et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Data and code are available at https://github.com/msylau/measles_LASSO.
Funding: This work is supported by the Dean's Pilot and Innovation Awards provided by the Dean's office at Rollins School of Public Health at Emory University. B.G and C.J.M gratefully acknowledge financial support from the Schmidt DataX Fund at Princeton University made possible through a major gift from the Schmidt Futures Foundation. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.

Introduction Mechanistic and semi-mechanistic models have been foundational in developing an understanding of the spread of infectious diseases in human and wildlife populations. These models approximately depict how pathogen transmission is shaped by population dynamics (e.g., how transmission is reduced by herd immunity). Such modeling approaches are essential for understanding the natural history of pathogen transmission and for designing effective control strategies. While models such as the Susceptible-Infected-Recovered framework are mechanistically well understood, calibrating them against stochastic, and often partially unobserved, incidence or mortality data is a steep statistical challenge. The primary focus of mechanistic models has been understanding and characterizing the natural history of transmission. In contrast, implementations of statistical and machine learning techniques in infectious disease modeling have primarily focused on improving forecasting accuracy without the explicit aim of inferring transmission dynamics. Such approaches have grown in popularity in recent years [1–5], and statistical approaches to studying measles dynamics also have a long pedigree [6,7]. Patterns of pre- and post-vaccination measles incidence are among the best-documented and best-studied non-linear systems in ecology. A suite of analyses using deterministic and stochastic (semi-)mechanistic models has illuminated how the interplay between seasonal forcing and susceptible recruitment shapes dynamics in large urban populations [8], ranging from simple limit cycles to coexisting attractors [8,9], and even chaos with the domination of stochastic extinction in small highly vaccinated populations [10,11]. 
A focus of previous analysis has been detailed weekly spatio-temporal notifications of measles from England and Wales (E&W), interpreted with the TSIR model and other inferential approaches, notably particle filtering [12,13]. While partially mechanistic approaches for measles dynamics are being explored [14–16], a more comprehensive comparison between mechanistic and fully statistical approaches is still lacking. Such comparisons would yield insight into the choice of the most appropriate modeling technique for different patterns of data. Measles is an excellent test bed for these questions, given that we have both rich historical notification time series and successful applications of mechanistic and semi-mechanistic models. In this paper, we explore and compare the forecasting capability of these two contrasting approaches for both regularly periodic and relatively irregular recurrent measles epidemics in England and Wales between 1944–1994 and in the US between 1932–45. We consider both a semi-mechanistic TSIR model and a fully statistical model using a popular machine learning (ML) approach (Least Absolute Shrinkage and Selection Operator, the LASSO), which has been applied to other diseases such as dengue and tuberculosis [17,18]. Our results suggest that the proposed LASSO model, compared to the TSIR model, can efficiently use data from multiple major cities and achieve similar short-to-medium term forecasting performance for more regular measles outbreaks in E&W during the pre-vaccination era (1944–1964). Strikingly, even when trained solely on data with an annual cycle, forecasts in our LASSO framework capture the characteristic annual to biennial bifurcation in 1950 driven by a decline in birth rates. When important demographic information (such as the birth rate data) is not included, the LASSO model still performs reasonably well, likely because these dynamics are implicitly incorporated within the approach (see Models and Methods). 
LASSO may also outperform the TSIR for predicting less-regular dynamics such as those observed in major cities in the US between 1932–45. Although both approaches capture short-term forecasts, accuracy suffers for both methods as we attempt longer-term predictions in highly irregular, post-vaccination epidemics in E&W. Overall, our results show that fitting a LASSO model may both qualitatively and quantitatively rediscover major mechanistic assumptions in the TSIR model. These insights inform the limits of predictability of infectious disease dynamics for fully immunizing pathogens, and the connections between the two approaches.

Discussion Transmission of infectious diseases at the population level is characterized by inherent, and often complex, non-linear dynamics that are driven by intrinsic and extrinsic factors such as infectivity of pathogens, human behaviors and public health interventions (notably variable contact patterns and vaccination among host populations), and even environmental factors. Mechanistic and semi-mechanistic models provide biologically plausible and directly interpretable frameworks for modeling such complex dynamics. In contrast, machine learning approaches primarily focus on identifying patterns within the data to improve prediction and forecasting; they include no specific mechanistic framing, and often lack biological interpretability. While machine learning approaches have shown success in forecasting complex epidemiological systems (e.g., dengue [5]), comprehensive and long-term data for these pathogen-human interactions are often lacking, making detailed methodological comparisons challenging. Here, we leverage unique time-series data and a large body of work on semi-mechanistic modeling to develop a full comparison between these approaches using measles as a test case. Our results indicate that a LASSO-based machine learning model can efficiently leverage the detailed historical measles incidence data from multiple locations in E&W to achieve short- to medium-term forecasting accuracy that is comparable to that of one of the most commonly used mechanistic models for measles (the TSIR model). Interestingly, our results show that the LASSO model performs similarly even without the knowledge of births that is required by the TSIR model. 
This suggests that the correlation/dependence structure between births and incidence can be “absorbed” by a parsimonious LASSO model that only considers historical incidence to infer changes in temporal patterns without explicit knowledge of the cause of these changes (e.g., here, the impact of births on the underlying susceptible population size). As a result, the LASSO model appears to be able to capture the bifurcation in dynamics in 1950, one of the key properties of the measles outbreaks in E&W, without requiring the data driving the change in pattern. We do, however, find that the LASSO forecasts are comparable to those from TSIR only when all data from the major cities are used for model training (S3 Fig). Moreover, while neither the LASSO model nor the TSIR model works well for highly chaotic dynamics beyond short-term prediction, the LASSO approach may outperform TSIR in scenarios with a mixture of seasonal and mildly chaotic dynamics (as observed in historical outbreaks in the major cities of the US [11]). Finally, our results also show that the LASSO model can reconstruct/discover the mechanistic assumptions of the TSIR model (Fig 6). Our results are consistent with recent work showing that the TSIR model may be discovered by some partially mechanistic machine learning approaches that consider higher-order polynomial terms for transmission dynamics [14]. Compared to that work, ours focuses on out-of-sample prediction (as opposed to “discovery”). Also, while those approaches require knowledge of the susceptible population (via pre-processing procedures that leverage TSIR), we do not require a reconstructed susceptible population, creating a more explicit test of fully statistical approaches. The effect of susceptible depletion seems to be implicitly captured by the non-positive LASSO coefficient values associated with the more recent lags within the previous year (see Fig 6A and Models and Methods). 
This feature is likely to explain why the LASSO model may perform reasonably well despite lacking explicit knowledge of births/susceptibles. This analysis represents an initial step and there are several clear directions for future work: while we compared one of the most successful mechanistic models for measles (the TSIR) with a commonly used machine learning approach (the LASSO), other machine learning approaches (e.g., neural-network based models) may yield different results. In particular, recent theoretical work [22–24] has demonstrated excellent predictability for simulated deterministic chaotic systems using a network-based machine learning approach (“reservoir computing”). Such variants could also be extended to and tested on the stochastic measles outbreaks that we consider. Nevertheless, it is worth noting that the LASSO model is an additive model and relatively interpretable compared to many other machine learning approaches, and thus seems a sensible starting point here. In particular, LASSO’s additive nature allows us to analytically link it to the TSIR model (see Models and Methods). Finally, while our preliminary simulation studies (S5 Fig) further illustrate the predictive power of the LASSO model for measles epidemics, future work may include more extensive simulation studies, for example ones that explicitly leverage spatial information about the epidemics. These investigations may shed light on, for example, how machine learning approaches may best complement mechanistic models for modelling less populated places whose dynamics are known to be more stochastically driven.

Models and methods A mechanistic modelling approach: The TSIR model We model the local measles dynamics using the time-series Susceptible-Infected-Recovered (TSIR) framework. Balancing births against disease transmission, the TSIR equations are given as $I_{t+1} = \beta_{t+1} S_t I_t^{\alpha}$ (1) and $S_{t+1} = S_t + B_t - I_{t+1}$ (2), where $I_t$ and $S_t$ are the numbers of incident and susceptible individuals in a given biweek $t$, and $B_t$ is the number of births in that biweek. $\beta_t$ is a seasonally repeating contact rate with 26 values per year. The exponent $\alpha$ (typically slightly less than 1) captures heterogeneities in mixing that are not explicitly modelled by the seasonality, as well as the effects of discretizing the underlying continuous-time process [25,26]. The TSIR estimates in this manuscript were obtained using the recently developed tsiR package [27]. Specifically, in our analysis, $\alpha$ is fixed at 0.98 [11] and a Gaussian process regression is performed between cumulative cases and cumulative births. Parameter estimates were obtained for each location for each time period of interest. A more extensive description of the TSIR fitting process in terms of theory and implementation can be found in [25,27]. A machine learning approach: The LASSO model We consider modelling incidence $k$ steps ahead of time $t$ for place $i$ (i.e., $I_{i,t+k}$) as a linear combination of log-transformed historical local incidence and births. Specifically, we consider (3) where the final term involves the average of births over the previous $T_{lag}$ biweeks. We consider bi-weekly data, two-year forecasting windows $k = 0,1,…,52$, and $T_{lag} = 130$. A separate model is fitted for each forecast window $k$, using Least Absolute Shrinkage and Selection Operator (LASSO) regression [28]. LASSO is a machine learning technique that simultaneously estimates the regression coefficients and performs variable selection by shrinking some of the smallest estimated coefficients towards zero. 
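As an illustration of how an L1 penalty shrinks coefficients, the following minimal sketch fits a LASSO regression by cyclic coordinate descent with soft-thresholding. It is not the implementation used in this study: the function names, the toy data, and the 1/(2n) scaling of the squared-error term are assumptions made for the example.

```python
def soft_threshold(z, gamma):
    """Move z toward zero by gamma; return exactly 0.0 when |z| <= gamma."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=2000):
    """Minimise (1/(2n)) * sum_t (y_t - eta - sum_j theta_j * x_tj)^2
    + lam * sum_j |theta_j| by cyclic coordinate descent.

    X is a list of n rows with p covariates each (no constant column; every
    covariate needs at least one non-zero entry). The intercept eta is left
    unpenalised. The 1/(2n) scaling is an assumed convention.
    """
    n, p = len(X), len(X[0])
    theta = [0.0] * p
    eta = sum(y) / n
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # inner product of covariate j with the partial residual
            norm = 0.0  # sum of squares of covariate j
            for t in range(n):
                fit_without_j = eta + sum(
                    theta[m] * X[t][m] for m in range(p) if m != j
                )
                rho += X[t][j] * (y[t] - fit_without_j)
                norm += X[t][j] ** 2
            theta[j] = soft_threshold(rho / n, lam) / (norm / n)
        # Unpenalised intercept: mean of the residuals given the current theta.
        eta = sum(
            y[t] - sum(theta[m] * X[t][m] for m in range(p)) for t in range(n)
        ) / n
    return eta, theta


# Toy data (hypothetical): the response depends strongly on the first
# covariate and only weakly on the second.
X = [[1.0, 0.5], [2.0, -0.3], [3.0, 0.1], [4.0, -0.2], [5.0, 0.4]]
y = [1.0 + 2.0 * x0 + 0.2 * x1 for x0, x1 in X]

eta_ls, theta_ls = lasso_coordinate_descent(X, y, lam=0.0)  # no penalty
eta_l1, theta_l1 = lasso_coordinate_descent(X, y, lam=0.5)  # L1 penalty
print(theta_ls, theta_l1)
```

With no penalty the fit recovers both coefficients of the toy data; with the penalty switched on, the weak second coefficient is set to exactly zero and the first is pulled below its least-squares value, which is the selection-plus-shrinkage behaviour described in the text.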
LASSO performs variable selection because its penalty (also known as L1 regularization) allows a coefficient to be shrunk to exactly zero. Compared with traditional least-squares estimation (LSE), this shrinkage process significantly reduces the variance of model predictions and is key to improving predictive performance. We also consider a LASSO model without explicit inclusion of the births (the last term in Eq 3). In particular, LASSO estimates are the values of coefficients that minimize an objective function (4) where $y_{i,t+k}$ is the response variable, $p = 2 \times T_{lag}$ and $n$ is the number of observations. Note that for clarity we have used $\theta_k$ to denote the $k$th coefficient in $\theta$ (not including the intercept $\eta$) and $x_{t,k}$ to denote the covariate associated with it. The penalty term is the machinery that allows shrinkage of the coefficient estimates: the larger the value of $\lambda$, the greater the shrinkage. Shrinkage significantly reduces overfitting and the variance of predictions, but at the cost of a slight increase in bias; tuning of $\lambda$ is therefore critical for achieving an ‘optimum’ (often measured by the test mean squared error) in this bias-variance trade-off. We used ten-fold cross-validation to identify the optimal value of $\lambda$. Reconstructing the TSIR mechanism via the LASSO model In this section, we provide additional insights into how our LASSO model can reconstruct/discover a TSIR model. Taking the log of both sides of the TSIR model (Eq 1), we have (assuming $I_k$ is much smaller than $B_k$) Note that, under the TSIR, we have $\alpha \geq 0$ (and slightly less than 1) and $c \leq 0$, which respectively indicate positive and negative association of the corresponding lagged incidence $I_k$ with the current incidence. The negative association indicated by the parameter $c$ may implicitly capture the impact of susceptible depletion. 
Should the LASSO model (Eq 3) resemble the TSIR, we would expect to see a tendency towards positive coefficients $\psi_J$ associated with the most recent lagged incidence (corresponding to the positive $\alpha$ value) and non-positive coefficients at other lagged incidence (corresponding to the non-positive $c$ value). Also note that historical births may be absorbed in the intercept term of the LASSO model. We stress that we are not aiming to draw a close equivalence between the TSIR and our LASSO model. Instead, this framing aims to provide some insight into the question of how these two modeling approaches may be heuristically interconnected.
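The heuristic link can be checked numerically. The minimal sketch below assumes the standard deterministic TSIR recursion, I[t+1] = beta[t+1] * S[t] * I[t]^alpha and S[t+1] = S[t] + B[t] - I[t+1]; apart from alpha = 0.98, the parameter values, initial conditions and function names are hypothetical and chosen only for illustration. It confirms that log incidence is linear in the log of the most recent incidence with coefficient alpha, and that the susceptible pool equals the initial pool plus cumulative births minus cumulative past cases, which is the route by which earlier incidence enters with a negative effect.

```python
import math


def simulate_tsir(beta, alpha, births, s0, i0):
    """Iterate the deterministic TSIR recursion (assumed standard form):

        I[t+1] = beta[t+1] * S[t] * I[t]**alpha
        S[t+1] = S[t] + B[t] - I[t+1]

    beta holds one contact rate per biweek of the year and is cycled;
    births holds B[t], and one step is simulated per entry.
    """
    S, I = [s0], [i0]
    for t, b in enumerate(births):
        beta_next = beta[(t + 1) % len(beta)]
        i_next = beta_next * S[t] * I[t] ** alpha
        I.append(i_next)
        S.append(S[t] + b - i_next)
    return S, I


def log_incidence_rhs(beta_next, alpha, s_t, i_t):
    """Right-hand side of the log-transformed transmission equation."""
    return math.log(beta_next) + alpha * math.log(i_t) + math.log(s_t)


# Illustrative values only (not estimates from this study).
alpha = 0.98
beta = [3e-5 * (1 + 0.1 * math.cos(2 * math.pi * w / 26)) for w in range(26)]
births = [600.0] * 52  # two years of biweekly births
S, I = simulate_tsir(beta, alpha, births, s0=38000.0, i0=600.0)

# Largest discrepancy between log I[t+1] and the log-linear expression.
max_gap = max(
    abs(
        math.log(I[t + 1])
        - log_incidence_rhs(beta[(t + 1) % 26], alpha, S[t], I[t])
    )
    for t in range(len(births))
)

# Susceptibles equal the initial pool plus cumulative births minus cumulative
# cases, so larger past incidence leaves fewer susceptibles.
s_from_sums = 38000.0 + sum(births) - sum(I[1:])
print(max_gap, abs(S[-1] - s_from_sums))
```

Both printed discrepancies should be at the level of floating-point rounding. A purely statistical model that regresses future incidence on lagged incidence can therefore, in principle, pick up both terms without ever seeing the susceptible series.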

Acknowledgments We thank Dr. Chuan Gao for his constructive comments on an early draft of this paper.

[END]
---
[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010251

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
