OutbreakFlow: Model-based Bayesian inference of disease outbreak dynamics with invertible neural networks and its application to the COVID-19 pandemics in Germany
Stefan T. Radev (Institute of Psychology, Heidelberg University, Heidelberg); Frederik Graw (BioQuant - Center for Quantitative Biology); Simiao Chen (Heidelberg Institute of Global Health; Chinese Academy of Medical Sciences; Peking Union Medical College)
Date: 2021-11
Mathematical models in epidemiology are an indispensable tool to determine the dynamics and important characteristics of infectious diseases. Apart from their scientific merit, these models are often used to inform political decisions and interventional measures during an ongoing outbreak. However, reliably inferring the epidemic dynamics by connecting complex models to real data is still hard and requires either laborious manual parameter fitting or expensive optimization methods which have to be repeated from scratch for every application of a given model. In this work, we address this problem with a novel combination of epidemiological modeling and specialized neural networks. Our approach entails two computational phases: In an initial training phase, a mathematical model describing the epidemic is used as a coach for a neural network, which acquires global knowledge about the full range of possible disease dynamics. In the subsequent inference phase, the trained neural network processes the observed data of an actual outbreak and infers the parameters of the model in order to realistically reproduce the observed dynamics and reliably predict future progression. With its flexible framework, our simulation-based approach is applicable to a variety of epidemiological models. Moreover, since our method is fully Bayesian, it is designed to incorporate all available prior knowledge about plausible parameter values and returns complete joint posterior distributions over these parameters. Application of our method to the early Covid-19 outbreak phase in Germany demonstrates that we are able to obtain reliable probabilistic estimates for important disease characteristics, such as generation time, fraction of undetected infections, likelihood of transmission before symptom onset, and reporting delays using a very moderate amount of real-world observations.
Emerging infections and epidemic outbreaks are associated with large uncertainties concerning data integrity that challenge the timely detection of disease characteristics and dynamics. Robust parameter inference for mathematical models aiming to describe these dynamics is essential to predict the progression of an epidemic and to inform appropriate public health interventions. In this study, we present a novel method based on invertible neural networks that allows inference of important epidemiological characteristics in case of limited data, thereby allowing for reliable uncertainty quantification. The method circumvents common challenges associated with sparse data by using simulation-based training of an expressive generative neural network. Applying our method to data of the early Covid-19 epidemic in Germany, we are able to obtain reliable estimates of important disease characteristics, such as the proportion of infected individuals remaining undetected, despite limited observations during early outbreak dynamics.
Funding: SR was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation; grant number GRK 2277 "Statistical Modeling in Psychology") and the Covid-19 Research Grants by Google Cloud. SC was supported by Bill & Melinda Gates Foundation (Project INV-006261) and Alexander von Humboldt-Foundation. FG was supported by a Fellowship from the Chica and Heinz Schaller Foundation. UK was supported by Informatics for Life funded by the Klaus Tschira Foundation and by the Cluster of Excellence STRUCTURES funded by DFG. TB was supported by the Alexander von Humboldt Foundation through the Alexander von Humboldt Professor award, funded by the German Federal Ministry of Education and Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Copyright: © 2021 Radev et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
We demonstrate the feasibility of our method by analyzing public Covid-19 data for Germany and the individual German federal states based on the reported daily number of infected, recovered and deceased cases during the first months of the pandemic. Our neural network is trained using simulations from a customized SEIR-model variant [ 18 ], in combination with an observation model accounting for the differences between true and reported case numbers, and an intervention model describing the intervention measures for prevention and control imposed by German authorities [ 8 ]. Despite the limited number of measurements and the considerable complexity of the model with 34 unknown parameters in total, our network manages to extract information for more than half of the parameters. Credibility intervals of our parameter estimates are well in line with independently published results, and re-simulations starting at our estimated parameters reproduce the observed time series very well. In particular, our inference suggests that approximately four fifths of all infectious individuals remained undetected across all German federal states.
In order to overcome the limitations of individual parameter estimation methods, our approach aims to combine the advantages of optimization-based and Bayesian methods by using specialized invertible neural networks. In particular, we develop a novel methodological framework based on a neural network architecture called BayesFlow [ 17 ] to facilitate model-based inference with a primary, but not exclusive, focus on epidemiology. Our method can incorporate an arbitrary number of epidemiological time series (or other types of temporal information) and can, in principle, be applied to any dynamic model described by (stochastic) ordinary differential equations. Moreover, since the length of the time series is not pre-determined during training and inference, we can consider additional information in our analyses without having to re-train the networks. By using extensive simulation-based training, our method circumvents the general necessity for large training sets that are lacking during emerging epidemics. Additionally, our method returns posterior distributions which are fully compatible with a Bayesian interpretation and can thus be used to assess the uncertainty associated with any estimation and prediction quantity.
In contrast to optimization and maximum likelihood approaches, Bayesian methods provide a principled way to quantify the uncertainty of inferred model parameters, as they return full posterior distributions for the unknown parameters rather than single point estimates. Markov chain Monte Carlo (MCMC) sampling represents one of the classical approaches to Bayesian model calibration [ 12 ], and it has been extensively used in Covid-19 studies to infer model parameters describing the dynamics of the disease [ 3 , 4 , 8 , 13 ]. However, Bayesian model calibration is computationally expensive and depends explicitly on the evaluation of the likelihood function of model parameters. When the likelihood function is intractable or unknown, approximate Bayesian computation (ABC) can be used to approximate the posterior distribution of parameters [ 14 , 15 ]. However, standard ABC methods notoriously suffer from poor scalability (i.e., the efficient application to large data sets and complex models), which confines their utility to relatively simple models containing only a few parameters [ 16 ].
Importantly, since such SIR-type models are employed to forecast the dynamics of an epidemic dependent on public interventions or seasonal effects, reliable inference of such key epidemiological parameters and trustworthy uncertainty quantification is paramount to support decision making. The estimation of hidden model parameters from observations of model outcomes (inverse inference) is also referred to as model calibration in the medical decision and health policy modeling literature [ 10 ]. Traditionally, model calibration has been considered as an optimization problem seeking the best possible parameter configuration explaining the data (e.g., by performing non-linear least squares minimization or relying on maximum likelihood estimation [ 11 ]). However, the resulting optimization and maximum likelihood methods focus on point estimates for the individual parameters and usually lack appropriate approaches to assess their accuracy. This is a severe disadvantage, because reliable uncertainty quantification is crucial when the estimates shall be used to model and predict future outcomes.
However, for newly emerging infectious diseases, such as Covid-19, most of these properties are initially unknown and must be estimated before realistic predictions can be made. The task of determining these properties is additionally hampered by limited data availability and integrity within the early outbreak phase. During the initial outbreak of the Covid-19 pandemic, model-based inference was used to provide rapid estimates of key epidemiological parameters, which otherwise can be difficult to infer directly from primary clinical tracing data. For instance, an earlier study [ 2 ] incorporated domestic and international travel from and to Wuhan city in a SEIR model and used reported cases outside of Wuhan to infer the reproduction number R 0 and epidemic doubling time. Similar approaches were used to estimate the reproduction number of Covid-19 in various other settings [ 3 – 5 ]. Other studies aimed at identifying critical epidemiological characteristics, such as age-specific mortality rates [ 6 ], the impact of implemented control measures on disease transmission [ 7 , 8 ], or the fraction of undocumented infections [ 9 ].
Assessing important disease characteristics and transmission dynamics is of utmost importance in the case of new epidemic outbreaks in order to forecast their progression and guide effective public health measures. Mathematical models that provide a reliable representation of the processes driving the dynamics of an epidemic are an essential tool for this task (see for example [ 1 ]). In the case of communicable diseases, these models typically take the form of systems of ordinary differential equations governing the transitions between different population compartments, such as "Susceptible", "Infected", and "Recovered" (SIR) [ 1 ]. Provided that intrinsic properties of the disease (e.g., transmission rates and recovery periods) are already known, SIR models and their extensions are successfully used to simulate outcomes of possible public health interventions.
In addition to future predictions, posterior predictive re-simulations are also crucial for further model validation. If the estimated posteriors describe the real situation well, re-simulations should replicate the data that served to fit the model in the first place. Mismatches between original and re-simulated time series indicate model misspecification or a simulation gap, that is, errors arising because important aspects of the disease dynamics or unknown corruption of the observed data are not properly represented by the model. These errors are invisible to simulation-based calibration, because it solely assesses whether the posteriors conform to model specifications. Importantly, our OutbreakFlow experiments demonstrate very good agreement between original and re-simulated time series.
Although this quantity is hard to compute exactly, we can approximate it by running the simulator with parameters sampled from the posterior: x′^(m) ∼ p(X′ | θ^(m)) with θ^(m) ∼ p(θ | X) for m = 1, …, M. Since the θ^(m) are drawn from the joint posterior, statistical dependencies and correlations between parameters are properly taken into account. The resulting ensemble of M simulated time series can then be used to obtain point predictions (e.g., by computing their mean or median at each future time point) and to quantify the uncertainty of future scenarios (e.g., by computing quantiles or standard deviations at each time point).
The posteriors estimated by a trained OutbreakFlow network can be further used to make forecasts about the future dynamics of the pandemic, provided the parameters remain stationary. Due to changing intervention measures, population behavior, testing policies, and possible treatment advances, this is only true for a relatively short period beyond the observed time series, limiting the prediction horizon to a few weeks. Given an observed time series X ≔ x_1:T, the posterior predictive distribution for upcoming data X′ ≔ x_(T+1):T′ is given by:

p(X′ | X) = ∫ p(X′ | θ) p(θ | X) dθ (15)
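As a concrete illustration of the ensemble approximation described above, the following Python sketch computes point predictions and uncertainty bands from posterior draws; the simulator argument is a hypothetical stand-in for the epidemiological forward model, not a function from the OutbreakFlow code base.

import numpy as np

def posterior_predictive(simulator, posterior_draws, t_future, n_draws=500):
    """Monte Carlo approximation of the posterior predictive distribution.

    simulator(theta, t_future) is assumed to return a simulated time
    series of length t_future for parameter vector theta."""
    rng = np.random.default_rng(0)
    idx = rng.choice(len(posterior_draws), size=n_draws)
    # Re-simulate with draws from the joint posterior, preserving
    # dependencies and correlations between parameters
    ensemble = np.stack([simulator(posterior_draws[m], t_future) for m in idx])
    # Point prediction and uncertainty bands per future time point
    median = np.median(ensemble, axis=0)
    lower, upper = np.quantile(ensemble, [0.025, 0.975], axis=0)
    return median, lower, upper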
A major disadvantage of SBC is that it can be extremely time-consuming, since it requires inverse inference on potentially thousands of simulated time series. In addition, the obtained posterior draws should exhibit no autocorrelation for SBC to yield reliable results. The latter requirement makes it even more expensive when using MCMC or other non-amortized Bayesian methods yielding dependent samples. Fortunately, amortized inference with OutbreakFlow alleviates these issues, since the inference phase on multiple time series is extremely efficient and posterior draws are independent given perfectly converged networks [ 17 ]. Thus, validating the computational faithfulness of any BayesFlow application using SBC becomes a matter of seconds.
In practice, we approximate this integral by an ensemble of samples from many posterior distributions estimated from simulated time series with known generating parameters. SBC uses a rank statistic (i.e., the number of posterior draws larger than the prior draw for each simulated time series) to compare the average posterior with the prior. If Eq 14 holds, then the rank statistic of each parameter will be uniformly distributed, allowing us to visually examine the equivalence using univariate histograms. An inspection of the rank histograms thus provides a way to validate the computational faithfulness of OutbreakFlow within the scope of the modeling assumptions [ 29 ].
Random draws from p(θ, X) are generated by first sampling a configuration θ from the prior p(θ) and then running the (stochastic) simulator with the sampled parameter configuration to obtain a synthetic outbreak trajectory. This process can be repeated multiple times and does not require an analytically tractable likelihood function. Importantly, if a Bayesian sampling method generates samples from the exact posterior, the equality implied by Eq 14 should hold regardless of the particular form of the posterior. Thus, any violation of this equality indicates some error incurred by the sampling method. The reasons for such an error can be either inaccurate computation of the posterior or an erroneous implementation of the model itself [ 29 ].
Computational faithfulness refers to the ability of a Bayesian method to recover the correct target posterior in a given modeling scenario. It is an essential precondition for carrying out model-based predictions and interpreting the parameters of a model within a reference theoretical framework. We can estimate the computational faithfulness of any BayesFlow application using simulation-based calibration (SBC) [ 29 ]. SBC is a diagnostic method which considers the performance of a sampling algorithm over the entire Bayesian joint model p(θ, X), regardless of the structure of the particular model. It leverages the fact that most Bayesian models are generative by construction, as well as the self-consistency of the Bayesian joint model. Accordingly, the average posterior distribution over random draws from the Bayesian joint distribution (θ, X) ∼ p(θ, X) should always recover the prior distribution p(θ). In other words, for any given parameter combination θ*, the following should hold:

p(θ*) = ∫∫ p(θ* | X) p(X | θ) p(θ) dθ dX (14)
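A minimal sketch of the SBC rank computation described above; the array shapes and names are illustrative and not taken from the BayesFlow implementation.

import numpy as np

def sbc_ranks(prior_draws, posterior_draws):
    """Compute SBC rank statistics.

    prior_draws:     (n_sim, n_params) array of draws theta* ~ p(theta)
    posterior_draws: (n_sim, n_post, n_params) array of posterior draws,
                     one batch per simulated data set X ~ p(X | theta*)
    Returns the number of posterior draws larger than the prior draw,
    per simulation and parameter."""
    return np.sum(posterior_draws > prior_draws[:, None, :], axis=1)

# Under Eq 14, each column of the returned array is uniformly distributed
# on {0, ..., n_post}; deviations in the univariate rank histograms signal
# a miscalibrated sampler or a faulty model implementation.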
Offline and online learning represent two extremes on a continuum of training strategies. Hybrid learning methods combine these two strategies and allow for a more fine-grained allocation of the available simulation budget. For instance, [ 24 ] propose a round-based strategy, where each round incorporates its own simulation phase. Thus, the reference table is filled incrementally, and each round can reuse simulations from all previous rounds. Such a round-based training strategy is outlined in S3 Algorithm in S1 Text .
Instead of pre-computing synthetic data, we can generate a potentially limitless number of training pairs (θ, X) on-the-fly. Since each simulation result is discarded after the corresponding backpropagation update, the network never encounters the same inputs twice and overfitting is impossible. Moreover, the training phase can continue as long as the network keeps improving, as measured by continuous performance monitoring. Online training is outlined in S2 Algorithm in S1 Text and is used for all experiments in this work. The present application lends itself to this approach, because the computational cost of running our epidemiological model is negligible, whereas more expensive simulations might become a bottleneck for this strategy.
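The control flow of online training can be summarized as follows. This is a sketch only: prior, simulator, and update_step are hypothetical stand-ins for the parameter priors, the epidemiological simulator, and one backpropagation update of the criterion in Eq 3.

import numpy as np

def train_online(prior, simulator, update_step, n_iterations=50_000, batch_size=64):
    """Online simulation-based training: fresh (theta, X) pairs at every step."""
    for step in range(n_iterations):
        theta = np.stack([prior() for _ in range(batch_size)])   # theta ~ p(theta)
        x = np.stack([simulator(t) for t in theta])              # X ~ p(X | theta)
        loss = update_step(theta, x)   # minimize -log q_phi(theta | X)
        # The batch is discarded afterwards: the network never sees the
        # same simulation twice, so it cannot overfit a fixed training set.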
Traditional simulation-based approaches utilize a pre-computed reference table , which is a large data structure containing S pairs (θ, X) of simulation parameters θ and corresponding synthetic observations X [ 25 , 26 ]. This strategy has also been used in machine learning approaches to simulation-based inference [ 27 , 28 ], where the problem of inverse inference resembles a supervised learning task. In the context of OutbreakFlow, the resulting offline learning method is outlined in S1 Algorithm in S1 Text . It is particularly useful when calls to the simulator are computationally expensive: working with recorded synthetic data is then faster at the expense of higher memory demands during training.
An OutbreakFlow is trained with simulated data by minimizing the negative log posterior according to Eq 3 . The training phase can be realized in different ways, depending on the modeling scenario and the modelers’ computational budget. First of all, when only a single time series has to be analyzed, non-amortized methods like [ 23 , 24 ] may outperform OutbreakFlow, because they constrain the simulation scope to the vicinity of the observed data. On the other hand, when the model has to be applied to multiple observed time series (e.g., to different federal states in Germany or even countries), our upfront training effort quickly amortizes, since a trained OutbreakFlow executes inference orders of magnitude faster than a case-based (non-amortized) method. We now outline three training modes for OutbreakFlow.
The observation model accounts for the fact that officially reported cases might not represent the true case numbers of the epidemic. It represents three error sources: a delay between actual infection and reporting, the weekly modulation of reporting rates (since testing and reporting activities are considerably reduced on weekends), and a noise term describing random fluctuations. Separate parameter sets are learned for each of the three publicly reported time series I^(obs), R^(obs), and D^(obs); the remaining compartments are considered unobservable. The relationship between the reported counts and their true values is described by the following set of discrete-time difference equations with time steps t measured in days:

I^(obs)_t = (1 − f_I(t)) I^(new)_(t−L_I) (1 + σ_I ξ_t) (10)
R^(obs)_t = (1 − f_R(t)) R^(new)_(t−L_R) (1 + σ_R ξ_t) (11)
D^(obs)_t = (1 − f_D(t)) D^(new)_(t−L_D) (1 + σ_D ξ_t) (12)

where L_I, L_R, and L_D denote the reporting delays (lags), and σ_I, σ_R, and σ_D denote the scales of multiplicative reporting noise for the respective compartments. The noise variables ξ_t follow a Student-t distribution with 4 degrees of freedom. The weekly modulation f_c(t) of reporting coverage for each compartment c ∈ {I, R, D} is computed as follows:

f_c(t) = A_c (1 − |sin(π t / 7 − φ_c / 2)|) (13)

with amplitude A_c and phase φ_c.
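A sketch of this observation model in Python; the weekly-modulation form follows the approach of [8], and the exact parameterization used in the paper is given in S1 Text.

import numpy as np

def report(new_cases, lag, sigma, amplitude, phase, rng=np.random.default_rng(0)):
    """Apply reporting delay, weekly modulation (Eq 13), and multiplicative
    Student-t noise (Eqs 10-12) to a series of true daily case counts."""
    T = len(new_cases)
    observed = np.zeros(T)
    for t in range(int(lag), T):
        f_t = amplitude * (1 - np.abs(np.sin(np.pi * t / 7 - 0.5 * phase)))
        xi = rng.standard_t(df=4)                    # heavy-tailed reporting noise
        observed[t] = new_cases[t - int(lag)] * (1 - f_t) * (1 + sigma * xi)
    return np.maximum(observed, 0.0)                 # counts cannot be negative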
The intervention model accounts for changes in the transmission rate λ(t) due to non-pharmaceutical interventions and mitigation measures. Corresponding to the approach followed by [ 8 ], we define three change points for λ(t) encoding an assumed transmission rate reduction in response to intervention measures imposed by the German authorities. Each change point is represented by a piece-wise linear function with three degrees of freedom: the effect strength and the boundaries defining the time interval for the effect to fully manifest itself. The chosen priors express the expected effect of each measure to reduce the transmission rate roughly by half after the date when it comes into force, but we allow for wide uncertainty margins to facilitate learning of the actual behavior. In addition to the previous approach in [ 8 ], our model also includes a fourth change point expressing the assumption that an eventual withdrawal of effective intervention measures (officially or due to non-compliance) will lead to a slight increase of the transmission rate. Note that we assume that interventions do not affect the risk of infection upon contact with a detected infectious individual (β). Prior distributions and descriptions for all parameters are given in S2 Table in S1 Text .
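The piece-wise linear change-point parameterization can be sketched as follows; change points are assumed time-ordered and non-overlapping, and the names and example values are illustrative.

def transmission_rate(t, lambda_0, change_points):
    """lambda(t) with change points given as (t_start, t_end, new_value):
    between t_start and t_end the rate ramps linearly from its current
    value to new_value (effect strength plus manifestation interval)."""
    rate = lambda_0
    for t_start, t_end, new_value in change_points:
        if t <= t_start:
            break                          # later change points not yet active
        if t >= t_end:
            rate = new_value               # effect has fully manifested
        else:
            w = (t - t_start) / (t_end - t_start)
            rate = (1 - w) * rate + w * new_value   # inside the linear ramp
    return rate

# Example: a reduction from 0.4 to roughly half between days 20 and 27:
# lam = [transmission_rate(t, 0.4, [(20, 27, 0.2)]) for t in range(60)]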
The meaning of the model parameters and their priors are detailed in S1 Table in S1 Text . Prior ranges are based on considerations in [ 8 ] and [ 22 ]. All rate parameters are considered to be constant, except for the transmission rate λ(t) which is considered to be time-dependent as it accounts for possible behavioral changes implied by non-pharmaceutical interventions.
The model is of SEIR-type with six compartments: susceptible (S), exposed (E, i.e., infected but not yet infectious), carrier (C, i.e., infectious but undetected), infected (I, i.e., infectious and diagnosed), recovered (R), and dead (D) individuals. In addition, the blue boxes indicate model extensions accounting for external factors, namely intervention measures affecting the transmission rate λ(t) and imperfect case reporting due to noise or delays. Note that data are only reported for the observable compartments I, R, and D. For a detailed description of the model and the different components see Materials and methods .
The disease model is a system of non-linear ordinary differential equations (ODEs) distinguishing between six compartments: susceptible (S), exposed (E: infected individuals who do not show symptoms and are not yet infectious), infected (I: symptomatic cases that are infectious), carrier (C: infectious individuals who recover without being detected), recovered (R), and dead (D); see Fig 2. Note that direct recovery from the carrier state C covers all reasons why an infection might go undetected, including, among others, truly asymptomatic cases, lack of follow-up on pre-symptomatic cases, and limited testing capacity under minor symptoms; our model does not differentiate between these possibilities. Observations with limited accuracy (as described by the observation model below) are available for the compartments I, R, and D. The true time series of all compartments are therefore considered latent and need to be estimated.
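The following right-hand side is a sketch consistent with the compartment transitions described above; the exact equations and parameter definitions are given in Materials and methods and S1 Table. Here α denotes the probability of remaining undetected, δ the case fatality rate among diagnosed cases, β the relative transmission risk from detected cases, and lam is the time-dependent transmission rate λ(t).

from scipy.integrate import solve_ivp

def rhs(t, y, lam, beta, gamma, eta, theta, mu, d, alpha, delta, N):
    """Six-compartment dynamics: S -> E -> C -> {R or I}, I -> {R or D}."""
    S, E, C, I, R, D = y
    force = lam(t) * (C + beta * I) * S / N              # new infections
    dS = -force
    dE = force - gamma * E                               # latent period 1/gamma
    dC = gamma * E - alpha * theta * C - (1 - alpha) * eta * C
    dI = (1 - alpha) * eta * C - (1 - delta) * mu * I - delta * d * I
    dR = alpha * theta * C + (1 - delta) * mu * I        # undetected + detected recovery
    dD = delta * d * I                                   # deaths among diagnosed cases
    return [dS, dE, dC, dI, dR, dD]

# sol = solve_ivp(rhs, (0, 120), y0, args=(lam, beta, ...), dense_output=True)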
In order to account for the specific nature of the current Covid-19 outbreak, our epidemiological model consists of three submodels: (i) a disease model describing the true dynamics of relevant population compartments; (ii) an intervention model describing the strengthening and relaxation of non-pharmaceutical intervention measures; and (iii) an observation model describing the deviations of publicly reported data from the true values. These models build upon the previous work of Khailaie et al. [ 18 ] and Dehning et al. [ 8 ], who adapted the general SIR approach to the specifics of the Covid-19 pandemic and the situation in Germany. Parameter priors are based on our current state of knowledge about disease characteristics and government measures, but are chosen very wide to prevent them from dominating the information extracted from the actual observations.
As previously mentioned, one of the most important advantages of our method is amortized inference, owing to the fact that we approximate the posterior globally via a single set of network parameters . This is especially advantageous in epidemiological contexts, where the same model is applied in multiple populations (countries, cultures) or at different scales (states, regions), since the same pre-trained model can be repeatedly utilized for different populations and scales. Indeed, in the following real-world application, we demonstrate efficient amortized inference and excellent predictive performance with a single architecture applied simultaneously to epidemiological data from all German federal states.
More formally, let us denote the functions represented by the three networks as g, h, and f. Then the filtering network determines a filtered time series g(x_1:T) from observed data x_1:T, where the number of time steps T depends on data availability. The summary network turns the output of the filtering network into a fixed-size representation s = h(g(x_1:T)) by keeping only the final output vector of the LSTM network, which encodes the accumulated information over all observed time steps. Finally, the inference network generates samples from the parameter posterior by computing θ^(m) = f^(-1)(z^(m); s) with normally distributed random vectors z^(m) ∼ N(0, I). The parameters of all three networks are optimized jointly during the training phase. Denoting the vector of all trainable network parameters as ϕ, the three networks solve the following optimization criterion:

ϕ̂ = argmin_ϕ E_p(X)[ KL(p(θ|X) ‖ q_ϕ(θ|X)) ] (2)
   = argmin_ϕ E_p(θ,X)[ −log q_ϕ(θ|X) ] (3)

where KL(p ‖ q) denotes the Kullback-Leibler divergence [ 21 ] between probability density functions p and q. We approximate the latter expectation via its empirical mean over realizations (X, θ) ∼ p(θ, X) obtained via simulations from the forward model.
During inference, the inference network is only evaluated in the inverse direction using conditional information from real observed data passed through the filtering and summary networks. The posterior is approximated by repeatedly sampling from the simple base distribution and applying the inverse of the forward transformation learned during the training phase. Importantly, this method recovers the true posterior under perfect convergence of the optimization method [ 17 ].
The invertible network has two modes of operation. During training, the network is only evaluated in the forward direction and encouraged via a suitable optimization criterion to transform the posterior into a simple base distribution (e.g., Gaussian) from which samples can be easily obtained. Thus, the inference network integrates information from both the prior and the data-generating mechanism (i.e., the implicit likelihood).
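These two modes of operation can be illustrated with a toy affine coupling block, a standard building block of invertible networks. This is a didactic stand-in rather than the BayesFlow implementation, and the conditioning on data summaries is omitted for brevity.

import numpy as np

class AffineCoupling:
    """Toy affine coupling layer: exactly invertible by construction."""

    def __init__(self, dim, rng=np.random.default_rng(0)):
        self.half = dim // 2
        # Stand-ins for the learned conditioner networks
        self.W_s = 0.1 * rng.standard_normal((self.half, dim - self.half))
        self.W_t = 0.1 * rng.standard_normal((self.half, dim - self.half))

    def forward(self, theta):
        """Training direction: parameter space -> base distribution space."""
        u1, u2 = theta[:self.half], theta[self.half:]
        s, t = self.W_s @ u2, self.W_t @ u2
        return np.concatenate([u1 * np.exp(s) + t, u2])

    def inverse(self, z):
        """Inference direction: base samples -> (approximate) posterior samples."""
        z1, z2 = z[:self.half], z[self.half:]
        s, t = self.W_s @ z2, self.W_t @ z2
        return np.concatenate([(z1 - t) * np.exp(-s), z2])

# layer = AffineCoupling(4); theta = np.ones(4)
# np.allclose(layer.inverse(layer.forward(theta)), theta)  # -> True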
A standard LSTM network consists of a cell and three gates. The cell stores the internal memory of the network over arbitrary temporal intervals. The three gates (an input gate, an output gate, and a forget gate) interact to determine the importance of old and new information. Importantly, LSTM networks can deal with sequences of varying length, which enables them to process data whose duration is dictated by data availability and to perform online inference, i.e., to process new data instantly as they become available. In contrast to predefined pooling operations (e.g., mean, max, or variance), our recurrent networks learn pooling operations that are adapted to the data and can thus extract potentially much richer representations. In this way, our inference architecture learns to filter and extract the most informative features from the noisy observations in an end-to-end manner, freeing the user from the difficult (and usually suboptimal) task of hand-engineering suitable data features (summary statistics). Finally, the inference network has the task of inverting the forward model given the information extracted by the convolutional and recurrent networks (see also [ 17 ] for more details on the design of invertible networks for inference).
The output of the convolutional network is a multivariate sequence containing the filtered epidemiological time series. In order to reduce the filtered sequence to a fixed-size vector, we pass it through a long short-term memory (LSTM) recurrent network [ 20 ]. In contrast to standard feed-forward neural networks, LSTM networks incorporate feedback connections which make them ideally suited for processing sequences of observations such as time series.
The design of the convolutional network is inspired by the well-known Inception network, which has shown tremendous success in computer vision tasks [ 19 ]. In particular, our network is implemented as a deep fully convolutional network which applies adjustable one-dimensional filters of different size at each level (cf. Fig 1 ). The intuition behind this design is that filters of different size might capture patterns at different temporal scales (e.g., a filter of size one will capture daily fluctuations whereas a filter of size seven will capture weekly dynamics). This, in turn, should ease the task of extracting informative temporal features for parameter estimation.
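A sketch of such a filtering network in Keras, combined with the LSTM summary network described above; the filter counts and summary dimension are illustrative, not the settings used in the paper.

import tensorflow as tf
from tensorflow.keras import layers

def build_filter_and_summary_net(n_series=3, summary_dim=64):
    """Inception-style parallel 1D convolutions followed by an LSTM summary."""
    x_in = layers.Input(shape=(None, n_series))       # time series of any length
    # Parallel filters of size 1, 3, and 7 capture daily to weekly patterns
    branches = [
        layers.Conv1D(16, kernel_size=k, padding="same", activation="relu")(x_in)
        for k in (1, 3, 7)
    ]
    filtered = layers.Concatenate()(branches)         # filtered multivariate sequence
    summary = layers.LSTM(summary_dim)(filtered)      # keep only the final output vector
    return tf.keras.Model(x_in, summary)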
During the training phase (orange frame on the left), the assumed epidemiological model is used to simulate time series resembling the observed epidemiological data, based on prior distributions of the unknown parameters. The synthetic time series are used to train the composite neural network consisting of (i) a convolutional filtering network, which extracts relevant features while preserving the temporal structure of the data, (ii) a summary network, which reduces the transformed time series to a fixed-sized vector of maximally informative representations, and (iii) an inference network, which estimates the joint parameter posterior from these data representations. During the inference phase (blue frame on the right), the real epidemiological data are passed to the trained network to infer the posterior distribution of the unknown disease parameters. A full description of the architecture and the methodology is provided within Materials and methods .
Our neural architecture comprises three sub-networks: (i) a convolutional filtering network performing noise reduction and feature extraction on the raw observational data; (ii) a recurrent summary network reducing filtered time series of arbitrary length to statistical summaries of fixed size; and (iii) an invertible inference network performing Bayesian parameter inference, given the learned summaries of the observations. Fig 1 depicts the training and inference phase with our inference architecture and its essential elements.
We propose OutbreakFlow, an instantiation of our BayesFlow architecture that utilizes a novel combination of three jointly trained neural modules to analyze noisy multi-variate time series with potentially long-term temporal dependencies, as are typical in the context of epidemiology. It can process both short and long time series and can thus perform efficient Bayesian updating as new data become available (e.g., on a daily basis in case of Covid-19). Moreover, our method can incorporate additional prior knowledge in the formulation of the underlying generative model.
The latter advantage is called “amortized inference”: a BayesFlow network learns to generalize the training knowledge and can efficiently apply it to multiple real observations without retraining. The network’s training effort thus quickly amortizes over a sequence of inference queries (e.g., time series), in contrast to sampling methods (e.g., MCMC), which cannot leverage experience and require the same large computational effort for every query. In addition, fast amortized inference facilitates model validation by enabling efficient probabilistic calibration and posterior predictive checks on large validation datasets [ 17 ].
Validation experiments reported in [ 17 ] have demonstrated for various model systems that the BayesFlow method (i) can estimate complex stochastic models of widely varying data types and sizes (e.g., population time series, predator-prey population time series, human response-time data); (ii) outperforms variational or dropout methods for uncertainty quantification; (iii) learns data representations which are more informative than manually selected summary statistics; and (iv) outperforms case-based methods such as ABC and MCMC, whose computations must be re-run from scratch for every observed dataset.
We solve both problems with our recently proposed neural Bayesian inference architecture called BayesFlow, which is explained in full mathematical detail in the corresponding methodological work [ 17 ]. The core component of BayesFlow is an invertible neural network which enables a bidirectional flow of information. During the training phase, the invertible network is run in the forward direction to learn an accurate model q(θ|X) ≈ p(θ|X) for the posterior distribution of parameters θ given observations X, using a large number of simulated pairs (X_i, θ_i) ∼ p(X|θ) p(θ) as training data. During the inference phase, the network makes use of its invertible architecture and operates in the inverse direction to estimate the posterior q(θ|X = x_obs) ≈ p(θ|X = x_obs) for the actually observed data x_obs.
Despite being conceptually simple, this formula poses two major challenges in the present setting: (i) Efficient and accurate approximation of the intractable posterior p(θ|X) is challenging; (ii) The likelihood is only implicitly defined via realizations X ∼ p(X|θ) generated by repeatedly running simulations of the underlying epidemiological model with different θ.
Following a Bayesian approach for parameter estimation requires prior knowledge about reasonable parameter ranges [ 12 ]. Combining this prior knowledge with information extracted from the observed data leads to a posterior distribution which is generally narrower than the prior and thus expresses our updated state of knowledge and associated uncertainty for the individual parameters. More formally, let θ be the vector of all unknown model parameters and X ≔ x_1:T = [x_1, …, x_T] a multivariate epidemiological time series with T individual time points indicating, for instance, the number of infected, recovered, and deceased individuals. Then the well-known analytical formula for the posterior according to Bayes' rule is

p(θ | X) = p(X | θ) p(θ) / ∫ p(X | θ′) p(θ′) dθ′ (1)

where p(X|θ) represents the likelihood of observing data X when the true parameters are θ, p(θ) is the prior distribution encoding our knowledge about plausible parameter configurations for θ, and the denominator represents a normalizing constant (usually called the evidence).
The model was applied to epidemiological data on the number of reported Covid-19 cases (infected, recovered, and deceased) in Germany and the individual federal states (only infected and deceased) from March 01, 2020 until June 11, 2020. Data were collected from publicly available sources during the same time period and were not cleaned or corrected afterwards. Therefore, all sources of uncertainty remain in the data as they would have in the early days of an ongoing pandemic. Code and scripts for reproducing all results and figures as well as the general framework of OutbreakFlow for training new networks on new models are available at
https://github.com/stefanradev93/AIAgainstCorona .
Results
OutbreakFlow as a research tool for inferring dynamics of emerging epidemics

In contrast to mainstream neural network applications, such as image or text analysis, analyzing the dynamics of emerging epidemics poses two major challenges: (i) data are usually sparse, with no large sets of training data available; and (ii) a reliable quantification of the uncertainty associated with neural network outputs, such as estimated parameters, is mandatory to allow for reliable subsequent evaluation of possible scenarios. Standard neural network architectures do not live up to these challenges. Therefore, we developed a novel method on the basis of a neural network architecture called BayesFlow [17] that addresses the aforementioned challenges in two ways: (i) we leverage the epidemiological insight represented by SIR-type models by means of an alternative training procedure using simulated data (simulation-based training); and (ii) we use networks that are specifically designed to perform Bayesian uncertainty quantification over their outputs. In our framework, a large number of plausible hypothetical scenarios simulating the assumed epidemiological dynamics is processed by the neural network until it becomes an expert in the interpretation of epidemiological observations. After completion of the training phase, the available real-world observations are passed to the network, which then estimates full Bayesian posterior distributions for the real-world parameters of interest. The design of our network architecture is depicted in Fig 1. The details of the method, as well as the model architecture, are given in Materials and methods. The ultimate goal of our approach is comparable to that of traditional simulation-based Bayesian inference methods, such as ABC. However, our method operates much faster and generalizes, without retraining, to any real-world dataset within the scope of its training expertise [17].
Testing and validation of OutbreakFlow

To validate our approach and test its performance in inferring parameter values in epidemiological models, we applied our architecture to a standard SIR model describing the dynamics of an epidemic. This greatly simplified model is suitable for the initial two weeks of the pandemic and highlights essential properties of our approach. It distinguishes between susceptible, S, infected, I, and recovered, R, individuals, with infection and recovery occurring at a constant transmission rate λ and recovery rate μ, respectively. The model is defined by the following system of ODEs:

dS/dt = −λ S I / N (16)
dI/dt = λ S I / N − μ I (17)
dR/dt = μ I (18)

with N = S + I + R denoting the total population size. In addition to the ODE parameters λ and μ, we consider a reporting delay parameter L and a dispersion parameter ψ, which affect the number of reported infected individuals via

I^(obs)_t ∼ NB(I^(new)_(t−L), ψ) (19)

where I^(new)_(t−L) denotes the number of newly infected individuals L days earlier, so that the number of newly observed cases arises from a negative binomial distribution [30]. In addition, we estimate the initial number of infected individuals I_0, so the full parameter vector of interest becomes θ = (λ, μ, L, ψ, I_0). Priors over the five parameters are given in S4 Table in S1 Text. As a first step, we trained our network on simulations from the simple SIR-type model formulated above and then applied the network to the number of reported cases from the first 14 days of the Covid-19 pandemic in Germany. The results from this initial study are depicted in Fig 3. First, we observe that our posterior estimates are in line with those reported in a previous modeling study [8], which utilized the same data and a similar model. Second, we note that the SBC plots indicate no systematic biases in the approximate posteriors and thus suggest that the posterior samples are trustworthy (assuming no simulation gap). Finally, the posterior predictive check indicates that the model can accurately reproduce the observed data (Fig 3).
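A sketch of this validation model as a simulator; the negative binomial is parameterized here by mean and dispersion ψ, which may differ from the exact parameterization used in the paper.

import numpy as np
from scipy.integrate import odeint

def sir_rhs(y, t, lam, mu, N):
    """Eqs 16-18: standard SIR dynamics."""
    S, I, R = y
    return [-lam * S * I / N, lam * S * I / N - mu * I, mu * I]

def simulate_reported(theta, T=14, N=83e6, rng=np.random.default_rng(1)):
    """Simulate T days of reported cases with delay L and dispersion psi."""
    lam, mu, L, psi, I0 = theta
    days = np.arange(T + int(L) + 1)
    y = odeint(sir_rhs, [N - I0, I0, 0.0], days, args=(lam, mu, N))
    new_cases = np.maximum(-np.diff(y[:, 0]), 1e-6)   # newly infected per day
    delayed = new_cases[:T]                           # observed L days later (Eq 19)
    p = psi / (psi + delayed)                         # NB(mean=delayed, dispersion=psi)
    return rng.negative_binomial(psi, p)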
Fig 3. (a) Bivariate posteriors over the five model parameters obtained by OutbreakFlow (red) and ABC-SMC (blue) from cases reported during the first 14 days of the Covid-19 pandemic in Germany. Results of the two methods are very similar except for the dispersion parameter ψ, where OutbreakFlow achieves superior uncertainty reduction (see text for a brief discussion); (b) Model-based predictions based on the posteriors obtained by OutbreakFlow; (c) Simulation-based calibration (SBC) computed from 10,000 simulated pandemic scenarios. The uniformity of the rank statistics indicates that there are no systematic biases in the approximate posteriors and implies trustworthy approximation.
https://doi.org/10.1371/journal.pcbi.1009472.g003

Since OutbreakFlow is especially designed to tackle complex stochastic models whose estimation usually necessitates simulation-based approaches, we compared its performance to an analysis performed with ABC-SMC, a popular approximate Bayesian computation algorithm based on sequential Monte Carlo sampling [31]. Our benchmark comparison reveals converging results. However, OutbreakFlow implies a much sharper marginal posterior of ψ than ABC-SMC, indicating more uncertainty reduction with regard to the dispersion parameter. Since the SBC plots indicate no overconfidence (i.e., no underdispersion) of the OutbreakFlow posteriors, it is likely that the ABC-SMC algorithm yields an underconfident (i.e., overdispersed) marginal posterior with respect to this one parameter. As for wall-clock running times, the ABC-SMC algorithm converged in 4.1 hours, whereas OutbreakFlow trained for 50,000 iterations using online learning required 4.3 hours on the same laptop without parallelizing the simulations. This similar performance speaks in favor of amortized inference, as the training effort already amortizes after as few as two data sets. To further test whether OutbreakFlow is able to provide a reliable quantification of model parameters, we analyzed whether the additional consideration of non-identifiable parameters within the model would affect the calibration or predictive performance of our method. To this end, we included 5 dummy variables u_j ∼ Uniform(0, 1) in our model that were not used for the generation of the simulated data, but were later included in the unknown parameter vector θ during training and inference. Performing the same training and inference phase with these 5 additional dummy parameters neither hurts the calibration of OutbreakFlow, nor does it noticeably impact inference on the observed time series, since the posterior estimates for the relevant parameters appear to be unaffected by the dummy parameters (see S1 Text for full results). The same is true for model-based posterior predictions, which underlines the ability of OutbreakFlow to reliably characterize parameter identifiability in case of insufficient data or over-parameterized models.
Inferring epidemiological characteristics from the early Covid-19 pandemic in Germany

After validating the general applicability of our novel approach, we applied OutbreakFlow to data of the Covid-19 pandemic in Germany, analyzing reported cases (infected, recovered, and deceased) in the time period from March 01, 2020 until June 11, 2020. These data capture the early dynamics of the emerging epidemic, associated with considerable uncertainty and stochasticity with regard to the number of detected cases, as well as the effect of subsequent public interventions. For our analysis, we used a recently developed extended SEIR-type model that distinguishes between detected and undetected carriers of the disease and comprises a total of 34 unknown model parameters (see Fig 2 and Materials and methods for a detailed description of the model) [18]. The model was trained on the time period from March 01 until May 21 using wide prior distributions across plausible parameter ranges from previous literature [8, 22]. The remaining data (three weeks from May 22 until June 11) were then used to assess the predictive value of the model. The observed and predicted dynamics, as well as the marginal posterior distributions of the individual model parameters, are depicted in Fig 4 (see S6 Fig in S1 Text for simulation-based calibration). Our model was able to recover the observed dynamics and yields good predictions for the future period, with its forecasts having well-calibrated uncertainty bounds for the newly infected, recovered, and deceased cases (see Fig 4). Despite the large number of unknown model parameters and limited data, our analysis indicated considerable reduction in uncertainty relative to the prior knowledge for most of the model parameters (Fig 4). Standard point estimates (median, mean, and MAP) and credibility intervals (95%-CI between the 2.5% and 97.5% quantiles of the corresponding posterior) for all 34 model parameters are given in Table 1.
Fig 4. (a) Posterior predictions and forecasts of new cases obtained by inferring model parameters from epidemiological data available for reported infections, assumed recoveries, and deaths by Covid-19 for all of Germany. Cases to the left of the vertical dashed line were used for posterior checking (model training) and cases to the right for posterior forecasts (predictions) on unseen data; (c) Marginal posteriors of all 34 model parameters inferred from data for all of Germany, alongside median and MAP summary statistics. Gray lines depict prior distributions for comparison with the posteriors. Vertical dashed lines indicate posterior medians.
https://doi.org/10.1371/journal.pcbi.1009472.g004

Interestingly, our parameter estimates (cf. Fig 4) are consistent with previous findings about central disease parameters [8]. With the number of undetected cases being one of the most important estimates to assess epidemic-related dynamics, our network estimates a median probability of remaining undetected (parameter α) of 0.63, with the maximum a posteriori (MAP) estimate at 0.79 (95%-CI [0.07–0.91]). Notwithstanding the large uncertainty surrounding the number of undetected cases, the posterior of α is clearly far from uniform (our prior assumption) and peaks well beyond 0.5 (see Fig 4). This estimate is consistent with the results of representative seroprevalence studies in Germany [32, 33], which find that 75% of the sero-positive individuals at the end of the first wave had not been diagnosed with the disease before. An even higher fraction of undetected cases, around 80%, was reported by seroprevalence studies focusing on the hotspot regions of Gangelt, Kupferzell, and Tirschenreuth [34–36]. Together with our estimated case fatality rate (parameter δ) of 4.1% (median) and 3% (MAP), this results in an infection fatality rate of about 0.63% (MAP estimates) or 1.5% (median estimates). Additionally, our informative priors for the parameters η and γ are not updated by the observed data. Accordingly, around 3.23 days will typically pass before the infection is detected (95%-CI [1.92–5.55], 1/η). This estimate is in line with results from [37] (around 4 days) and the World Health Organization [38] (5–6 days). Combined with the latent period (i.e., the time in compartment E: median 6.67 days, 95%-CI [3.33–11.1], 1/γ), this leads to a median incubation period until symptom onset or a positive test outcome of around 9.9 days, as originally implied by our priors. If the disease remains undiagnosed throughout (asymptomatic, weakly symptomatic, etc.), recovery takes a median of 4.59 days (95%-CI [2.99–11.11], 1/θ), a time period in which pre-symptomatic and undiagnosed individuals can be responsible for a considerable fraction of infections. Furthermore, if we assume that most infections occur near the end of the carrier stage, that is, after 2.5 days in compartment C, we arrive at a generation time of around 9 days. In conjunction with the delay for reporting infections I of 5.5 days (parameter L_I), this is consistent with the generally acknowledged fact that the success of intervention measures only becomes apparent after around two weeks. For diagnosed individuals, the median recovery period is estimated at 8.06 days (95%-CI [6.13–10.20], 1/μ). Thus, manifestly ill cases have a more severe disease progression than undiagnosed individuals and typically require 1/η + 1/μ = 3.23 + 8.06 = 11.3 days until recovery [39]. The time between diagnosis and death (6.67 days, 95%-CI [3.12–14.3], parameter 1/d) is shorter than in clinical reports, with parameter identification possibly impaired by the estimated reporting delay for disease-associated deaths D of 11.3 days (parameter L_D), which is probably much longer than in reality. From the available time series alone, the model is not able to distinguish a long critical phase with short reporting delay from rapid death with long reporting delay. Nevertheless, it is remarkable how much information about 34 free parameters our networks can extract from seeing only about 70 time steps of real data (see Fig 4).
Finally, our results corroborate the timing of intervention measures and the gradual reduction in transmission rate as observed in [8]. According to our estimates, the lifting of measures around May 6 would have led to an approximately 40% increase in the transmission rate, as assumed by our prior. However, since the spreading rate at t_4 had already decreased to a median of 0.09 (95%-CI [0.05–0.15]), the increase to a median of 0.13 (95%-CI [0.05–0.28]) does not lead to exponential growth of infections.