(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .



Evolutionary dynamics of dengue virus in India [1]

['Suraj Jagtap', 'Department Of Chemical Engineering', 'Indian Institute Of Science', 'Bengaluru', 'Karnataka', 'Chitra Pattabiraman', 'Infectious Disease Research Foundation', 'Arun Sankaradoss', 'National Centre For Biological Sciences', 'Tata Institute Of Fundamental Research']

Date: 2023-04

More than a hundred thousand dengue cases are diagnosed in India annually, and about half of the country’s population carries dengue virus-specific antibodies. Dengue propagates and adapts to the selection pressures imposed by a multitude of factors that can lead to the emergence of new variants. Yet, there has been no systematic analysis of the evolution of the dengue virus in the country. Here, we present a comprehensive analysis of all DENV gene sequences collected between 1956 and 2018 from India. We examine the spatio-temporal dynamics of India-specific genotypes, their evolutionary relationship with global and local dengue virus strains, interserotype dynamics and their divergence from the vaccine strains. Our analysis highlights the co-circulation of all DENV serotypes in India with cyclical outbreaks every 3–4 years. Since 2000, genotype III of DENV-1, cosmopolitan genotype of DENV-2, genotype III of DENV-3 and genotype I of DENV-4 have been dominating across the country. Substitution rates are comparable across the serotypes, suggesting a lack of serotype-specific evolutionary divergence. Yet, the envelope (E) protein displays strong signatures of evolution under immune selection. Apart from drifting away from its ancestors and other contemporary serotypes in general, we find evidence for recurring interserotype drift towards each other, suggesting selection via cross-reactive antibody-dependent enhancement. We identify the emergence of the highly divergent DENV-4-Id lineage in South India, which has acquired half of all E gene mutations in the antigenic sites. Moreover, the DENV-4-Id is drifting towards DENV-1 and DENV-3 clades, suggesting the role of cross-reactive antibodies in its evolution. Due to the regional restriction of the Indian genotypes and immunity-driven virus evolution in the country, ~50% of all E gene differences with the current vaccines are focused on the antigenic sites. Our study shows how the dengue virus evolution in India is being shaped in complex ways.

Dengue is a mosquito-borne disease with four closely related serotypes of the virus (DENV-1-4). Further, cross-reacting dengue antibodies from a previous infecting dengue serotype can protect or enhance infection from other serotypes. This can force the emergence of new dengue variants that find ways to escape the immune action or take advantage of it. In endemic countries like India, high rates of previous dengue infection can drive the evolution of dengue serotypes in complex ways. We compare all published dengue virus sequences to understand how new variants of dengue are emerging in India. Dengue cases and corresponding viruses display triennial surges. Further, the dengue envelope protein for each serotype shows recurring divergence and reversal towards its ancestral strain over a three-year window. Such fluctuations are also correlated among the dengue serotypes in India and could arise from the changing levels of cross-reactive antibodies. This, combined with the regional exchange of the virus among Asia-Pacific countries, has led to the emergence of India-specific DENV lineages, including a new DENV-4 (Id) variant. This has also contributed to significant variations in the epitope regions of the current dengue viruses in India compared to the vaccines with implications for their efficacy.

Our group has recently published 119 whole-genome dengue sequences from clinical samples across four different sites in India from 2012 to 2018 [ 39 ]. This effort has substantially increased the number of whole-genome dengue sequences from India (from 65 to 184 genomes) that now allows careful examination of the evolutionary dynamics of the dengue virus in the country. In this study, we compiled all available whole dengue genomes (n = 184) and E gene (n = 408) sequences to generate the most comprehensive dataset of dengue virus sequences to date from India. We performed spatio-temporal analysis using these sequences to understand the spread of the dengue virus. We also analysed how the virus evolves from its ancestors and in the presence of multiple serotypes. Our work highlights how dengue virus is evolving in India.

Large divergence of prevalent dengue genotypes can have significant implications for vaccine design and development [ 34 , 35 ]. Multiple dengue vaccines targeting all four serotypes have been developed and are currently at different stages of clinical trials [ 36 ]. These vaccines are based on the old dengue isolates from outside South Asia. In the absence of efficacy studies in India, it remains unclear whether they will induce optimal levels of neutralizing antibodies against the dengue viruses circulating in India. Apart from not providing sufficient protection against dengue infection, some vaccine candidates can even lead to enhanced disease through ADE upon subsequent infection, as seen in the case of the CYD-TDV vaccine [ 37 , 38 ]. Yet, differences at the antigenic sites between vaccines and prevalent Indian dengue strains have not been investigated to date.

In spite of being a hotspot of dengue infections, the scarcity of dengue genomic data from India has limited our understanding of dengue virus evolution. Previous dengue studies in India have focused on regional outbreaks [ 7 – 9 , 31 – 33 ], which are dominated by single or closely related strains due to the limitations of a short collection period. The persistence of all serotypes in a high seroprevalence background has been shown to manifest in immunity-driven co-evolution of dengue virus strains at the city-scale [ 11 ]. In the absence of longitudinal analysis of dengue viral diversity, it remains unknown, whether such selection pressures are shaping dengue virus divergence at a country-wide scale in India.

Complex population immunity against the dengue virus can also modulate the levels of annual infections and caseload. Cyclic dengue outbreaks in the endemic regions occur every 2–4 years, often associated with serotype/genotype replacement, where serotype/genotype dominance changes during the subsequent outbreak [ 18 – 21 ]. This has been attributed to a combination of long-term protection from homotypic dengue infection albeit short-term protection (up to 2 years) from the heterotypic secondary infection [ 22 – 24 ]. This allows the heterotypic serotype to manifest an outbreak depending on the prevalence of serotypes and population immunity in the region [ 18 ]. During serotype replacement, ADE can also play a role in increasing dengue cases. Increased dengue viremia levels observed during ADE [ 25 – 27 ] can contribute to increased transmission of the virus [ 28 , 29 ]. Therefore, ADE related effects can further make the cyclic pattern of outbreaks more prominent. However, whether this advantage also leads to the evolution of the virus, remains unknown. Therefore, knowledge of longitudinal prevalence, serotype distributions, and prior serotype of infection can help us in understanding the evolution of the dengue virus and predicting future outbreaks [ 30 ].

Global dengue virus evolution is modulated by pathogen transmission bottlenecks and immunological pressures [ 10 , 11 ]. An increase in globalization and human mobility can lead to the global spread of the emerging dengue virus strains. On the other hand, the acute nature of the disease, limited travel range and restriction to tropical regions of the vector can constrain the virus spread. Like other vector-borne virus infections, the dengue virus switches its environment due to horizontal transmission, exerting a strong purifying selection pressure [ 10 , 12 , 13 ]. In the human host as well as at the larger population scale, pre-existing immunity can contribute to the emergence of immune escape variants. Heterotypic immunity can also shape the co-evolution of dengue serotypes contingent on the level of cross-reactive antibodies and the antigenic similarity between the infecting serotypes during primary and secondary infection [ 11 , 14 ]. Further, antibody-dependent enhancement (ADE) under sub-optimal levels of cross-reactive antibodies can confer a selective advantage to antigenically related serotypes [ 15 – 17 ].

Dengue infections have increased dramatically in the last two decades and are expected to rise further as they spread to newer regions fuelled by urbanization and travel [ 1 ]. About half of the world population is at risk of dengue infections [ 1 , 2 ]. Estimates from 2010 claim 390 million annual dengue infections worldwide, of which only 96 million cases were reported clinically [ 3 ]. Around one-third of these infections were estimated to be from India [ 3 ], though most of them go unreported [ 4 ]. Dengue is endemic in almost all states in India [ 5 , 6 ]. All four antigenically distinct serotypes (DENV-1-DENV-4) of the virus that display significant immunological cross-reactivity due to 65–70% homology have been reported from various parts of the country [ 7 – 9 ]. Combined with a complex transmission cycle and high dengue seroprevalence [ 5 ], dengue evolution in the country has been shaped in complex and unexpected ways though this remains poorly understood.

Homo-dimeric structures of the envelope proteins were obtained by homology modelling using the SWISS-MODEL server [ 53 ]. Template PDB structures were selected based on global model quality estimates and QMEAN statistics. Vaccine strain information and template selection for each serotype is shown in the S3 Table . Sites with differences in >10% of sequences were identified as significant differences and were mapped onto the envelope protein structure using PyMOL (v2.4.1) [ 54 ].

Experimentally determined epitopes for all dengue serotypes were obtained from the Immunome Browser tool available on the Immune Epitope Database [ 50 ]. B cell and T cell epitopes were selected based on their response frequency (RF) score, calculated as reported earlier [ 51 ]. Epitopes having RF-score > 0.25 were selected as dominant epitopes for further analysis. For the epitopes examined in multiple studies, the RF-score was calculated by combining the number of subjects from all the studies. Apart from these known epitopes, we also included 77 sites in which residue variation conferred antigenic effects [ 52 ].

Pearson’s correlation coefficient was obtained between the serotype and inter-serotype distance dynamics. The robustness of the correlations was checked by three methods. First, by randomly deleting up to 3 data points from each dynamics (1000 bootstraps). Second, by estimating the cross-correlation between individual traces from two comparing groups randomly with 1000 bootstraps. As a control, we also checked correlations between the groups by randomly shuffling the time series of normalized hamming distances (before correlating) to ensure that the correlations do not arise merely from the yearly fluctuations (1000 bootstraps).

Temporal dynamics of E gene amino acid variation was examined in the sequences from South India due to the availability of a larger dataset (n = 164). Sequences with at least 50% coverage (mean coverage of 93%) of the E gene were selected for the analysis. The Hamming distance between the pair of sequences was divided by the length of the overlapping region between the pair. This was further converted to the z-score using the mean and standard deviation to obtain the normalized distance. To extract the dynamics within the serotype, we measured the normalized distance of new variants over the years with respect to the ancestral sequences (sequences from 2007/08 were used as ancestors due to a lack of sufficient sequences before that). The inter-serotype distance was calculated by selecting the sequences from two different serotypes during a particular year. Sample bootstrapping (n = 100) was used to determine the median and standard error of the normalized distances for each year. The time period of oscillations was calculated using an autocorrelation function with different lags for each trace. A lag with the maximum correlation coefficient was assigned as the period of oscillation for that trace. Bootstrap replicates (n = 100) were used to obtain the distribution of the time period. We performed the same analysis with the CprM gene sequences (position 55 to 155 of the coding sequence) from South Indian sequences (n = 411).

Bayesian analysis was performed with the E gene, NS5 gene and whole-genome sequences from dataset B.1 using the BEAST v1.8.3 [ 47 ] to get the substitution rates for each gene. Five sets were generated for each serotype by selecting 80% sequences randomly, the substitution rate was calculated for each run, and the average substitution rate was reported. The constant rate clock model was selected based on the AICM values of the Bayes factor and harmonic means ( S2 Table ). Markov Chain Monte Carlo (MCMC) was run for 10 7 generations for each run, and the first 10% of samples were discarded as burn-in. The phylogeographic movement of the virus across the countries was obtained using the Bayesian stochastic search variable selection procedure implemented in BEAST v1.8.3 with whole genome sequences from dataset B.1 (MCMC chain length ~10 8 ). SpreaD3 v0.9.7 [ 48 ] was used to visualize the spatio-temporal dynamics of the virus. Tracer v1.6 was used to check the convergence of the chains. Effective sample sizes for the parameters of interest were greater than 200.

Multiple sequence alignment was performed with protein-coding sequences from dataset B using MUSCLE v3.8.425 [ 42 ] implemented in AliView v1.25 [ 43 ]. Alignments were manually checked for insertion and deletion errors. Maximum likelihood trees were generated using IQ-TREE v1.6.10 [ 44 ] with 1000 bootstraps. A general time-reversible substitution model with unequal base frequency and gamma distribution for rate heterogeneity (GTR+F+I+G4) was selected out of 88 models available using jModelTest [ 45 ] implemented in IQ-TREE v1.6.10 [ 44 ] based on the Bayesian information criterion. Sylvatic strains EF457905 (DENV-1), EF105379 (DENV-2) and JF262779-80 (DENV-4) were used as outgroups to root the respective trees. The root for DENV-3 was obtained using the best fitting root by the correlation method in TempEst [ 46 ]. The maximum likelihood phylogenetic trees were used to obtain the root-to-tip distances using TempEst. Trees were visualized using Figtree v1.4. We assigned new lineages within a genotype if the difference between the sequences from each phylogenetic branch is more than 3% at the nucleotide level and 1% at the amino acid level.

State-wise number of dengue cases and deaths over the period 2001 to 2022 was retrieved from https://www.indiastat.com/ . Peaks in the number of cases and deaths were assigned for a particular year if the number of cases/deaths in that year increased more than 50% from the previous year. Due to the COVID-19-related disruptions from 2020 onwards, the cases from 2019–2022 were excluded for determining the peaks [ 41 ].

Global dengue protein-coding sequence records that contain sample collection dates were obtained from the ViPR database [ 40 ]. After removal of identical sequences, this dataset included DENV-1 (n = 1800) from 1944–2018, DENV-2 (n = 1395) from 1944–2018, DENV-3 (n = 823) from 1956–2018, DENV-4 (n = 220) from 1956–2018, representing a total of 4238 protein-coding sequences. Dataset B.1: For analysis of spatio-temporal dynamics, protein-coding sequences specific to the Indian genotypes were selected from dataset B. We included all unique protein-coding sequences from India and augmented the set with 362 randomly sampled global sequences belonging to the Indian genotypes ( S1 Table ). This included a total of 522 sequences from DENV-1 genotype I (n = 142), DENV-1 genotype III (n = 93), DENV-2 genotype Cosmopolitan (n = 144), DENV-3 genotype III (n = 96), and DENV-4 genotype I (n = 47).

All the published Indian dengue sequences were obtained from the ViPR database [ 40 ]. These sequences include both whole-genome sequences and gene segments. Only sequences with location and collection date were used for the analysis (about 88.9% of all sequences). Samples for these sequences were collected between 1956 and 2018 and represent DENV-1 (n = 840), DENV-2 (n = 877), DENV-3 (n = 746) and DENV-4 (n = 179) serotypes.

Results

Dynamics of E gene evolution displays recurring variation Taking a cue from the divergence of DENV-4-Id, we examined whether the high seroprevalence can play a role in the evolution of dengue in India. Dengue cross-reactive immunity has been shown to shape the antigenic evolution of dengue for the E gene at a city level [11]. Since serotype replacements in the different regions of the country display temporal fluctuations (Fig 1C), we asked whether there are signatures of immunity-driven evolution of dengue serotypes across large endemic areas. In the absence of antigenicity data, we evaluated the variation in amino acid sequences of the E gene longitudinally. To validate this approach, we rely on neutralization antibody titers of sera collected from humans (post-monovalent vaccination) and African green monkeys (post-infection) tested against a panel of viruses [72]. We found general correlation between the antibody titers and the similarity between the E protein sequences with the two datasets of antibody titer data (Pearson’s correlation coefficient: 0.530, S9 Fig). In South India, we find that, in general, the E gene diverges from the ancestral sequence for all serotypes, but this divergence fluctuates over time (Fig 4A). Overall, in our dataset, the E gene sequences drift away from their respective ancestral sequences, evolve to be similar and then diverge repeatedly. This behaviour was pronounced in DENV-2-4 with an estimated time period (peak-to-peak) of about three years (2.92 ± 0.58 years for DENV-2, 3.55 ± 0.9 years for DENV-3 and 2.99 ± 0.64 years for DENV-4). While we could not estimate a time period for a similar E gene dynamic in the case of DENV-1, we did note a peak in divergence in 2012–13 arising from a singular genotype I outbreak during a period of genotype III prevalence. PPT PowerPoint slide

PNG larger image

TIFF original image Download: Fig 4. Dynamics of E gene amino acid variation in South India between 2007 and 2016. (A) Normalized amino acid distances with respect to the 2007/2008 dengue strains is plotted in black. Hamming distances were normalized with the sequence length and converted to a z-score. Distances were bootstrap sampled (n = 100) to calculate the reported median. Error bars represent standard error. The blue line represents a linear regression fit to the data with 80% (gray shade) confidence interval. D i indicates the distance dynamics between DENV-i from its (2007/2008) ancestral sequence. (B) Year-wise normalised interserotype distances (as z-score) calculated by randomly selecting the sequences from each serotype every year (100 bootstraps) is plotted. The inset heatmap depicts the time period of oscillation binned yearly (range of 0 to 5 years). The mean and standard deviation (in brackets) of the time period distribution is denoted next to the heatmap. D ij indicates the interserotype distance dynamics between i-th and j-th serotype. (C) Median of the Pearson’s correlation coefficients between the distance dynamics D i and D ij is shown as a heatmap with corresponding colormap. A schematic depicting the relative evolution of dengue serotypes in two-dimensional sequence space with positive and negative correlation between D ij and D i or D j is shown at the bottom. Central point represents the ancestral strain, and the arrow represents the direction of the virus evolution. https://doi.org/10.1371/journal.ppat.1010862.g004 Interestingly, the observed dynamics of the E gene amino acid variations were also correlated among the serotypes (Figs 4C and S10). The high correlation suggests that the divergence in each serotype was synchronous, i.e., when DENV-1 drift away (or converges) from its ancestral sequences, a similar dynamics is observed for DENV-2-4 with respect to their ancestral sequences (Figs 4C and S10). To understand whether the evolution of the dengue virus is shaped by the interserotype cross-reactive immunity in the population, we examined the dynamics of interserotype distance D ij (year-wise hamming distance between DENV-i and DENV-j) (Fig 4B). We observe that the interserotype distances also displayed fluctuations over a similar time period of 2–4 years (2.54 ± 0.5 years for D 12 , 4.13 ± 1.18 years for D 13 and 3.11 ± 0.58 years for D 23 ), suggesting an interplay between the serotypes at the population level. We also checked whether the serotype sequences became similar to each other or drifted apart based on the correlations between the serotype and interserotype evolutionary dynamics. For instance, when DENV-1 and DENV-2 display divergence (or convergence) with respect to their ancestral sequences (Fig 4A), the distance between the DENV-1 and DENV-2 (D 12 ) also increase (or decrease). While DENV-1 dynamics did not correlate with D 13 and D 14 dynamics, DENV-2 strongly correlated with all other interserotypic dynamics. In contrast, DENV-3 showed a negative correlation with D 13 and D 34 dynamics suggesting that as DENV-3 diverges, it moves closer to the circulating DENV-1 and DENV-4 strains (Figs 4C and S10). These signatures are specific to the immunologically dominant E gene. For example, when we compare the CprM gene sequences from South India, the fluctuations were not significant. When CprM Hamming distance fluctuations are observed as in the case of D 1 dynamics, it is linked to the transient introduction of genotype I in 2012 (S11A Fig). Consistent with this, we found no significant correlations between the within or inter-serotypic viral dynamics, except for DENV-4 (S11B and S11C Fig). Due to the availability of only 21 sequences for DENV-4 (and n = 3 between 2008–2015), we do not consider these correlations as significant. We interpret these coupled fluctuations between dengue serotypes in light of population-level cross-reactive immunity and antibody-dependent enhancement. When homotypic immunity is present in the population, the serotypes drift apart, manifesting in positive correlations between serotype divergence and interserotype dynamics, as observed for DENV-1 and DENV-2 (Fig 4C). However, when population immunity is poor against a particular serotype (as with a new introduction or waning levels), similarity to that serotype confers an advantage due to the presence of cross-reactive antibodies and associated ADE [14,15]. This is consistent with the negative correlation observed between DENV-3 and interserotype dynamics D 13 and D 34 with dropping DENV-3 prevalence and its replacement by DENV-1 and DENV-4 in South India (Figs 4C and 1B). Indeed, recent reports have argued the interserotypic convergence of dengue antigenicity to be correlated to the outbreaks [11].

[END]
---
[1] Url: https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1010862

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.

via Magical.Fish Gopher News Feeds:
gopher://magical.fish/1/feeds/news/plosone/