(C) PLOS One
This story was originally published by PLOS One and is unaltered.
Using viral diversity to identify HIV-1 variants under HLA-dependent selection in a systematic viral genome-wide screen [1]
['Nadia Neuner-Jehle', 'Department Of Infectious Diseases', 'Hospital Epidemiology', 'University Hospital Zurich', 'Zurich', 'Institute Of Medical Virology', 'University Of Zurich', 'Marius Zeeb', 'Christian W. Thorball', 'Precision Medicine Unit']
Date: 2024-09
The pathogenesis of HIV-1 infection is governed by a highly dynamic, time-dependent interaction between the host and the viral genome. In this study, we developed a novel systematic approach to assess the host-virus interaction, using average pairwise viral diversity as a proxy for time since infection, and applied this method to nearly whole viral genome sequences (n = 4,464), human leukocyte antigen (HLA) genotyping data (n = 1,044), and viral RNA load (VL) measurements during the untreated chronic phase (n = 829) of Swiss HIV Cohort Study participants. Our systematic genome-wide screen revealed for 98 HLA/viral-variant pairs a signature of immune-driven selection in the form of an HLA-dependent effect of infection time on the presence of HIV amino acid variants. Of these pairs, 12 were found to have an effect on VL. Furthermore, 28/58 pairs were validated by time-to-event analyses and 48/92 by computational HLA-epitope predictions. Our diversity-based approach allows a powerful and systematic investigation of the interaction between the virus and cellular immunity, revealing a notable subset of such interaction effects. From an evolutionary perspective, these observations underscore the complexity of HLA-mediated selection pressures on the virus that shape viral evolution and pathogenesis.
The intricate interplay between viruses and the human immune system is reflected in dynamic associations between the viral and the human genomes. These often take the form of escape dynamics, in which the virus acquires mutations that allow it to evade immune recognition. We developed a novel viral diversity-based method to screen for such interactions across the viral genome systematically and applied it to a unique dataset of HIV-1 sequences and human leukocyte antigen (HLA) variants. We could identify time-dependent interactions between 98 pairs of HLA and viral variants. Among these pairs, 12 were associated with the concentration of viral RNA, longitudinal time-to-event analyses confirmed 28, and 48 were consistent with computational predictions of viral peptide binding to HLA molecules. Our results highlight how the highly dynamic interaction between the viral genome and the immune system shapes viral evolution, and our approach offers new opportunities to systematically study such interactions from real-world cross-sectional data.
Competing interests: K.J.M. has received travel grants and honoraria from Gilead Sciences, Roche Diagnostics, GlaxoSmithKline, Merck Sharp & Dohme, Bristol-Myers Squibb, ViiV, and Abbott; and the University of Zurich has received research grants from Gilead Sciences, Novartis, Roche, and Merck Sharp & Dohme for studies in which K.J.M. serves as principal investigator, and advisory board honoraria from Gilead Sciences and ViiV. A.R. reports support to his institution for advisory board and/or travel grants from MSD, Gilead Sciences, Pfizer, and Moderna, and an investigator-initiated trial (IIT) grant from Gilead Sciences. All honoraria went to his home institution, not to A.R. personally, and all honoraria were provided outside of the submitted work. M.S. reports advisory board consultations from Gilead, ViiV, MSD, paid to his institution, and travel grants for conferences from Gilead, paid to his institution. E.B.’s institution has received research grants from Gilead and Merck unrelated to this work; E.B.’s institution has also received consultancy fees and travel grants from Gilead, ViiV, Merck, Pfizer, AstraZeneca, Moderna, AbbVie, and Eli Lilly. J.N. has received travel grants from Gilead. H.F.G. has received grants from the Yvonne-Jacob Foundation, the Clinical Research Priority Program of the University of Zurich, and Gilead Sciences, and, outside of this study, grants for unrestricted research from the Swiss HIV Cohort Study, the Swiss National Science Foundation, the National Institutes of Health, the Bill and Melinda Gates Foundation, Gilead, and ViiV; and personal fees as a consultant for Merck, ViiV Healthcare, and Gilead Sciences and as a member of the Data and Safety Monitoring Board for Merck. H.F.G.’s institution has received educational grants unrelated to this work from Gilead, ViiV, MSD, AbbVie, Pfizer, and Sandoz. C.P. has received fellowships from the Collegium Helveticum. R.D.K. has received grants from Gilead Sciences, the National Institutes of Health, the Swiss National Science Foundation, and the Swiss HIV Cohort Study. N.N.J. has received support from the Swiss National Science Foundation and the Swiss HIV Cohort Study. All other authors report no potential conflicts of interest.
Funding: This study has been financed within the framework of the Swiss HIV Cohort Study (SHCS), which is supported by the Swiss National Science Foundation (SNSF; grant #201369), by SHCS project #910, and by the SHCS research foundation. The data are gathered by the five Swiss University Hospitals, two Cantonal Hospitals, 15 affiliated hospitals, and 36 private physicians (listed in http://www.shcs.ch/180-health-care-providers ). This work was further supported by the SNSF (grant numbers 177499 to H.F.G. in the framework of the SHCS; 179571 and 141067 to H.F.G.; and 207957 and 155851 to R.D.K.); the Yvonne-Jacob Foundation (to H.F.G.); the University of Zurich Clinical Research Priority Program for Viral Infectious Disease, the Zurich Primary HIV Infection Cohort Study (to H.F.G.); and an unrestricted research grant from Gilead Sciences (to the SHCS Research Foundation). N.N.-J.'s salary was financed in part by grants from the SNSF (207957) and the SHCS (project #910). C.P. was supported by a fellowship from the Collegium Helveticum. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Data Availability: The data generated or analyzed during the current study cannot be shared publicly due to their sensitive nature and privacy concerns (see https://www.shcs.ch/294-open-data-statement-shcs ). Investigators with a request for selected data can send a proposal to the Swiss HIV Cohort Study ( www.shcs.ch/contact ). The provision of data will be evaluated by the Scientific Board of the Swiss HIV Cohort Study and the study team and will be subject to Swiss legal and ethical regulations. The main coding scripts for this study will be made available in a public GitHub repository (under https://github.com/nneune/HLA_APD ).
Based on comprehensive data from the Swiss HIV Cohort Study (SHCS) [ 21 ], we applied a previously established and validated viral genetic diversity score (average pairwise diversity, APD [ 22 , 23 ]) as a reliable proxy for time since infection (TSI). Determining the TSI can be challenging as people may live with HIV for many years before being diagnosed [ 22 , 23 ]. HIV diversity increases throughout the infection, and this diversity can be used to derive the TSI. Of several methods that utilize viral diversity, APD was characterized by a mean absolute error of less than one year and higher sensitivity and specificity than, for example, the fraction of ambiguous nucleotides, FAN, method [ 22 , 23 ]. We used APD to systematically assess the time-dependence of HLA/HIV-variant associations. This has allowed us to gain novel insights into the evolving relationship between the human immune system and the ever-adapting HIV.
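To make the idea of a diversity score concrete, the following minimal sketch averages the per-site probability that two randomly drawn reads differ, 1 − Σ f², over a set of sites. The function name, the input format (one nucleotide-frequency dictionary per site), and the absence of any site selection or coverage filtering are assumptions made for illustration; the published APD method [ 22 , 23 ] defines its own site selection and thresholds, which are not reproduced here.

```python
def average_pairwise_diversity(site_frequencies):
    """Mean over sites of the probability that two random reads differ.

    site_frequencies: list of dicts mapping nucleotide -> within-host
    frequency at one site (hypothetical input format).
    """
    if not site_frequencies:
        raise ValueError("need at least one site")
    per_site = []
    for freqs in site_frequencies:
        total = sum(freqs.values())
        # Normalise so that frequencies sum to 1 at each site.
        probs = [f / total for f in freqs.values()]
        # Probability that two independent draws show different nucleotides.
        per_site.append(1.0 - sum(p * p for p in probs))
    return sum(per_site) / len(per_site)


# Hypothetical example: one monomorphic site and two polymorphic sites.
sites = [
    {"A": 1.0},
    {"A": 0.5, "G": 0.5},
    {"C": 0.9, "T": 0.1},
]
apd = average_pairwise_diversity(sites)
print(f"APD = {apd:.4f}")
```

A sample with more polymorphic sites, or with more balanced frequencies at those sites, yields a higher score, which is the property that lets diversity act as a proxy for time since infection.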
However, as most of these interactions were found as associations between HIV mutations and HLA alleles in cross-sectional studies, their temporal dynamics remain unclear. Previous longitudinal studies described HIV escape mutations occurring within three years after infection. In the presence of a “protective” HLA allele, mutations occurred more rapidly, and in its absence, they reverted more quickly upon transmission [ 16 – 20 ]. These studies, however, focused on specific viral epitopes in small populations and therefore lacked the systematic approach of genome-wide screens. We developed a novel viral diversity-based approach that allows us to investigate the time-dependent effects of HLA-associated HIV adaptation from cross-sectionally sampled viral genomes.
The rapid evolution and high diversity of HIV pose challenges to the development of universal vaccines against the virus. In addition, HIV disease progression varies significantly among individuals and has been associated with host factors [ 1 – 3 ] and viral evolution [ 4 ]. The human genome, specifically the human leukocyte antigen (HLA) genes which encode the major histocompatibility complex (MHC), has emerged as a consistent predictor of HIV disease progression [ 3 , 5 – 9 ]. HLA class Ι genes (A, B, and C) have been associated with specific HIV mutations that enable viral immune escape by disrupting antigen processing or epitope-MHC binding, or by reducing antigen recognition by T cell receptors [ 6 , 9 ]. These mutations provide the virus with a selective advantage in the presence of a matching HLA allele but may be associated with reduced fitness in its absence [ 9 ]. Regarding HLA class ΙΙ, the current evidence is conflicting: studies have computationally predicted CD4 + T-cell-driven escape [ 10 ], but this could not be confirmed in vitro [ 11 ]. Another study showed that linkage disequilibrium of HLA alleles may explain associations with class ΙΙ [ 12 ].
Because statistical analysis Ιb was restricted to HIV-1 subtype B and included proviral sequences, we tested in a sensitivity analysis how subtype and sequence origin affected the HLA-APD interaction for the 98 identified pairs. When excluding the 61 proviral DNA sequences, we observed almost identical results (ρ spearman = 0.96, p<0.001; S5D Fig ). When we inferred HLA-APD interaction effects for 320 HIV-1 non-B subtype sequences (see S3 Table ), we observed a moderate correlation with the effects for subtype B (ρ = 0.34, p = 0.005; S5E Fig ), and five pairs retained significance, reflecting potential subtype differences and limited statistical power for non-B subtypes. When the sequences of all subtypes (B and non-B) were pooled, a strong correlation was observed with the effects for subtype B (ρ = 0.88, p<0.001; S5F Fig ). In the pooled analysis, the majority of associations (80 of 98) from the subtype B-only analysis (Ιb) were recovered. All 80 pairs exhibited significant HLA-APD interaction effects even after adjusting for the first ten viral PCs, thereby accounting for viral population structure. Lastly, HLA alleles associated with the same viral variant were examined in multivariable logistic regressions to test for the effect of linkage disequilibrium among HLA alleles. In these analyses, none of the class ΙΙ HLA alleles retained their significant interaction with APD ( S6 Fig ).
A) Log odds ratio (OR) of the Average Pairwise Diversity (APD)-HLA interaction versus the Log Ratio of EL Rank of mutation/EL Rank of consensus. Upper-right section (yellow) indicates expected major histocompatibility complex (MHC) escape mutations. Labels describe HLA/HIV-variant pairs of interest (same as in Fig 1B ) and the epitope position of mutation (ranging from 1 to 9). B, C) Peptides are 9-mer HIV sequences, including the mutation (dark color; Pol432R/ RT277R (B) or Rev57E (C)) or consensus (light color; Pol432K/ RT277K (B) or Rev57G (C)) at different positions (ranging from 1 to 9). The rank was categorized into strong binder (≤0.5%), weak binder (≤2.0%), and non-binder (>2.0%). Non-binding positions are omitted here. All binding prediction computations are derived from NetMHCpan-4.1.
We used computational MHC class Ι binding prediction to assess whether the identified HLA-allele/viral-variant pairs could be explained mechanistically. Among the 92 pairs tested (excluding MHC class ΙΙ, n = 6), 66 pairs (98 epitopes) were predicted to exhibit at least "weak" binding of at least one non-mutated or mutated 9-mer epitope to the corresponding HLA allele (see S2 Table for more information on epitopes). For these, the impact of the mutation on predicted binding was quantified as rank ratios between mutation and consensus (non-mutated) sequence ( Fig 5A ). Compared to the HLA-APD interaction effects inferred in analysis Ιb, 48/66 pairs exhibited epitopes with effects in the same direction as the interaction effects, i.e., weaker binding upon mutation for escape mutations (42/48) and vice versa for mutations with a negative HLA-APD interaction effect (6/48), and 18 pairs had epitopes with only divergent effects, i.e., stronger binding upon mutation despite escape mutation characteristic (Figs 5A and S5C ). Overall, there was a weak correlation between the APD-HLA interaction effects and rank ratios (ρ spearman = 0.11, p = 0.38; S5C Fig ). For 56/98 epitopes (41/66 pairs), the mutation did not change the binding category (weak/strong binding) corresponding to the binding ranks ( S4 Fig ). For 24 epitopes (22 pairs), the binding category was predicted weaker upon mutation, e.g., Pol432R:A*03:01 position 9 (still strong binder; Fig 5B ), and for 18 epitopes (17 pairs) the binding category was predicted stronger upon mutation, e.g., Rev57E:B*40:01 position 2 ( Fig 5C ).
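The rank-based comparison described above can be sketched as follows. The category thresholds (strong binder ≤0.5%, weak binder ≤2.0%, non-binder >2.0%) are those stated for the binding ranks in this article; the function names, the use of the natural logarithm, and the example rank values are assumptions for illustration and are not output of NetMHCpan-4.1.

```python
import math


def binding_category(el_rank_percent):
    """Categorise a predicted percentile rank using the thresholds in the text."""
    if el_rank_percent <= 0.5:
        return "strong"
    if el_rank_percent <= 2.0:
        return "weak"
    return "non-binder"


def log_rank_ratio(rank_mutant, rank_consensus):
    """Log ratio of mutant rank to consensus rank.

    A positive value means the mutant peptide ranks worse (weaker predicted
    binding) than the consensus peptide. Natural log is an assumption here.
    """
    return math.log(rank_mutant / rank_consensus)


# Hypothetical ranks: binding weakens upon mutation but stays in the
# "strong" category, so the category alone would hide the change.
consensus_rank, mutant_rank = 0.1, 0.4
print(binding_category(consensus_rank), binding_category(mutant_rank))
print(f"log rank ratio = {log_rank_ratio(mutant_rank, consensus_rank):.2f}")
```

Reporting the continuous ratio alongside the category is useful because, as noted above, most mutations did not change the binding category even when the predicted rank shifted.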
Two examples illustrate the range of patterns ( Fig 4B and 4C ): Pol-432R developed quickly with an incidence rate of 32.3/100 person-years in participants with HLA-A*03:01, whereas the mutation occurred more slowly in participants with another HLA-A allele (4.75/100 person-years; HR 6.75, 95% CI [1.74, 26.19], p = 0.006) ( Fig 4B ). On the other hand, Rev57E arose more quickly in the absence of HLA-B*40:01 and not at all in the presence of the allele ( Fig 4C ).
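As a quick plausibility check on the Pol-432R figures, the crude ratio of the two reported incidence rates can be compared with the reported hazard ratio. The sketch below is not the Cox model used in the analysis; the helper function and its example inputs are hypothetical, and only the two rates are taken from the text.

```python
def incidence_rate_per_100py(events, person_years):
    """Incidence rate expressed per 100 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100.0 * events / person_years


# Rates reported in the text for Pol-432R (per 100 person-years).
rate_with_hla = 32.3      # participants with HLA-A*03:01
rate_without_hla = 4.75   # participants with another HLA-A allele

# Crude rate ratio; close to, but not the same quantity as, the Cox HR of 6.75.
crude_rate_ratio = rate_with_hla / rate_without_hla
print(f"crude rate ratio = {crude_rate_ratio:.2f}")

# Hypothetical illustration of the helper: 5 events over 20 person-years.
example_rate = incidence_rate_per_100py(events=5, person_years=20.0)
print(f"example rate = {example_rate:.1f} per 100 person-years")
```

The crude ratio of about 6.8 is of the same magnitude as the reported hazard ratio, which is what one would expect when hazards are roughly constant over follow-up.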
A) Log odds ratios (OR) of interaction effects between APD and HLA on the viral variant versus the Log Hazard Ratio of the respective pair. Significant HLA/HIV-variant interactions (p<0.05) are represented by whole and non-significant (n.s.) by empty symbols. HLA/HIV-variant pairs of interest (same as in Fig 1B ) are highlighted. B, C) Cumulative Hazards of acquiring HIV variant Pol432R (B) or Rev57E (C) in presence (dark blue (B) or dark red (C)) of respective HLA allele versus its absence (light blue (B) or light red (C)), are indicated by the respective lines, confidence intervals are illustrated by the shaded area. Censored events are displayed as vertical lines. P-values are shown as log-rank. “Number at risk” indicates the number of participants with outstanding events after the specified time in years.
Using longitudinally sampled HIV sequences, we evaluated the impact of HLA alleles on the hazards of HIV variant occurrence in analysis ΙΙΙ. Of the 98 previously identified HLA/HIV-variant pairs with a significant HLA-APD interaction ( Fig 2B and 3D ), we excluded 20 pairs that had fewer than two events (i.e., incident mutations) and 20 pairs that did not have at least one event per exposure group (presence/absence of HLA allele). Cox proportional hazards regression models revealed significant differences in the hazard of acquiring the viral variant over time in the presence or absence of the HLA allele for 28 of the remaining 58 pairs (hazard ratio [HR] IQR [7.06, 16.99]). The median time to mutation was shorter in the presence of the HLA allele (median [IQR]: 1.36 [1.04, 1.65] years) than in participants without the HLA allele (median [IQR]: 1.44 [1.29, 1.48] years). Overall, the correlation between the APD-HLA interaction ORs and the HR of the longitudinal analysis was moderate (ρ spearman = 0.48, p<0.001; S5B Fig ), and the directionality of the significant HRs was consistent with the directionality of the ORs ( Fig 4A ).
A) Multivariate linear regressions of eight HLA/HIV-variant pairs of interest (same as in Fig 1B ), depicted as their association and interaction effect on viral load (VL). B, C) VL distribution over four groups for pair Pol432R:A*03:01 (B) and Rev57E:B*40:01 (C). The absence of human leukocyte antigen (HLA) or HIV variant is coded as ‘0’ and the presence as ‘1’. Each participant is represented by one point (mean log10 VL), and violin plots show distribution, and boxplots show median and IQR of the distribution. Coloration according to panel A. D, E) Multiple linear regression estimates of HIV amino acid (AA) variant on VL (D) or HLA allele on VL (E) and HLA/HIV-variant interaction on VL plotted for Gag242N:B*57:01 and all 98 HLA/HIV-variant pairs with significant average pairwise diversity (APD)-HLA interactions (analysis Ιb). The 12 HLA/HIV-variant pairs with a significant interaction effect on viral load (p<0.05) are represented by whole symbols and all non-significant (n.s.) by empty symbols. Shapes indicate HIV protein. Coloration according to panel A. Upper-left section in E (yellow) indicates expected HLA escape mutations.
In linear regression models, HLA-B*57:01 and B*57:03 were highly associated with lower VL (p<0.001, S3 Fig ) and showed significant interaction effects on VL with viral variants of the proteins Gag, Nef, Pol, and Vif. Assessing those effects among all 98 pairs with significant interaction effects between APD and HLA on the viral variant in analysis ΙΙ, we found 12 HLA/HIV-variant pairs in total with a significant interaction effect on VL (IQR regression coefficient (β) [0.43,0.61]) (Figs 3D and 3E and S2 Table ). All but one of the twelve pairs showed a positive interaction effect on VL, i.e. the VL was higher in HLA-carriers with the mutation and lower in HLA-carriers without the mutation. For instance, those participants with the HLA-A*03:01 allele had a significantly lower VL than those without it (β = -0.45, 95% CI [-0.81,-0.10], p = 0.013). However, when Pol432R was acquired, VL increased again in HLA-A*03:01-carriers (β = 0.60, 95% CI [0.22, 0.98], p = 0.002) ( Fig 3A and 3B ). Interestingly, the Gag242N:B*57:01 pair exhibited no HLA-APD interaction effect, yet demonstrated increased VL levels upon mutation, following the trend of the other eleven pairs with positive interactions ( Fig 3A and 3E ). The pair Rev57E:B*40:01, with a negative interaction effect on VL, showed increased VL levels in the presence of HLA-B*40:01 (β = 0.31, 95% CI [0.10, 0.52], p = 0.003), but upon mutation Rev57E, the VL decreased again (β = -0.53, 95% CI [-1.00, -0.06], p = 0.027) ( Fig 3A and 3C ). Overall, we saw that the interaction effects of HLA and APD on the HIV variant corresponded with the HLA/HIV-variant interaction effects on VL (ρ spearman = 0.94, p<0.001 for significant HIV-HLA interactions; S5A Fig ).
Notably, of the remaining 14/98 pairs with significant HLA-APD interactions, four remained stable in absence of the HLA allele, and ten pairs even had negative interaction terms ( Fig 2D and S2 Table ). An example for this negative HLA/HIV-variant association and APD-HLA interaction effect is the variant Rev57E, as it was more evident over time in the absence of HLA-B*40:01 ( Fig 2B and 2C ). For 335 pairs, a significant HLA/HIV-variant association but no significant interaction with APD was estimated. For example, the associated pair Gag242N:B*57:01 ( Fig 2B ) showed no interaction with APD, suggesting a robust and early selection of Gag242N in HLA-B*57:01-carriers.
A) Shown are Gag242N:B*57:01 and 98 HLA/HIV-variant pairs with significant average pairwise diversity (APD)-HLA interactions (analysis Ιb). Odds ratios (OR), sorted by effect size and HIV protein, are derived from Fisher’s exact tests, and colored by HLA genes (class Ι gray, class ΙΙ red). Pairs of interest are highlighted (bold label, triangle symbol). B) Multivariable logistic regressions of eight HLA/HIV-variant pairs of interest are shown. The effect of HLA allele, APD, and their interaction on the occurrence of the viral variant is presented as odds ratios with 95% confidence intervals (OR [CI]). Sample sizes are indicated (n). C) Distribution of the pairs Pol432R:A*03:01 (blue) and Rev57E:B*40:01 (red) within the study population and the likelihood of observing the variant in the presence (dark color) or absence (light color) of the HLA allele over APD is shown. Labels indicate the sample sizes for each group. Dashed lines represent the fitted values of linear regressions, while plain lines show data with standard error. D) Effect of APD on presence of viral variant (estimated OR from analysis Ιb) in presence/absence of HLA allele. Coloration according to panel B. Positive interaction effects in yellow quadrant (= expected HLA escape mutation). *All estimates exceeding the third quartile+1.5 were adjusted to this threshold. Standardized APD used in panel B and D.
To assess the time dependence of HLA selection on HIV variants, we evaluated the interaction between HLA and APD on the odds of observing the viral variant. Among the 1,044 SHCS participants included in statistical analysis Ιb, we found 98 significant HLA/HIV-variant pairs among all 433 pairs with significant associations in Fisher’s exact tests ( Fig 2A and S2 Table ). Most of these pairs (84/98) had a positive interaction effect between HLA allele and APD (OR IQR [1.71,2.28]), i.e., a decreasing chance of HIV variant occurrence over time was observed in the absence of the HLA allele and an increasing chance was observed in its presence, indicating an HLA-dependent selection pressure on the viral variant ( Fig 2D ). The amino acid variant Pol432R is an example of this larger increase over time in the presence of HLA-A*03:01. For this pair, the HLA-APD interaction term was positive (OR 3.62, 95% CI [2.09, 6.94], p<0.001) ( Fig 2B ). In the absence of HLA-A*03:01, Pol432R was less likely to occur with increasing APD (OR 0.69, 95% CI [0.58, 0.81], p<0.001), but in the presence of the allele, this trend was reversed (OR 2.50, 95% CI [1.41, 4.45], p = 0.002) ( Fig 2C ).
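The three odds ratios reported for Pol432R:A*03:01 are linked by the usual algebra of an interaction term. Assuming a standard logistic model of the form logit P(variant) = b0 + b1·HLA + b2·APD + b3·(HLA × APD) (other covariates omitted; this exact parameterization is an assumption), the APD effect among allele carriers is exp(b2 + b3), that is, the product of the APD odds ratio in non-carriers and the interaction odds ratio. The short check below uses only the point estimates quoted above.

```python
import math

# Point estimates quoted in the text for Pol432R and HLA-A*03:01.
or_apd_without_hla = 0.69   # effect of APD when the allele is absent: exp(b2)
or_interaction = 3.62       # HLA x APD interaction term: exp(b3)

# On the log-odds scale the coefficients add, so the odds ratios multiply.
log_or_with_hla = math.log(or_apd_without_hla) + math.log(or_interaction)
or_apd_with_hla = math.exp(log_or_with_hla)

print(f"OR for APD in carriers = {or_apd_with_hla:.2f}")
```

The product, about 2.50, matches the odds ratio reported for APD in the presence of the allele, which illustrates how a positive interaction term reverses the trend seen in non-carriers.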
Most of the HIV variants from these 433 pairs were found in the proteins Pol (n = 131), Nef (n = 89), and Gag (n = 77). Among the HIV proteins considered, Vpr had the highest relative frequency of significant HLA/HIV-variant pairs (with 37 pairs/58,404 possible combinations), followed by Nef (89/165,478), Pol (131/365,496), and Gag (77/233,930). The protein Env had the lowest relative frequency (41/930,696) ( S1 Table ). Most variants (n = 365) were associated with class Ι HLA alleles (72 with HLA-A, 174 with HLA-B, and 119 with HLA-C), and the remaining 68 variants with class ΙΙ HLA alleles (5 with HLA-DPA1, 9 with HLA-DPB1, 14 with HLA-DQA1, 25 with HLA-DQB1 and 15 with HLA-DRB1) ( S2 Fig ). The strongest HLA/HIV-variant association was found for HLA-B*57:01 and Gag242N (OR = 102.67, 95% CI [47.37, 247.06], p<0.001). Five other HLA alleles (A*01:01, C*06:02, DRB1*07:01, DQA1*02:01, and DQB1*03:03) were associated with Gag242N as well (OR median [IQR]: 3.102 [2.11, 5.37]). For Pol, the strongest association was found between HLA-B*15:01 and 149L (PR 93L) (OR = 7.88, 95% CI [4.71, 13.80], p<0.001), and for Nef the strongest association was HLA-A*11:01 with 92R (OR = 16.17, 95% CI [9.53, 27.69], p<0.001).
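The relative frequencies behind the protein ranking can be recomputed directly from the counts given above. The snippet below covers only the five proteins for which both counts are quoted in the text; the variable names are illustrative.

```python
# (significant pairs, possible HLA/variant combinations) as quoted in the text.
pairs_and_combinations = {
    "Vpr": (37, 58_404),
    "Nef": (89, 165_478),
    "Pol": (131, 365_496),
    "Gag": (77, 233_930),
    "Env": (41, 930_696),
}

relative_frequency = {
    protein: n_pairs / n_combinations
    for protein, (n_pairs, n_combinations) in pairs_and_combinations.items()
}

# Sort proteins from highest to lowest relative frequency.
ranking = sorted(relative_frequency, key=relative_frequency.get, reverse=True)
for protein in ranking:
    print(f"{protein}: {relative_frequency[protein]:.2e}")
```

Normalizing by the number of possible combinations is what moves the short proteins Vpr and Nef ahead of Pol and Gag, even though Pol contributes the largest absolute number of pairs.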
Among the 1,528 SHCS participants included in statistical analysis Ιa, we identified 1,146,205 combinations of HLA alleles and HIV amino acid variants ( S1 Fig and S1 Table ). We employed a systematic pre-screening to narrow the selection down to HLA/HIV-variant pairs of interest, resulting in the following: First, we identified pairs that yielded a statistical power of at least 80% for detecting an HLA/HIV-variant association with an odds ratio (OR) of three (n = 208,224). Next, we selected those pairs that were associated in a Fisher’s exact test (FDR<0.2) (n = 532). Among these 532 associations between HLA alleles and HIV variants, 99 were no longer statistically significant (FDR-corrected p>0.05) when tested in multivariable logistic regression models adjusted for APD and the first ten human genome-based Principal Components (PCs) ( S1 Fig and S1 Table ). Ancestral and population structure biases likely accounted for these associations. Of the 433 remaining significant pairs, 329 showed a positive association, wherein the viral variant was more prevalent when the HLA allele was present (OR median [IQR]: 2.86 [2.19, 3.83], as shown in S2 Fig ).
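The association step of this pre-screening (Fisher’s exact test followed by FDR correction) can be sketched in plain Python as below. This is an illustrative sketch rather than the study’s code: the function names and example counts are hypothetical, the Benjamini-Hochberg procedure is assumed as the FDR method, and the power-based filter is not shown.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical tables: rows = HLA allele present/absent,
# columns = viral variant present/absent.
tables = [(12, 3, 20, 65), (5, 10, 30, 55), (8, 7, 25, 60)]
p_raw = [fisher_exact_two_sided(*t) for t in tables]
p_fdr = benjamini_hochberg(p_raw)
selected = [t for t, q in zip(tables, p_fdr) if q < 0.2]  # FDR<0.2 as in the text
```

Pairs passing this filter would then be re-tested in the adjusted logistic regression models described above.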
A) Measurement data selection of Swiss HIV Cohort Study (SHCS) participants with human leukocyte antigen (HLA) alleles and next-generation sequencing (NGS) data for analysis Ιa & Ιb, HIV load (VL) data for analysis ΙΙ, and multiple sequences for analysis ΙΙΙ. B) HIV amino acid variants are tested in logistic regressions for their association with HLA alleles and their interaction effect between HLA alleles and average pairwise diversity (APD) (analysis Ιb). Pairs with significant interaction in analysis Ιb were further tested for their effect on VL (analysis ΙΙ), retested in a longitudinal setting (analysis ΙΙΙ), and assessed by computational HLA-epitope binding predictions. C) Pairs were selected if power>0.8 and Fisher’s exact test p-value<0.2 (FDR-corrected). Statistical analyses performed: [Ιa] Presence of HIV variants as function of HLA alleles. [Ιb] Presence of HIV variants as function of HLA alleles, APD, and interaction between HLA and APD. [ΙΙ] VL levels as function of HLA/HIV-variant pairs (identified as significantly associated in [Ιb]); and [ΙΙΙ] longitudinal survival analysis of HLA/HIV-variant pairs identified in [Ιb]. Only samples from ART-naïve participants were used for the grey-shaded analyses ([Ιb], [ΙΙ] and [ΙΙΙ]).
Among all SHCS participants who met our selection criteria ( Fig 1A and Table 1 ), 1,044 had at least one ART-naïve HIV sample sequenced and their HLA alleles genotyped (analysis Ιb), and 829 had VL measurements within 180 days around the sample date (analysis ΙΙ). For the longitudinal analysis ΙΙΙ, 130 participants had 340 sequences available. Most participants were male (86.2–90%), had “White” self-reported ethnicity (93.1–94.4%), and were of European or Northern American origin (90.0–93.8%). The time between the first HIV diagnosis and ART start varied between analysis Ιb (median [Interquartile range (IQR)]: 2.15 [0.31, 4.76] years) and analysis ΙΙ (median [IQR]: 2.93 [1.41, 5.65] years).
Discussion
Viral and human genetic variants interact in an intricate, highly dynamic way, and viral variants may increase or decrease in frequency during an infection depending on the host genome [3, 4]. Combining large-scale genome-wide screens [6, 7, 12–14] with a study of these adaptive dynamics [16, 17] is challenging as it usually requires longitudinally sampled treatment-naive sequences. This is often prohibitive for large patient populations, especially in the era of immediate treatment initiation. To overcome these limitations, we developed a viral diversity-based approach with which we quantified the HLA-dependent dynamics of viral variants in over a thousand people with HIV.
This approach stands out by harnessing the potential of cross-sectional data and hence bridging between the within-host and epidemiological perspectives. We identified a significant HLA-dependent effect of infection time (approximated by APD) on the frequency of 98 individual HLA/HIV-variant pairs as a signature for HLA-dependent selection. Of these, 84 exhibited a pattern to be expected of HLA-escape mutations, i.e., a decreasing frequency over infection time in the absence of the HLA allele and an increasing frequency in its presence. We validated and contextualized these results by assessing the dynamics of viral variants in longitudinally sampled SHCS participants and by testing to what extent the identified viral variants had an HLA-dependent impact on VL and disrupted binding to the corresponding HLA alleles.
The time-to-event analysis revealed the robustness of our findings and underscored the limitations of longitudinal data regarding statistical power and scarcity of observed incident viral mutations. About one-third (28/98) of escape mutations found by the APD-based approach could be confirmed in time-to-event analyses, and the corresponding effect sizes of the diversity-based and longitudinal analyses were significantly correlated (ρ spearman = 0.48, p<0.001). Almost half (40/98) of the pairs could not be assessed in this analysis because they had too few incident mutations. Previous studies circumvented the limited power of such longitudinal data by limiting their analyses to specific viral epitopes and/or HLA alleles, potentially missing associations revealed by our systematic screening of the cross-sectional data [16,17,24].
Twelve HLA/HIV-variant pairs had a significant HLA-dependent effect on VL and the magnitude of their effect on VL was strongly correlated with the magnitude of the HLA-dependent selection inferred in the diversity-based screen (ρ = 0.94, p<0.001). Most (11/12) of these interaction effects resulted in increased VL, consistent with previous findings [12,13] and their classification as HLA-escape mutations. The increase in VL can be explained by the fact that HLA alleles such as B*57:01, B*57:03, or C*12:03 lose their "protective" effect (lowering VL) when an escape mutation like Gag242N occurs [12,13,15]. This is especially relevant when the escape mutation has low fitness costs or reversion rates, as we saw for example with Vif33R and B*57:01 (Figs 2D and 3D). At the population level, this could enhance infectivity and pathogenicity [25, 26], similar to what was described in a Japanese study with homogeneous A*24:02 distribution selecting for the escape mutation Nef135F [27].
Of the twelve pairs that significantly interacted with VL, only the two pairs Vif33R:B*57:01 and Pol432R:A*03:01 were previously described as significantly lowering VL [12,13]. For all others, this interaction effect on VL was either not significant (Gag-264K, Pol54L, Pol68P, Pol91D, and Tat32L) [13] or never identified before (Pol9L, Rev57E, and Nef102H). Conversely, for the newly identified pair Rev57E:B*40:01, we observed an HLA-dependent decrease in VL upon mutation. Rev57E accumulated over time in the absence of B*40:01 but disappeared in its presence, and binding predictions indicated that Rev57E induced a shift from non-binding to strong binding to B*40:01. This suggests that Rev57E increases viral fitness in the absence of B*40:01 but exposes a B*40:01-dependent epitope causing a fitness cost in the presence of B*40:01. These findings offer insights into potential epitopes with high mutational barriers and/or deleterious mutations, which may warrant attention in vaccine design efforts.
The identification of "only" 433 HLA/HIV-variant pairs out of over two million possible combinations may appear modest at first glance. However, only a minority of positions in the viral genome are expected to interact with HLA, owing to the selectivity of HLA binding, immuno-dominance, restriction by the proteasome and T cell receptors, and constraints on viral evolution [6, 9]. Moreover, our analysis revealed that even fewer pairs exhibited a time-dependent relationship (98/433), with an even smaller subset demonstrating a significant effect on VL (12/98). It is possible that some of the 433 pairs are spurious associations resulting from population structure and founder effects [28]. However, such confounding by population structure cannot explain the HLA-APD interactions observed in 98 pairs (analysis Ιb) and the effect of HLA in within-patient survival analyses (analysis ΙΙΙ), as these reflect HLA-dependent impacts of time since infection on mutation frequency. This robustness is underlined by the fact that most of these pairs persist after adjustment for viral population structure (S5F Fig), and it represents an additional advantage of the viral diversity-based method established here. In fact, this confounding may be one additional reason why 335 pairs were lost between analysis Ιa and Ιb.
These findings underscore the complexity of the host-virus interaction landscape and HIV’s ability to adapt and evolve in response to immune pressure. This is particularly evident in the observation that for the majority of variants identified to be under HLA-dependent selection in analysis Ib (86/98), the interaction effect between HLA and viral variants on VL was not statistically significant, and they did not exhibit the viral load pattern typically associated with escape mutations. This observation may be attributed to limitations in statistical power. However, it is also consistent with previous analyses indicating that moderate differences in viral fitness may often result in no or only weak variations in VL [13, 29].
Overall, we found a weak correlation between HLA-associated selection and predicted disruption of HLA binding (ρ = 0.11, p = 0.38). For 48/92 HLA-viral epitopes, the mutation was predicted to disrupt binding to the HLA allele. For the remaining pairs, e.g., Nef-Y135F:A*24:02, the HLA-dependent selection could not be explained by disruption of HLA binding. Thus, experimental characterizations are necessary to reveal the nature of these escapes, which may operate at different levels such as antigen processing or disruption of T cell receptor recognition.
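To make the notion of "predicted disruption of HLA binding" concrete, the following minimal sketch classifies a mutation by comparing predicted binding of the wild-type and variant peptides. It assumes percentile-rank scores (lower means stronger binding) with cut-offs of 0.5 for strong and 2.0 for weak binders, a convention commonly used with predictors such as NetMHCpan. The predictor and thresholds used in the study are not stated in this section, and the ranks below are hypothetical.

```python
# Assumed percentile-rank cut-offs (lower rank = stronger predicted binding).
STRONG_RANK = 0.5
WEAK_RANK = 2.0

def binding_class(rank):
    """Map a percentile rank to a binding category."""
    if rank < STRONG_RANK:
        return "strong"
    if rank < WEAK_RANK:
        return "weak"
    return "none"

def mutation_effect(rank_wild_type, rank_variant):
    """Compare binding categories before and after the mutation."""
    order = {"none": 0, "weak": 1, "strong": 2}
    before = order[binding_class(rank_wild_type)]
    after = order[binding_class(rank_variant)]
    if after < before:
        return "disrupts binding"
    if after > before:
        return "strengthens binding"
    return "no predicted change"

# Hypothetical ranks for three illustrative HLA/peptide pairs.
print(mutation_effect(0.3, 5.0))  # strong binder becomes a non-binder
print(mutation_effect(8.0, 0.2))  # non-binder becomes a strong binder
print(mutation_effect(1.0, 1.5))  # weak binder stays a weak binder
```

Under this scheme, a classical escape mutation falls into the first category, a case like Rev57E:B*40:01 (non-binding to strong binding) into the second, and pairs whose selection is not explained by binding into the third.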
We predominantly identified HLA/HIV-variant associations within the proteins Gag, Pol, and Nef (consistent with [12]). Despite being rather short proteins, Nef and Vpr exhibited numerous time-dependent associations. Relative to the number of possible combinations, we found even more significant HLA/HIV-variant pairs for Vpr and Nef than for Gag and Pol. Although Nef accumulates many mutations during the course of an HIV infection and these mutations hamper Nef’s function in HLA downregulation [30], many of these mutations exhibit a different pattern than typical HLA escape mutations (as seen in Fig 2D). Similar deviations from this “escape pattern” were also observed for a high proportion of variants in Vpr, Rev, and Tat. These mutations appear to have had no or only moderate costs on HIV replication, consistent with prior findings in Nef [31], while Gag and Pol were less tolerant to mutations. Consequently, escape mutations in Gag/Pol are subject to rapid reversion in HLA-unmatched individuals [16]. Alternatively, variants like Rev57G or Nef85L may have been selected in the absence of HLAs and were initially stable escape variants that survived transmission and gradually accumulated in the population until becoming the consensus [24].
Even though we had good sequence coverage of Env in our study, the number of HLA-HIV associations was relatively low compared to the number of possible combinations. This can be attributed to several factors. For one, Env is highly variable, with three times more possible HLA/HIV-variant combinations than, for example, Pol; thus, it may have fewer stable and conserved epitopes that can be presented by HLA molecules. Furthermore, much of the selection pressure on Env comes from antibody escape, so the lack of detection of HLA class I-dependent effects could be due to HLA class I-dependent selection playing a weaker role than HLA class II-dependent or HLA-independent mechanisms [32,33]. This may result both in a lower number of HLA-HIV interaction pairs and in lower statistical power to detect them.
Prior research often excluded HLA class II alleles [12,13]. Our observations confirm that HLA class II alleles exert comparatively less immune pressure on HIV, with only six such alleles identified through our time-dependent screen. This supports the notion that CD4+ T cell responses may exert less direct influence on viral variants than CD8+ T cell responses in the context of HIV infection. Consistent with Gabrielaite et al. [12], none of the HIV variants retained a significant association with the interaction between APD and HLA class II when adjusted for linkage disequilibrium between HLA alleles affecting the same viral variants.
A comparison of our findings with those of previous studies and the Los Alamos Immunology database revealed that 32 of the 98 pairs identified through our time-dependent analysis were, to our knowledge, not previously described (S2 Table) and are not included in [12,13,34,35]. This may be attributed to the fact that our systematic screen had high sequence coverage across the whole HIV genome, including short genes such as vif, vpr, tat, and rev. Additionally, we included HLA class II alleles in our analysis. However, these associations should be interpreted with caution, as they may be the result of linkage disequilibrium, as mentioned in the previous paragraph.
Our study has several limitations. Firstly, although the SHCS comprises individuals from diverse ethnic backgrounds, the majority of participants are of European origin and White ethnicity and are infected with HIV-1 subtype B. Consequently, the generalizability of our results may be limited, as allele frequencies and HIV-1 subtypes can vary among populations [36,37]. Furthermore, the analysis of non-B subtype samples was hindered by small sample sizes, potentially causing us to miss true effects in this population. Nevertheless, all analyses were adjusted for population structure to reduce spurious associations caused by underlying demographic parameters. Secondly, we included both plasma-derived and proviral cellular-derived NGS sequences, introducing potential biases, but we show that excluding the proviral sequences did not make a difference. Thirdly, we solely analyzed consensus HIV sequences, disregarding within-host diversity. For future studies, especially those involving longitudinal analyses, investigating intra-host evolution could offer valuable insights.
In conclusion, we have developed a novel viral diversity-based approach to systematically identify HLA-dependent selection across the HIV genome and map potential escape mutations, and we validated these mutations using longitudinal data and binding predictions. Our approach provides an effective assessment of viral adaptation to the host’s genome in the course of infection. As it is based on cross-sectional data, it allows screening for such adaptations in large populations and in genome-wide screens. The approaches used here are likely to be applicable to other chronic infections in which host factors exert similar selection pressures driving the emergence of escape mutations and viral adaptation.
---
[1] Url:
https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1012385
Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.