(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .
A cohort-based study of host gene expression: tumor suppressor and innate immune/inflammatory pathways associated with the HIV reservoir size [1]
['Ashok K. Dwivedi', 'Department Of Medicine', 'Division Of Hiv', 'Infectious Diseases', 'Global Medicine', 'University Of California', 'San Francisco', 'California', 'United States Of America', 'Germán G. Gornalusse']
Date: 2024-02
The major barrier to an HIV cure is the HIV reservoir: latently infected cells that persist despite effective antiretroviral therapy (ART). There have been few cohort-based studies evaluating host genomic or transcriptomic predictors of the HIV reservoir. We performed host RNA sequencing and HIV reservoir quantification (total DNA [tDNA], unspliced RNA [usRNA], intact DNA) from peripheral CD4+ T cells from 191 ART-suppressed people with HIV (PWH). After adjusting for nadir CD4+ count, timing of ART initiation, and genetic ancestry, we identified two host genes for which higher expression was significantly associated with smaller total DNA viral reservoir size, P3H3 and NBL1, both known tumor suppressor genes. We then identified 17 host genes for which lower expression was associated with higher residual transcription (HIV usRNA). These included novel associations with membrane channel (KCNJ2, GJB2), inflammasome (IL1A, CSF3, TNFAIP5, TNFAIP6, TNFAIP9, CXCL3, CXCL10), and innate immunity (TLR7) genes (FDR-adjusted q<0.05). Gene set enrichment analyses further identified significant associations of HIV usRNA with TLR4/microbial translocation (q = 0.006), IL-1/NLRP3 inflammasome (q = 0.008), and IL-10 (q = 0.037) signaling. Protein validation assays using ELISA and multiplex cytokine assays supported these observed inverse host gene correlations, with P3H3, IL-10, and TNF-α protein associations achieving statistical significance (p<0.05). Plasma IL-10 was also significantly inversely associated with HIV DNA (p = 0.016). HIV intact DNA was not associated with differential host gene expression, although this may have been due to a large number of undetectable values in our study. To our knowledge, this is the largest host transcriptomic study of the HIV reservoir. Our findings suggest that host gene expression may vary in response to the transcriptionally active reservoir and that changes in cellular proliferation genes may influence the size of the HIV reservoir. These findings add important data to the limited host genetic HIV reservoir studies to date.
Although lifelong HIV antiretroviral therapy (ART) suppresses virus, the major barrier to an HIV cure is the persistence of infected cells, “the HIV reservoir.” There are limited host genomic HIV reservoir studies to date. We performed a large cross-sectional study of 191 people with HIV on ART and measured the HIV reservoir size and host gene expression (RNA-seq) from blood CD4+ T cells. We found that individuals with higher expression of host genes involved in suppressing cell proliferation (P3H3, NBL1) had a smaller total HIV reservoir size. We also observed that individuals with a more “transcriptionally active” HIV reservoir had decreased expression of inflammatory signaling genes (e.g., IL-1β, TLR7, TNF-α, IL-10) as well as of two genes encoding membrane channel proteins (KCNJ2, GJB2). While we were able to validate some of these findings at the protein level (P3H3, IL-10, TNF-α), further studies are needed to confirm these findings in larger cohorts with longitudinal sampling, including traditionally underrepresented populations in HIV cure research.
Funding: This work was supported in part by the National Institutes of Health: K23GM112526 (SAL), the DARE Collaboratory (U19 AI096109; SGD), the Division of Intramural Research of the National Institutes (MC), UM1 AI126623 (KRJ), NIH/NIAID R01A141003 (TJH), and NIH/NCATS KL2TR002317 (GGG). This work was also supported by the amfAR Research Consortium on HIV Eradication a.k.a. ARCHE (108072-50-RGRL; SGD) and a Collaboration for AIDS Vaccine Discovery (CAVD) grant from the Bill & Melinda Gates Foundation’s Reservoir Assay Validation and Evaluation Network Study Group (RAVEN: INV-008500; MPB). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Data Availability: Our research sharing plan includes the following data: high throughput quantitative human RNA sequencing, ELISA and plasma cytokine protein results. These data have been de-identified for patient confidentiality and are shared with the broader research community in compliance with NIH policy NOT-OD-03-032. These data are now available in the Dryad data repository at https://datadryad.org/stash/share/JpvuclERLxckKBsjhWODrc0K8GkXh-ayRG3_fP4xZkM . The full data citation is: Dwivedi, Ashok et al. (Forthcoming 2023). A cohort-based study of host gene expression: tumor suppressor and innate immune/inflammatory pathways associated with the HIV reservoir size [Dataset]. Dryad. https://doi.org/10.5061/dryad.k3j9kd5dw .
Here, we performed a cross-sectional study of 191 ART-suppressed HIV+ non-controllers to identify differentially expressed host genes in relation to three measures of the peripheral CD4+ T cell reservoir: HIV cell-associated “intact” DNA (an estimate of the frequency of potentially “replication-competent” virus with intact HIV genomes) [ 31 ], total DNA (“tDNA,” which approximates the total reservoir size, the sum of intact DNA and defective DNA) and unspliced RNA (“HIV usRNA,” which reflects the “transcriptionally active” reservoir) ( Fig 1 ). NBL1 and P3H3, both tumor suppressor genes that inhibit cell proliferation, were the only two host genes significantly associated with HIV total DNA reservoir size; higher expression of these genes was associated with lower HIV total DNA. HIV usRNA was significantly inversely associated with several host genes involved in innate immune and inflammatory signaling, as well as with two genes encoding for membrane channel proteins involved in HIV-1 entry and cell-cell communication. Protein validation in a subset of participants with additional biospecimen availability demonstrated inverse associations for HIV total DNA (P3H3) and HIV usRNA (IL-10 and TNF-α) consistent with those observed in the RNA-seq analyses. Further studies are needed to validate these findings, ideally with dedicated functional genomic and intracellular protein assays using longitudinal samples to demonstrate causality of these observed associations. Our findings add important clinical and immunologic data to the limited host genomic HIV reservoir studies to date.
Prior host genome-wide association studies (GWAS) have focused on predictors of viral control (during untreated HIV disease), identifying key mutations in the C-C chemokine receptor type 5 gene (CCR5Δ32) and the human Major Histocompatibility Complex (MHC) human leukocyte antigen (HLA)-B and -C regions that influence viral setpoint [ 21 – 24 ]. Recently our group reported that these mutations (CCR5Δ32 and HLA-B*57:01) are associated with smaller HIV reservoir size [ 25 ]. However, mRNA expression from DNA variation is complex and not strictly 1:1 DNA to RNA transcription; gene expression is influenced by various factors (alternative splicing, polyadenylation, regulatory enhancers, epigenetic changes, etc.) which may differ by cell type and tissue [ 26 – 28 ]. The limited number of host gene expression studies during HIV infection (e.g., RNA sequencing) have compared gene expression between distinct clinical HIV groups. For example, one prior study compared gene expression among HIV “controllers” (individuals able to control virus in the absence of therapy) versus “non-controllers” [ 29 ]. Another study compared HIV non-controllers initiating ART “early” (<6 months from HIV infection) versus “later” (≥6 months after infection) [ 30 ]. However, no epidemiologic study has examined quantitative measures of the HIV reservoir size in relation to differences in host gene expression.
Despite several unique cases of possible HIV remission [ 1 – 3 ], there is still no HIV vaccine or cure. The major barrier to a cure is the persistence of infected cells during effective antiretroviral therapy (ART). Modern ART has transformed HIV disease into a treatable chronic disease for individuals who have access to, and are able to maintain, viral suppression [ 4 ]. However, ART alone does not eliminate persistent virus in most individuals [ 5 , 6 ]. HIV cure trials aimed at reactivating and eliminating the HIV reservoir have thus far failed to show a clinically meaningful reduction in the HIV reservoir [ 7 – 12 ]. There is an urgent need to bridge drug discovery with a deeper understanding of host-viral dynamics. Although several host factors have been shown to influence the size of the “HIV reservoir”, such as the timing of ART initiation after initial HIV infection [ 13 – 16 ], pre-ART viral load [ 17 ], ethnicity [ 17 ], and sex [ 17 – 20 ], there are few published human genomic and transcriptomic epidemiologic studies describing potential host factors influencing HIV persistence during treated infection.
HIV intact DNA was undetectable in nearly half (48%) of our measured samples ( S3B and S3C Fig ), which may have decreased statistical power to detect differentially expressed genes in relation to HIV intact DNA. As previously described, we employed a conservative approach of assigning a zero value for HIV intact DNA if one of two HIV-1 target assay results was undetectable [ 31 , 87 ], since even one target with defective proviral sequence suggests non-intact HIV DNA. Nonetheless, the frequency of undetectable values in our cohort is considerably higher than prior reports using similar (2 or more target) HIV intact DNA assays [ 88 – 96 ]. In contrast, HIV total DNA, measured using a different assay (qPCR), was measurable in 95% of samples from our cohort ( S3A and S3C Fig ), suggesting a potential difference due to assay method and/or input sample DNA concentration (HIV intact DNA was performed using remaining extracted DNA after first performing HIV total DNA by qPCR, and there was a general trend demonstrating measurable HIV intact DNA copies with higher concentrations of input sample DNA, S10 Fig ). Nonetheless, among participants with measurable HIV intact DNA, differential gene expression analyses demonstrated a slight positive trend (q<0.25) with PLGLB1 (+6.0%, q = 0.23), which encodes for a protein that inhibits thrombus degradation, and AGL (+0.9%, q = 0.23), which encodes for an enzyme involved in glycogen degradation ( S7 Table ). Gene set enrichment analyses demonstrated trends with pathways involving neutrophil activation (“Neutrophil Degranulation”, q = 0.046; “Neutrophil Activation Involved in Immune Response”, q = 0.046; “Leukocyte Activation”, q = 0.046), and among the European subgroup, pathways associated with myeloid-mediated immunity (“Myeloid Leukocyte Mediated Immunity”, q = 0.058; “Myeloid Cell Activation Involved in Immune Response”, q = 0.060) ( S8 Table ).
HIV usRNA was also inversely associated with two genes, KCNJ2 (-9.7%, q = 0.003) and GJB2 (-7.1%, q = 0.012), encoding for membrane channel proteins, Kir2.1 and connexin 26, respectively ( Table 3 ). KCNJ2 encodes for an inwardly rectifying potassium channel, a class of channels that have been shown to regulate HIV-1 entry and release [ 82 ], and GJB2, or CX26, encodes for gap junction beta 2 protein (also known as connexin 26). Gap junctions act as critical cell-cell communication channels for transporting signaling molecules and performing physiologic functions, but are often closed or downregulated under pathologic conditions [ 83 , 84 ]. HIV-1 has been shown to exploit these communication channels to disseminate infection as well as associated inflammation even in the absence of viral replication [ 85 , 86 ]. Of note, KCNJ2-AS1, the antisense long non-coding RNA transcript for KCNJ2, was also statistically significantly associated with HIV usRNA (-8.4%, q = 0.012) ( Table 3 ). Since both KCNJ2 and GJB2 encode for membrane-associated proteins [ 63 , 64 ], we performed protein validation from CD4+ T cell isolates for participants for whom we had remaining PBMCs (as we had done above for intracellular protein validation of P3H3 and NBL1). For both genes, RNA and protein expression levels were positively correlated, and the correlation was statistically significant for GJB2/connexin 26 (Spearman R = 0.37, p = 0.02) ( S9 Fig ). However, when testing peripheral CD4+ T cells from a small subset of 40 participants in our cohort, we did not observe significant correlations between these proteins and HIV usRNA, in contrast to our RNA-seq analyses ( Fig 6 , S6 Table ).
Since several of the host genes and gene sets associated with HIV usRNA reflected cytokine signaling ( Tables 3 and S3 , Figs 3 and 4 ), we obtained matched cryopreserved plasma samples from 175 of the 191 participants and designed a series of high-sensitivity multiplex cytokine protein assays (Meso Scale Diagnostics and LSBio). We were able to design assays for eight proteins: IL-1α, IL-1β, IL-10, TNF-α, G-CSF, IP-10, TNFAIP5, and sTLR4. Unfortunately, we were unable to perform protein validation of IL-1α since plasma levels were undetectable in most of our samples, potentially due to this cytokine’s primarily intracellular expression, mostly from monocytes/macrophages [ 78 – 81 ]. Overall, for the remaining 7 genes, the RNA and protein expression levels were positively correlated, and the correlations were statistically significant for IL-10, TNF-α, and IL-1β (IL-10: Spearman R = 0.34, p = 4.3×10⁻⁶; TNF-α: Spearman R = 0.19, p = 0.011; IL-1β: Spearman R = 0.29, p = 1.9×10⁻⁴) ( S7 Fig ). Of the final set of 7 plasma cytokines assayed, two cytokines, IL-10 and TNF-α, were significantly correlated with HIV usRNA (IL-10: Spearman R = -0.17, p = 0.025; TNF-α: Spearman R = -0.23, p = 0.0018) ( Fig 5 ), although these associations did not meet statistical significance in multivariate linear models adjusted for significant covariates ( S5 Table ). Of note, plasma IL-10 was also significantly inversely correlated with HIV total DNA (Spearman R = -0.18, p = 0.016) ( S8 Fig ), even though the IL10 gene did not meet FDR-adjusted statistical significance in the RNA-seq analysis with HIV DNA per se ( S2 Table ).
Given the large number of statistically significant host genes ( Tables 3 and S3 ), we used network analysis to group the top-ranked genes (q<0.25) into biologically interpretable clusters (ClueGo Network software). Host genes related to NLRP3 inflammasome activation (e.g., IL-1β), Th2 cell cytokine production (e.g., IL-10), and bacterial translocation (e.g., TLR4, lipopolysaccharide) signaling were significantly associated with HIV usRNA. A Benjamini-Hochberg false discovery rate (FDR) of q<0.05 was used to generate nodes (circles) based on kappa scores ≥0.4. The size of the nodes reflects the enrichment significance of the terms, and the different colors represent distinct functional groups. Created with https://apps.cytoscape.org/apps/cluego .
Given the large number of genes associated with HIV usRNA, we also performed network analyses to interpret clusters of pathways from the gene expression data. We applied the ClueGo network analysis application to group the ranked genes (q<0.25) into biologically interpretable clusters [ 67 ]. Several key pathways involving inflammasome activation [ 68 – 71 ] and bacterial translocation [ 72 – 74 ] genes were strongly associated with HIV usRNA. These include gene sets involving NLRP3 (NOD-, LRR- and pyrin domain-containing protein 3) inflammasome/IL-1β signaling, as well as pathways involving microbial translocation, such as toll-like receptor 4, lipopolysaccharide (LPS), and IL-17 signaling ( Fig 3 ). We also used unbiased gene set enrichment analyses (GSEA) using rank-ordered genes in the entire transcriptome to further identify biologically relevant clusters of genes associated with HIV usRNA. Pathways reflecting microbial translocation (“Response to Bacterium”, q = 7.5×10⁻⁵; “Cellular Response to Lipopolysaccharide”, q = 0.006), IL-1 signaling (“Interleukin-1 beta production”, q = 0.008; “Regulation of Interleukin-1 Production”, q = 0.008), and cytokine production (“Tumor Necrosis Factor Production”, q = 0.006; “Tumor Necrosis Factor Superfamily Cytokine Production”, q = 0.006; “Regulation of Tumor Necrosis Factor Production”, q = 0.008) were again associated with HIV usRNA. In addition, several gene sets related to signaling by IL-10 (“regulation of interleukin-10 production”, q = 0.037; “Interleukin-10 production”, q = 0.041), a pleiotropic cytokine associated with HIV immune dysregulation and persistence [ 75 – 77 ], were also significantly associated with HIV usRNA (q = 0.04) ( Fig 4 , S4 Table ).
HIV unspliced RNA was significantly associated with differential expression of several host genes involved in innate immunity, inflammasome activation, and inflammation. In contrast to HIV total DNA, which measures the total reservoir size, HIV usRNA roughly estimates the “transcriptionally active” HIV reservoir [ 65 , 66 ]; among the participants with measurable HIV intact DNA, intact DNA was significantly correlated with usRNA ( S3 Fig ). A total of 17 host genes, the majority of which reflect inflammatory pathways, were expressed at significantly lower levels among individuals with higher HIV usRNA, including after adjustment for significant covariates, timing of ART initiation, nadir CD4+ T cell count, genetic ancestry (PCs), and residual variability (PEERs) ( Table 3 ). Specifically, there was approximately a 5–10% decrease in the expression of these host genes for each two-fold increase in HIV usRNA, based on our multivariate model estimates. Host genes associated with HIV usRNA represented inflammasome activation and tumor necrosis factor (IL1A: -9.6%, q = 0.012; CSF3: -7.5%, q = 0.013; TNFAIP6: -7.6%, q = 0.016; TNFAIP9: -6.9%, q = 0.031; TNFAIP5: -5.9%, q = 0.043), innate immunity (TLR7: -7.1%, q = 0.016), and chemokine (CXCL3: -7.2%, q = 0.043; CXCL10: -9.2%, q = 0.049) signaling genes ( Tables 3 and S3 ). The overall expression of these genes in our cohort was low ( Tables 3 and S3 ) but consistent with average normalized population gene expression reported in the Human Cell Atlas [ 59 ].
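As a reading aid for the "percent change per two-fold increase" estimates above, the following minimal sketch shows how such a figure relates to a regression slope. It assumes a log2-log2 model, log2(expression) = a + slope × log2(HIV measure); the text does not state the exact model form, so this is an assumption, and the function names are hypothetical.

```python
import math


def percent_change_per_doubling(log2_slope: float) -> float:
    """Percent change in expression for each two-fold increase in the predictor.

    Assumes (not confirmed by the source) that both expression and the HIV
    measure were log2-transformed, so doubling the predictor multiplies
    expression by 2 ** slope.
    """
    return (2.0 ** log2_slope - 1.0) * 100.0


def log2_slope_from_percent(percent_change: float) -> float:
    """Inverse mapping: the log2-log2 slope implied by a reported percent change."""
    return math.log2(1.0 + percent_change / 100.0)


# Example using the IL1A estimate reported in the text (-9.6% per two-fold
# increase in HIV usRNA); the slope is what that estimate would imply under
# the assumed model, not a value taken from the paper.
implied_slope = log2_slope_from_percent(-9.6)
print(f"implied log2 slope: {implied_slope:.3f}")
print(f"round trip: {percent_change_per_doubling(implied_slope):.1f}%")
```

Under this assumption, small negative slopes translate almost linearly into the 5–10% decreases described in the text.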
For a subset of 40 participants for whom we had remaining PBMC aliquots, we were able to perform protein validation of P3H3 and NBL1 associations with HIV total DNA ( Fig 2 ). Both genes encode for intracellular proteins [ 63 , 64 ] and thus, we performed CD4+ T cell isolation followed by ELISA (enzyme-linked immunosorbent assay). P3H3 protein expression levels from peripheral CD4+ T cells demonstrated a significant inverse correlation with HIV total DNA (Spearman R = -0.44, p = 0.0043) ( Fig 2B ), consistent with the RNA-seq observations. NBL1 similarly demonstrated a non-significant inverse trend with HIV total DNA (Spearman R = -0.29, p = 0.073) ( Fig 2D ), consistent with the RNA-seq results. Overall, RNA and protein expression levels were positively correlated, but in this small sample, did not meet statistical significance (P3H3: Spearman R = 0.29, p = 0.067; NBL1: Spearman R = 0.22, p = 0.17) ( S6 Fig ). The inverse associations between protein expression and HIV total DNA were still observed in multivariate models controlling for significant covariates, timing of ART initiation and nadir CD4+ T cell count, but did not meet statistical significance at p<0.05; for each two-fold change in HIV total DNA, there was a -1.2% change in NBL1 protein expression, p = 0.060 ( S2 Table ). Both NBL1 and P3H3 are intracellularly expressed proteins, and protein levels may vary by tissue (e.g., NBL1 is primarily expressed in the central nervous system [ 63 , 64 ]). Given our small sample size, we were only able to test whether the RNA-seq findings validated at the protein level from peripheral CD4+ T cells.
Peripheral CD4+ T cells isolated by magnetic negative selection were subjected to bulk host mRNA sequencing. Differential gene expression analyses demonstrated that individuals with higher NBL1 and P3H3 gene expression had significantly lower HIV total DNA (measured by percent change in host gene expression per two-fold change in HIV total DNA; NBL1: -1.8%, q = 0.012; P3H3: -1.6%, q = 0.012) in multivariate models controlling for significant covariates, nadir CD4+ T cell count, timing of ART initiation, genetic ancestry (PCs [ 36 , 37 ]), and residual variability (PEERs [ 45 ]) ( Table 2 ). P3H3 encodes for Prolyl 3-Hydroxylase 3, which plays a key role in collagen biosynthesis, affecting properties of the extracellular matrix [ 46 – 49 ], and has previously been shown to act as a tumor suppressor in breast, lymphoid, and other cancers [ 50 – 52 ], while NBL1 encodes for neuroblastoma suppressor of tumorigenicity 1 [ 53 , 54 ], a transcription factor that is involved in the negative regulation of cell cycle (G1/S transition) [ 55 – 58 ]. The overall expression for these genes was low but was consistent with population mean normalized gene expression from the Human Cell Atlas [ 59 ]. Analyzing NBL1 and P3H3 gene expression in transcripts per million (TPM), in addition to analyzing these as normalized gene counts (standard protocol for bulk RNA-seq analyses which include filtering steps to remove low-expressed genes [ 60 – 62 ]), yielded similar results (NBL1: average TPM = 3.37, Spearman R = -0.21, p = 0.0029; P3H3: average TPM = 2.10, Spearman R = -0.31, p = 1.5×10⁻⁵). We also performed unbiased gene set enrichment analyses (GSEA) across the transcriptome using rank-ordered genes by q-value and determining normalized enrichment scores ( S1 Table ). HIV total DNA was statistically significantly associated with pathways involving complement activation and humoral immune response (e.g., “regulation of complement activation”, q = 9.8×10⁻⁶; “humoral immune response mediated by circulating immunoglobulin”, q = 1.6×10⁻⁵; “B cell mediated immunity”, q = 0.002; “Fc-gamma receptor signaling pathway”, q = 0.020), but these associations were only observed within the European ancestry subgroup.
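For readers unfamiliar with the transcripts per million (TPM) normalization mentioned above, the standard calculation can be sketched as follows. This is an illustration of the general definition, not the authors' pipeline; the function name and the toy counts and gene lengths are made up for the example.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM).

    counts:     read counts per gene for one sample
    lengths_bp: gene (or effective transcript) lengths in base pairs
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    # Step 1: length-normalize each gene (reads per kilobase).
    rates = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0 for _ in rates]
    # Step 2: scale so that the values in one sample sum to one million.
    return [rate / total * 1e6 for rate in rates]


# Hypothetical toy data: three genes whose counts scale with their length,
# so each ends up with the same TPM.
toy_counts = [10, 20, 70]
toy_lengths = [1000, 2000, 7000]
toy_tpm = counts_to_tpm(toy_counts, toy_lengths)
print([round(value, 1) for value in toy_tpm])
```

Because TPM values always sum to one million within a sample, an average TPM of 2 to 3, as reported for NBL1 and P3H3, indicates a low-abundance transcript.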
Earlier timing of ART initiation and higher nadir CD4+ T cell count were associated with smaller HIV reservoir size in our cohort, consistent with prior reports [ 17 , 39 , 43 ]. Earlier timing of ART initiation (<6 months from infection) was significantly associated with lower levels of total DNA (Spearman R = 0.29; p = 2.3×10⁻⁵) and usRNA (Spearman R = 0.28; p = 4.2×10⁻⁵) and demonstrated a trend with HIV intact DNA (Spearman R = 0.14; p = 0.061) ( S4 Fig ). Lower nadir CD4+ T cell counts were associated with higher total HIV DNA (Spearman R = -0.26; p = 2.3×10⁻⁴), as well as with higher HIV usRNA (Spearman R = -0.30; p = 1.5×10⁻⁵) and HIV intact DNA (Spearman R = -0.27; p = 3.7×10⁻⁴) ( S5 Fig ). We did not observe a significant association with duration of ART suppression, age, or pre-ART viral load. Given the low frequency of females and transgender participants in our study, we were unable to formally compare results based on sex/gender, but sensitivity analyses suggested that inclusion of these participants did not change our overall findings and thus results are shown for the entire combined cohort.
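The Spearman R values reported throughout are rank correlations. A minimal, dependency-free sketch of the statistic (Pearson correlation computed on average ranks) is shown below; the helper names and the toy data are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Toy illustration: any strictly decreasing relationship gives R = -1,
# regardless of whether it is linear.
toy_nadir_cd4 = [100, 200, 350, 500, 800]
toy_reservoir = [900.0, 400.0, 150.0, 60.0, 5.0]
print(spearman_r(toy_nadir_cd4, toy_reservoir))
```

Because only ranks are used, the statistic is insensitive to the skewed distributions typical of reservoir measurements.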
CD4+ T cells from cryopreserved PBMCs were isolated by magnetic negative selection, and RNA was extracted for HIV reservoir quantification (usRNA) and host transcriptomics (RNA-seq) while DNA was extracted for HIV reservoir quantification (tDNA, intact DNA). Most of the HIV reservoir consists of cells harboring defective virus [ 38 , 39 ], while the “replication-competent” reservoir consists of HIV-infected cells harboring intact DNA, capable of producing infectious virions [ 31 , 40 , 41 ]. Currently, there is no “gold standard” for measuring the HIV reservoir [ 42 , 43 ]. Here, we measured HIV total DNA (tDNA), which approximates the total defective and intact proviral DNA reservoir, and HIV unspliced RNA (usRNA), which estimates the “transcriptionally active” HIV reservoir, using an in-house qPCR TaqMan assay [ 44 ]. Using the remaining extracted DNA from the CD4+ T cells, we performed a multiplexed ddPCR assay targeting three regions of the HIV-1 genome to quantify the frequency of cells with “intact” HIV (a proxy for the frequency of replication-competent provirus) [ 31 ]. Of the three measures that we performed to quantify the HIV reservoir, HIV total DNA and unspliced RNA (both quantified using quantitative PCR, qPCR) were highly correlated with one another, Spearman R = 0.55, p = 1.6×10⁻¹⁷ ( S3 Fig ). HIV intact DNA (performed as a separate droplet digital PCR, ddPCR, assay using remaining DNA samples) was significantly associated with HIV usRNA (Spearman R = 0.26, p = 5.0×10⁻⁴) but not HIV total DNA. This may have been due to the unusually high proportion of our study population with undetectable values (48%) for HIV intact DNA (while HIV total DNA by qPCR was measurable in 95% of samples) as described further in the Discussion section.
A total of 191 ART-suppressed participants were selected from the UCSF SCOPE and Options cohorts ( S1 Fig ). HIV “controllers” [ 32 – 34 ] were excluded (individuals with undetectable viral loads in the absence of therapy for ≥1 year). Estimated date of detected infection (EDDI) was calculated to determine recency of infection in relation to ART initiation using the Infection Dating Tool ( https://tools.incidence-estimation.org/idt/ ) [ 35 ]. The study included individuals who initiated ART during early (within 6 months) and chronic (>6 months) HIV ( Table 1 ). The median age of the cohort was 47 years; the median nadir CD4+ T cell count was 352 cells/mm³, pre-ART viral load was 5.1 log₁₀ copies/mL, and duration of ART suppression was 5.1 years. Consistent with our San Francisco-based study population, participants were mostly male (96%) and reported diverse ethnicity, which was reflected in our principal component analysis (PCA) [ 36 ] based on our previously published host DNA exome sequencing data [ 37 ] ( S2 Fig ). PCs generated for each participant were included in all downstream multivariate models to adjust for potential confounding by genetic ancestry.
Discussion
To our knowledge, this is the largest cohort-based transcriptomic study of host genetic predictors of the HIV reservoir. We observed only two host genes (P3H3, NBL1) that were significantly (inversely) associated with HIV total DNA, both of which are known tumor suppressor genes and regulate cell cycle. We also observed 17 host genes that were significantly associated with HIV usRNA, all of which demonstrated an inverse relationship with HIV usRNA and belong to highly interrelated pathways involved in inflammation (e.g., IL-1, IL-6, IL-10, TNF-α, TLR4, NLRP3 inflammasome signaling). We did not observe any host genes that were significantly associated with HIV intact DNA, but this may have been due to the large number of undetectable values in our population, potentially due to low sample input DNA concentrations. Protein validation from a subset of participants with remaining biospecimens supported a significant correlation between HIV total DNA and P3H3 expression (from CD4+ T cells), and between HIV usRNA and plasma IL-10 and TNF-α levels.
Prior HIV integration studies have identified several host oncogenes and/or cell cycle genes that are enriched for HIV integrations during long-term ART [97–103]. Tumor suppressor genes encoding for p53 (TP53) and p21 (CDKN1A) have previously been associated with inhibition of HIV early replication [104] and blockade of HIV infection [105]. We observed statistically significant associations between HIV total DNA and two tumor suppressor genes, NBL1 and P3H3, using a stringent false discovery rate q<0.05. In a recent ex vivo analysis of CD4+ T cells from rhesus macaques after HIV-1 Env immunization and antibody co-administration, NBL1 was identified as a host gene that was differentially expressed in all treated (CTLA-4, PD-1, and CTLA-4 + PD-1 Ab) versus control groups [106]. Our protein validation of these two host genes associated with HIV total DNA from peripheral CD4+ T cells demonstrated inverse trends similar to those observed in our RNA-seq analyses. It is unclear whether these particular genes (P3H3 and NBL1), or “tumor suppressor” genes in general, which broadly function as regulators of cell cycle, have important roles in HIV persistence.
A much larger set of host genes were strongly associated with HIV usRNA. Since we analyzed over 20,000 genes in the human transcriptome, stringent false discovery rate (FDR)-correction methods were employed. These included data dimensionality reduction approaches (principal component analyses [36]) using whole exome data [37] and inclusion of PEER factors (probabilistic estimator of expression residuals) in multivariate models to further account for residual variability in the data [45]. Even after these additional measures, we observed 17 statistically significant genes associated with HIV usRNA that met stringent FDR q<0.05 in several interrelated immune signaling pathways, many of which have been previously shown to play an important role in the host response to HIV [75–77,107–138].
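To make the FDR q-values used throughout more concrete, the sketch below implements the Benjamini-Hochberg adjustment, the procedure named in the network-analysis figure legend. Whether exactly this procedure produced the differential expression q-values is an assumption here, and the p-values in the example are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order.

    For the p-value with ascending rank r out of m tests, the raw adjustment
    is p * m / r; a running minimum taken from the largest p-value downward
    keeps the q-values monotone in the p-values and at most 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical p-values for four genes (not study data).
toy_p = [0.01, 0.04, 0.03, 0.005]
toy_q = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in toy_q]
print([round(q, 3) for q in toy_q], significant)
```

With roughly 20,000 genes tested, a gene needs a very small raw p-value to reach q<0.05, which is why only a handful of genes pass the threshold.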
We were able to perform protein validation of several of these genes encoding secreted cytokines (IL-1α, IL-1β, IL-10, TNF-α, G-CSF, IP-10, TNFAIP5, and sTLR4) using matched plasma samples from 175 of the 191 study participants; IL-1α was undetectable in most of our samples despite the high-sensitivity assay, potentially due to its intracellular expression [78–81]. Plasma IL-10 and TNF-α were significantly associated with HIV usRNA (TNF-α, even after adjustment for nadir CD4+ T cell count and timing of ART initiation). Interestingly, plasma IL-10 was also inversely associated with HIV total DNA. IL-10 is a complex pleiotropic cytokine that has been highly studied in several autoimmune and infectious diseases and possesses complex actions that vary by stage of infection and by tissue [139]. IL-10 can both inhibit pathogen clearance and reduce excessive immunopathology, thus exhibiting both an inflammatory and regulatory response [140,141]. For example, IL-10 plays a critical role in intestinal homeostasis and both induces and prevents mucosal damage [142]. Thus, in several infectious diseases (viral, bacterial, fungal, parasitic), the effect of IL-10 varies by stage of infection and by tissue [139–141,143–148]. Here, as observed with several of the other differentially expressed host genes in relation to HIV usRNA, plasma IL-10 was inversely associated with HIV usRNA. These findings are in contrast to a recent non-human primate study by Harper et al., in which higher plasma IL-10 levels were associated with larger SIV DNA reservoirs in rhesus macaques ART-suppressed for 7 months, suggesting that IL-10 maintains long-lived reservoir cells [75]. (The authors did not evaluate or report findings for SIV RNA but did show that in vivo neutralization of soluble IL-10 with a monoclonal antibody resulted in a 2-log increase in plasma IL-10.) 
Differences in timing of ART initiation, duration of ART, and/or cross species differences might explain our contrasting findings. An alternate explanation might be that during long-term ART, IL-10 contributes to an ongoing dynamic interplay between the host immune response and low levels of HIV transcription, similar to complex host-viral dynamics that have been described associated with type I interferons during acute versus chronic infection [149].
Plasma TNF-α was the other cytokine significantly associated with HIV usRNA, consistent with the RNA-seq associations with TNF pathway genes (TNFAIP5, TNFAIP6, TNFAIP9). These TNF-associated pattern recognition receptors (PRRs) not only respond to TNF-α, but also respond to IL-1 signaling and toll-like receptor (TLR) engagement [116–119], and negatively regulate NF-κB signaling and IL-6 production [150–155]. IL-10 has been shown to block HIV-induced TNF-α and IL-6 secretion and inhibit HIV replication [156], and a cohort study of 51 people with HIV suppressed on ART demonstrated decreasing plasma IL-10/TNF-α ratio to be associated with AIDS progression [157].
The remaining host genes associated with HIV usRNA did not replicate in our limited set of protein validation assays, but collectively represent highly interrelated immune pathways that warrant further study. Plasma IL-6 significantly predicts mortality in ART-suppressed people with HIV in several large cohort studies [110–114], and IL-1β, an upstream inducer of IL-6, has emerged as a major target for HIV immune modulation [107,158,159]. IL-1α can act as a “dual function” cytokine, both directly sensing intracellular DNA damage and acting as a proinflammatory mediator, but it is mostly found intracellularly [78,79]. Two toll-like receptors (TLRs) emerged from our analyses as associated with HIV usRNA: TLR7 (in the differential gene expression analysis) and TLR4 (in the pathway-based analysis). TLR7 encodes a PRR that senses HIV single-stranded RNA [160,161] and has been associated with viral persistence in several human and non-human primate studies [128–130]. Host TLR7 transcriptional activity has been linked to acute viremia in women with HIV [132] as well as with enhanced innate immune function (i.e., IFN-α and TNF-α production) [131], but TLR7 is only expressed intracellularly [162], so we were unable to include it in our protein validation experiments. TLR4 encodes a PRR mediating the inflammatory response to microbial translocation [163] (e.g., bacterial endotoxin products such as lipopolysaccharide [164], which are significantly higher in people with HIV compared to uninfected individuals and are associated with AIDS progression [133–138]). CXCL3 and CXCL10 encode chemokines that recruit immune cells to sites of inflammation [165,166] and signal through innate immune pathways (e.g., TLRs) [122]. IP-10 (encoded by CXCL10) has been strongly linked to HIV disease progression and persistence [125–127]. CSF3 encodes G-CSF (granulocyte colony-stimulating factor), which is also part of the IL-6 superfamily of cytokines [115].
G-CSF, encoded by CSF3, has been shown to regulate T cell responses via induction of IL-10 [167], to inhibit CD4+ and CD8+ T cell responses [168], and to increase IL-10-producing regulatory T cells (Tregs) [169].
We also identified novel associations between HIV usRNA and two genes (KCNJ2, GJB2) encoding membrane channel proteins previously linked to HIV-1 entry and release, and cell-cell communication, respectively. Tight regulation of potassium ion concentrations has been shown to play a critical role in HIV-1 virus production in CD4+ T cells in cell culture models [170], and the HIV Nef protein has been shown to increase K+ concentrations in cells [171], while changes in K+ concentration have been shown to regulate the HIV life cycle (e.g., viral entry, replication, and release) [82]. HIV-1 has been shown to exploit gap junction protein channels to disseminate infection as well as associated inflammation even in the absence of viral replication [85,86], and a growing body of literature strongly suggests that connexins intensify inflammation by facilitating damage-associated molecular pattern release and binding to PRRs such as TLRs [172]. From our small subset of 40 participants, however, protein expression from CD4+ T cells did not validate the RNA-seq findings.
The study has several limitations that deserve mention. First, although the HIV reservoir has been shown to be relatively stable over time [17,91,173], our cross-sectional design limits the ability to demonstrate causality and simply provides a “snapshot” of the HIV reservoir after a median of 5.1 years of ART suppression. However, based on the known functions of the significantly associated host genes, we have proposed at least two potential models that need to be further validated in longitudinal and functional genomics studies. Indeed, the true in vivo associations might involve more complex feedback pathways between the HIV reservoir and host responses. Second, as is characteristic of our San Francisco-based HIV+ population, our study included mostly males of European ancestry. We accounted for this using well-established GWAS-based methods to adjust for population stratification bias [36,174], as well as the use of PEERs [45], which account for residual variance that is characteristic of RNA-seq data. Third, we quantified the peripheral CD4+ T cell reservoir, but the majority of the HIV reservoir persists in lymphoid tissues, such as the gut-associated lymphoid tissues [175,176]. Recent data suggest that the tissue compartment largely reflects (and is the likely source of) the peripheral compartment [177–179]. While several of the genes associated with HIV usRNA reflect tissue-based inflammation (e.g., IL-10), future studies will need to determine whether our findings from the blood reservoir are relevant to the tissue HIV reservoir. We also performed bulk RNA-seq from peripheral CD4+ T cells and did not perform separate analyses by HIV-infected versus uninfected cells (given the limited availability of PBMCs for our study participants). Thus, our findings likely largely reflect host gene expression differences among HIV-uninfected cells, given the infrequency of CD4+ T cells harboring provirus in an aliquot of 10 million cells.
Finally, while we selected participants to focus on HIV “non-controllers,” our findings may also be applicable to HIV elite and/or post-treatment controllers. The purpose of this transcriptomic analysis, as well as of our previously published whole exome sequencing analysis [37] from this cohort, was to focus on previously undescribed host genetic predictors of the HIV reservoir (signals that might be lost amidst a study population enriched for previously reported strong genetic effects, such as with HLA and/or CCR5Δ32 [21–24]) that may be applicable to the large majority of people with HIV who are unable to suppress virus in the absence of therapy.
We did not observe statistically significant associations between HIV intact DNA and host genes. We believe that this may be because HIV intact DNA was undetectable in 48% of our measured samples (while, for example, total DNA was measurable in 95% of samples). Here, we performed the HIV intact DNA assay after already allocating DNA for whole exome sequencing and for HIV total DNA quantification by qPCR [37]. Therefore, the low intact DNA detection rate may have been due to (1) low sample input DNA, (2) low true frequencies of cells containing intact proviruses, and/or (3) misclassification of “intact” provirus (primer/probe mismatches described for these types of assays [96]). We quantified HIV intact DNA by targeting five regions of the HIV genome, including regions that are highly conserved when present but are also often deleted, as well as an Env region with frequent hypermutations, across two droplet digital PCR reactions [31]. This allowed the analysis of potentially replication-competent (“intact”) proviral genomes by quantifying the number of droplets positive for three targets in each of the two reactions, and then mathematically combining the results of both reactions to calculate the number of HIV genomes containing all five regions (5T-IPDA). A value of zero intact proviral copies was assigned when either of the two reactions failed to detect all three regions in any of the droplets. While we are unable to determine whether primer mismatches were an underlying cause (since we did not perform full-length or near full-length sequencing), they may also have influenced our results, since we have previously shown that the frequency of primer mismatches is likely higher in assays interrogating a larger number of HIV-1 targets (e.g., our 5-target assay compared to a previously published 2-target assay [42]).
Overall, findings from our cross-sectional cohort identified several biologically plausible genes and immune pathways that may be associated with the HIV reservoir. In particular, we observed two host genes associated with cell cycle regulation (previously described in relation to tumor suppression) as well as several host genes involved in innate immunity and/or inflammation to be associated with peripheral measures of the HIV DNA and RNA reservoirs, respectively. Several of the inflammatory and innate immune genes and pathways associated with HIV usRNA are highly interrelated signaling pathways related to pathogen recognition (TLRs, NLRs), inflammasome activation (IL-1, IL-6), mucosal homeostasis (IL-10, TLR4), and inflammation (TNF-α, chemokine release). In addition, we report two novel associations with genes encoding membrane channel proteins that may play a role in these same inflammatory pathways. A limited set of protein validation assays showed that plasma IL-10 and TNF-α were also significantly inversely associated with HIV usRNA. These discovery-based transcriptomic findings add to the limited host genetic and HIV reservoir literature and suggest that while changes in host gene expression may influence the size of the HIV reservoir, host gene expression itself may, in turn, vary in response to the transcriptionally active reservoir. Additional studies in larger cohorts are needed to further validate these findings.
[END]
---
[1] Url:
https://journals.plos.org/plospathogens/article?id=10.1371/journal.ppat.1011114
Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.