(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .



Multi-omic association study identifies DNA methylation-mediated genotype and smoking exposure effects on lung function in children living in urban settings [1]

['Matthew Dapas', 'Department Of Human Genetics', 'University Of Chicago', 'Chicago Illinois', 'United States Of America', 'Emma E. Thompson', 'William Wentworth-Sheilds', 'Selene Clay', 'Cynthia M. Visness', 'Rho Inc.']

Date: 2023-01

Impaired lung function in early life is associated with the subsequent development of chronic respiratory disease. Most genetic associations with lung function have been identified in adults of European descent and therefore may not represent those most relevant to pediatric populations and populations of different ancestries. In this study, we performed genome-wide association analyses of lung function in a multiethnic cohort of children (n = 1,035) living in low-income urban neighborhoods. We identified one novel locus at the TDRD9 gene in chromosome 14q32.33 associated with percent predicted forced expiratory volume in one second (FEV 1 ) (p = 2.4x10 -9 ; β z = -0.31, 95% CI = -0.41- -0.21). Mendelian randomization and mediation analyses revealed that this genetic effect on FEV 1 was partially mediated by DNA methylation levels at this locus in airway epithelial cells, which were also associated with environmental tobacco smoke exposure (p = 0.015). Promoter-enhancer interactions in airway epithelial cells revealed chromatin interaction loops between FEV 1 -associated variants in TDRD9 and the promoter region of the PPP1R13B gene, a stimulator of p53-mediated apoptosis. Expression of PPP1R13B in airway epithelial cells was significantly associated the FEV 1 risk alleles (p = 1.3x10 -5 ; β = 0.12, 95% CI = 0.06–0.17). These combined results highlight a potential novel mechanism for reduced lung function in urban youth resulting from both genetics and smoking exposure.

Lung function is determined by both genetic and environmental factors. Impairment of lung function can result from harmful environmental exposures in early life, which disproportionally affect children living in low-income, urban communities. However, most genetic association studies of lung function have been performed in adults and without regard for socioeconomic status. Therefore, genetic risk factors discovered to date may not reflect those most relevant to high-risk populations. In this study, we sought to identify genetic variants correlated with lung function in a multiethnic cohort of children living in low-income, urban neighborhoods and analyze how tobacco smoke exposure may influence any genetic effects. We discovered a common genetic variant associated with lower lung function in this population, and we found that the association was mediated by nearby epigenetic changes in DNA methylation, which were in turn correlated with smoking exposure. We then identified a nearby gene, PPP1R13B, which is known to aid in the deactivation of damaged cells, whose expression in airway cells aligned with these genetic and epigenetic effects. This study reveals a potential mechanism through which genetic risk and environmental exposures can affect airway development, perhaps leading to interventions that can help reduce the burden of asthma in socioeconomically disadvantaged children.

Competing interests: All authors, with the exception of M. Altman and P. Becker, report grants from NIH/NIAID during the conduct of study. M. Dapas, C. Visness, A. Calatroni and P. Becker have nothing to disclose outside the submitted work. C. Ober reports personal fees from American Association of Asthma, Allergy and Immunology, outside the submitted work. M. Altman reports personal fees from Sanofi-Regeneron outside the submitted work. W. Busse reports personal fees from Boston Scientific, Novartis, Glaxo SmithKline, Genentech, Sanofi/Genzyme, AstraZeneca, Teva, Regeneron and Elsevier outside the submitted work. M. Gill reports an honorarium for and support for travel to the 2017 AAAAI meeting during the conduct of study and monetary compensation from the American Academy of Pediatrics for her work teaching the biannual Pediatrics board review course, PREP The Course. K. Hershey reports grants from Adare, during the conduct of the study. D. Jackson reports personal fees from Novartis, Boehringer Ingelheim, Pfizer, Regeneron, AstraZeneca, Sanofi and Vifor Pharma, grants and personal fees from GlaxoSmithKline and grants from NIH/NHLBI, outside the submitted work. M. Kattan reports personal fees from Regeneron, outside the submitted work. R. Gruchalla reports government employment from Center for Biologics Evaluation and Research as well as personal fees from Consulting Massachusetts Medical Society, outside the submitted work. A. Liu reports personal fees from Phadia ThermoFisher as consulting honoraria, grants and non-financial support from ResMed/Propeller Health, non-financial support from Revenio, grants and personal fees from Avillion and personal fees from Labcorp, outside the submitted work. L. Bacharier reports personal fees from GlaxoSmithKline, Genentech/Novartis, DBV Technologies, Teva, Boehringer Ingelheim, AstraZeneca, WebMD/Medscape, Sanofi/Regeneron, Vectura and geCircassia and personal fees and non-financial support from Merckoutside the submitted work. J. Gern reports personal fees from AstraZeneca and Gossamer Bio and personal fees and stock options from Meissa Vaccines Inc, outside the submitted work. In addition, Dr. Gern has a patent Methods of Propagating Rhinovirus C in Previously Unsusceptible Cell Lines issued, and a patent Adapted Rhinovirus C issued. G. O’Connor reports personal fees from AstraZeneca and grants from Janssen Pharmaceuticals, outside the submitted work. R. Wood reports grants from DBV, Aimmune, Astellas, HAL-Allergy and Regeneron and royalties from Up to Date, outsite the submitted work. Dr. Becker’s co-authorship of this publication does not necessarily constitute endorsement by the National Institute of Allergy and Infectious Diseases, the National Institutes of Health or any other agency of the United States government.

Data Availability: Phenotype, genotype, GWAS summary statistics, and whole-genome sequencing files are available in dbGaP under accession phs002921.v1.p1. The URECA gene expression data are available on the Gene Expression Omnibus (GEO) under accession numbers GSE145505 (NECs) and GSE96783 (PBMCs). The URECA DNA methylation data are available in GEO under the reference series GSE217337. The pcHI-C data are available in GEO under the reference series GSE152550.

In this study, we analyzed measures of lung function from the Asthma Phenotypes in the Inner City (APIC) [ 55 , 56 ] and Urban Environment and Childhood Asthma (URECA) cohorts [ 57 ], which consist of children living in low-income neighborhoods in 10 U.S. cities. We performed whole-genome sequencing (WGS) on 1,035 participants from APIC and URECA (ages 5–17 years; 67% non-Hispanic Black, 25% Hispanic; 66% with doctor-diagnosed asthma) and performed a GWAS with FEV 1 and the FEV 1 /FVC ratio. We then performed expression quantitative trait locus (eQTL) and methylation quantitative trait locus (meQTL) mapping in airway epithelial cells and peripheral blood mononuclear cells (PBMCs) from a subset of the URECA children. We further tested for genotype and DNA methylation interactions with smoking exposure. We aimed to identify methylation-mediated genetic and smoking exposure associations with lung function, linking environmental effects, epigenetic modifications, and specific genetic risk alleles to reduced pulmonary health in urban youth.

Environmental risk factors disproportionally affect socioeconomically disadvantaged children, particularly those living in urban environments [ 47 , 48 ]. In fact, socioeconomic effects contribute to disparities in lung health [ 49 ], including the higher burden of chronic respiratory disease among Black and Hispanic children compared to non-Hispanic white children [ 49 – 52 ]. Most genetic association studies of lung function, however, have been limited to adults of European descent. Therefore, genetic risk factors discovered to date may not reflect those most relevant to high-risk populations, which can further exacerbate health disparities [ 53 , 54 ]. Identifying genetic variants and epigenetic variation associated with lung function in high-risk, multiethnic, pediatric populations may provide more direct insights into the early development of impaired lung function.

Genetic factors contribute to differences in lung function among individuals, with heritability estimates ranging from 0.50 for FEV 1 to 0.66 for FEV 1 /FVC ratio [ 13 ]. The many genome-wide association studies (GWAS) of lung function measures [ 14 – 25 ] have implicated pathways related to lung development [ 20 , 26 – 28 ], inflammation [ 26 ], and tissue repair [ 29 ], among others [ 29 ]. Lung function is also affected by environmental exposures, such as smoking [ 30 – 32 ] and air pollution [ 33 ], which can disrupt airway development in early life, increasing the risk of childhood asthma and perhaps other chronic obstructive diseases [ 8 – 12 , 34 , 35 ]. For example, exposure to second hand smoke in utero and through childhood is associated with increased risk of childhood asthma [ 36 ], lower lung function in adolescence [ 37 ], and larger declines in lung function later in life [ 38 , 39 ]. Such adverse exposures are known to alter the epigenetic landscape in exposed individuals [ 40 , 41 ], potentially mediating downstream biological effects [ 42 – 44 ] and modifying genetic associations with lung function [ 45 , 46 ].

Reduced lung function is a hallmark of asthma and chronic obstructive pulmonary disease (COPD). Lung function measures, such as forced expiratory volume in one second (FEV 1 ) and forced vital capacity (FVC), are strong predictors of future all-cause mortality [ 1 – 6 ]. Airway obstruction often begins in early life [ 7 – 10 ], with lower lung function in infancy being a risk factor for the development of asthma in childhood [ 11 ] and COPD in late adulthood [ 12 ].

The TDRD9 locus was significantly associated with FEV 1 (% predicted) in the APIC and URECA cohorts. This association was partially mediated by DNA methylation at the cg03306306 CpG site in TDRD9 in NECs, which was also significantly associated with environmental tobacco smoke exposure. The TDRD9 risk allele and DNA methylation were both significantly associated with PPP1R13B gene expression, but PPP1R13B gene expression was not significantly correlated with FEV 1 itself. Unidirectional arrows represent inferred causality.

Promoter-to-enhancer chromatin interactions captured by Hi-C in nasal epithelial cells from URECA at age 11 are displayed as grey arcs. SNPs associated with FEV 1 (p<1x10 -5 ) are marked by blue lines in the top row according to their genomic position on chromosome 14. The lead FEV 1 SNP, rs1022464, is highlighted in yellow. CpG sites associated with rs1022464 (FDR<0.05) are displayed as green markers below the genes, with cg03306306 highlighted in green. Chromatin Interactions containing SNPs associated with FEV 1 (p<1x10 -5 ) are highlighted in blue. Magenta arcs highlight interactions between the PPP1R13B promoter and regions containing FEV 1 SNPs and/or rs1022464-associated CpG sites. FEV 1 , forced expiratory volume in one second; SNPs, single nucleotide polymorphisms; meQTL, methylation quantitative trait locus; pcHi-C, promoter capture Hi-C.

The transcription start site of PPP1R13B resides 87 kb from rs10220464 and 152 kb from cg03306306, suggesting long-range interactions between the FEV 1 -associated genotype and the promoter of PPP1R13B. To determine whether any of the FEV 1 -associated GWAS variants at the TDRD9 locus resided in regions that physically interacted with the promoters of cis-genes, we evaluated chromatin interactions in lower airway (bronchial) epithelial cells (BECs) [ 72 ], assessed by promoter-capture Hi-C. Forty-two of the GWAS variants resided in regions that interacted with the promoters of 9 different genes expressed in NECs ( Fig 8 ; S5 Table ). The gene most frequently mapped to these variants was PPP1R13B, with 15 variants located in 3 different interaction loops. Moreover, the strongest observed interaction was between a region containing 4 FEV 1 -associated variants and the PPP1R13B promoter (CHiCAGO score = 9.38; S5 Table ), suggesting that this region is an enhancer for PPP1R13B expression. This putative enhancer region is located just 2.21 kb from cg03306306.

Trait-associated variants and DNA methylation often affect the transcriptome by influencing the expression of one or more neighboring genes [ 69 , 70 ]. Identifying these correlations can help infer causal mechanisms [ 71 ]. Therefore, we next explored the relationship between the genotype for the lead FEV 1 variant rs10220464 and the expression of genes within 1 Mb in NECs and PBMCs from URECA children. Notably, the rs10220464 genotype was not associated with TDRD9 expression levels in these cells (NECs: p = 0.60, β = 0.12; PBMCs: p = 0.91, β = 0.014). Of the 27 genes that were evaluated ( S4 Table ), rs10220464 was significantly associated with the expression of only one gene, PPP1R13B (Protein Phosphatase 1 Regulatory Subunit 13B; FDR q = 2.77.x10 -4 ; p = 1.3x10 -5 ; β = 0.12, 95% CI = 0.06–0.17; Fig 7A ), in NECs. PPP1R13B expression levels were also the most strongly associated of the 27 genes with methylation at cg03306306 in NECs (p = 0.018; β = 0.10, 95% CI = 0.02–0.18; Fig 7B ). PPP1R13B expression in NECs, however, was not associated with FEV 1 or smoking exposure ( S11 Fig ).

To determine if DNA methylation at the TDRD9 locus had a causal effect on lung function, we performed a Mendelian randomization analysis using two-stage least squares (2SLS) regression. In the first stage, cg03306306 methylation levels in NECs were regressed on an instrument composed of four meQTLs for cg03306306 (rs11160777, rs137961671, rs7143936, rs11160776; Materials and methods ). In the second stage, FEV 1 was regressed on the predicted DNA methylation values generated from the first stage regression, thereby yielding a causal effect estimate of cg03306306 methylation on FEV 1 . Urine cotinine levels were included as a covariate in both stages. The variance explained in the first stage regression was r 2 = 0.11. The causal effect of cg03306306 methylation on FEV 1 was statistically significant (p = 0.020). We also tested a single, unweighted allele score of the instrumental variables and observed a causal effect association of p = 0.045 (stage-one r 2 = 0.10). We further performed a bootstrapped mediation analysis to test whether the rs10220464 risk allele effect on FEV 1 was mediated by DNA methylation. The indirect effect of rs10220464 on FEV 1 via cg03306306 methylation was significant, both when including asthma as a covariate (β z = -0.04, 95% CI = -0.10- -0.003, percent mediated = 14.4%) and when asthma was not considered (β z = -0.04, 95% CI = -0.10- -0.002, percent mediated = 15.0%). These results indicate that the effect of the FEV 1 -associated genotype at the TDRD9 locus is partially mediated through its impact on nearby DNA methylation levels.

To determine if there was an interaction effect between genotype and smoking exposure on DNA methylation and/or lung function, we repeated the cotinine association tests in URECA with the addition of an interaction term to assess if the genotype effect differed between individuals with low and high exposures to smoking. There were no significant genotype-by-smoking exposure interaction effects on methylation levels in NECs or PBMCs in URECA, nor were there any significant methylation-by-smoking effects on FEV 1 ( S9 Fig ). There was modest evidence for a genotype-by-smoking exposure interaction effect on FEV 1 in the combined APIC and URECA sample, but this did not reach statistical significance (p = 0.06, S10 Fig ). Considering the ages of the participants in APIC and URECA, most tobacco exposures were likely due to secondhand smoke.

DNA methylation at the TDRD9 locus had previously been associated with maternal smoking during pregnancy [ 67 , 68 ]. Therefore, we tested for associations between environmental tobacco smoke exposure ( S8 Fig ) and DNA methylation at this locus in the URECA children. Methylation at cg03306306 in NECs was significantly associated with nicotine metabolite (cotinine) levels in urine collected at ages 7–10 years (p = 0.015; β = 0.03, 95% CI = 0.01–0.05; Fig 6 ). Methylation at cg03306306 in PBMCs from age 7 was not associated with urine cotinine levels.

We then analyzed cg03306306 methylation in PBMCs collected at age 7 (n = 169) [ 66 ] from URECA children to evaluate whether the genotype and lung function associations observed in NECs were shared with blood cells. In PBMCs, we observed no correlation between the rs10220464 risk allele and cg03306306 methylation ( Fig 5C ), nor was there an association between cg03306306 methylation and FEV 1 ( Fig 5D ). These results indicate that cg03306306 methylation dynamics in the airway epithelium are not present in peripheral blood cells.

The majority of complex trait-associated variants exert effects by altering gene regulatory networks [ 60 – 62 ]. These changes are often marked by quantitative differences in DNA methylation levels [ 63 – 65 ]. We therefore investigated correlations between the FEV 1 -associated allele at TDRD9 and DNA methylation at the locus in upper airway (nasal) epithelial cells (NECs) from URECA children at age 11 (n = 286). We tested for associations between the FEV 1 genotype, as tagged by rs10220464, and DNA methylation levels at 796 CpG sites within 10 kb of any TDRD9 locus variants associated with FEV 1 at p<1x10 -5 (n = 82 variants). The rs10220464 genotype was an meQTL for 5 CpG sites at an FDR <0.05 ( S3 Table ). DNA methylation levels at only one of these CpG sites, cg03306306 (p = 2.3 x10 -4 ; β = 0.07, 95% CI = 0.03–0.10; Fig 5A ), was also significantly associated with FEV 1 at age 10 in URECA (p = 0.011; β = -11.48, 95% CI = -20.27- -2.69; Fig 5B ). The rs10220464 genotype accounted for 4.7% of residual variation in cg03306306 methylation, and cg03306306 methylation explained 2.4% of residual variation in FEV 1 .

We examined association results for the previously identified FEV 1 and FEV 1 /FVC loci reported in the meta-analysis of the UK Biobank and SpiroMeta Consortium by Shrine and colleagues (n = 400,102) [ 23 ], which included 70 loci for FEV 1 and 117 for FEV 1 /FVC. Of these, 64 of the lead SNPs for FEV 1 and 112 for FEV 1 /FVC were genotyped in the APIC and URECA sample. Only one SNP, for FEV 1 , replicated with false discovery rate (FDR) q<0.05 (rs9610955; p = 1.0x10 -4 ; β z = -0.38, 95% CI = -0.58- -0.19; S6 and S7 Figs). Cumulatively, 56% (n = 36) and 54% (n = 60) of these SNPs demonstrated consistent directions of effect for FEV 1 and FEV 1 /FVC, respectively, with effect size correlations of 0.29 (95% CI = 0.05–0.50; p = 0.020) for FEV 1 and 0.42 (95% CI = 0.25–0.56; p = 4.2x10 -6 ) for FEV 1 /FVC.

A forest plot of the associations between rs10220464 and FEV 1 (% predicted) are shown for distinct sub-cohorts distinguished by self-identified race/ethnicity, study, and asthma status. β z , the association effect size between the rs10220464 allele count and the adjusted and normalized FEV 1 (% predicted) values; FEV 1 , forced expiratory volume in one second; N, total number of individuals included in the association test; MAF, minor allele frequency within the sub-cohort; P, the association p-value.

FEV 1 association results are shown at the TDRD9 gene locus. Each variant is plotted according to its position and -log 10 p-value, colored by linkage disequilibrium to the lead variant, rs10220464, within the sample. Candidate cis-Regulatory Elements (cCREs) from ENCODE [ 59 ] are also shown for the region. The inset panel in the upper right shows the distribution of adjusted FEV 1 values by rs10220464 genotype. FEV 1 , forced expiratory volume in one second; MAF, minor allele frequency; EnhD, distal enhancer-like signature; CTCF, CCCTC-binding factor sites; enhP, proximal enhancer-like signature; prom, promoter-like signature; K4m3, trimethylation of histone H3 at lysine 4.

Using the WGS variant calls for 14.1 million variants with minor allele frequency (MAF) ≥0.01, we performed a GWAS of two lung function traits: FEV 1 (% predicted) and FEV 1 /FVC (Z-scores), measured between ages 5–17 ( Table 1 , S2 Fig ), adjusting for age, sex, asthma diagnosis, the first 10 principal components (PCs) of ancestry, and sample relatedness using a linear mixed model [ 58 ]. The FEV 1 GWAS included 896 participants from APIC (n = 504) and URECA (n = 392), and the FEV 1 /FVC GWAS included 886 participants from APIC (n = 497) and URECA (n = 389). The genomic control factor, λ GC , for both GWAS results was 1.02 ( S3 Fig ), indicating adequate control for population stratification. We identified one locus on chromosome 14q32.33 that was associated with FEV 1 at genome-wide significance (p<2.5x10 -8 ); no other variants were associated with FEV 1 and no variants were associated with FEV 1 /FVC at genome-wide levels of significance ( Fig 2 ). The FEV 1 locus on chromosome 14 consisted of a 200 kb region of associated variants in high linkage disequilibrium (LD) across the TDRD9 (Tudor Domain Containing 9) gene ( Fig 3 , S2 Table ). The minor allele at the lead SNP (rs10220464; MAF = 0.30) was significantly associated with lower FEV 1 (p = 2.4x10 -9 ; β z = -0.31, 95% confidence interval (CI) = -0.41- -0.21) and nominally associated with lower FEV 1 /FVC (p = 1.1x10 -3 ; β z = -0.17, 95% CI = -0.28- -0.07). Fine-mapping analysis at this locus (chr14:103.7–104.3Mb) revealed one 95% credible set of effect variables consisting of 59 SNPs, with rs10220464 having the highest individual posterior inclusion probability among them ( S4 Fig ). We did not detect any significant differences in rs10220464 association effect size by ancestry or asthma status or study for FEV 1 ( Fig 4 ). Furthermore, the TDRD9 locus remained the only genome-wide significant association when the two GWAS were performed without adjustment for asthma status ( S5 Fig ). The overall effect size correlations between asthma-adjusted and unadjusted GWAS results were r = 0.981 for FEV 1 and r = 0.954 for FEV 1 /FVC.

A) The top two principal components (PCs) of ancestry are plotted for sequenced APIC & URECA participants, colored by self-identified race/ethnicity, along with the four ancestry reference populations used for determining ancestry. NS = not specified. B) The proportion of genetic variance explained by each of the top 10 PCs. C) The relative values of the top 10 PCs are plotted for each sample, colored by reference population. D) The estimated proportion of admixture from each ancestral population is shown for each sequenced APIC & URECA participant. Each vertical line corresponds to one sample. 1KG, 1000 Genomes project; HGDP, Human Genome Diversity Project; YRI, Yoruba in Ibadan, Nigeria; CEU, Utah residents with Northern and Western European ancestry; CHB, Han Chinese in Beijing, China; JPT, Japanese in Tokyo, Japan; NAT, Native Americans from HGDP; EAS, East Asian ancestry; AFR, African ancestry; EUR, European ancestry.

The sequenced cohort included 696 (67%) participants who self-identified as non-Hispanic Black and 258 (25%) who self-identified as Hispanic ( Table 1 ). Principal component and admixture analyses using genotypes were conducted to characterize the ancestry of the participants ( Fig 1 ). This revealed that the genetic ancestry of our sample was 66% African, 26% European, 7% Native American, and 1% East Asian. The cohort was 54% male and included 681 (66%) children diagnosed with asthma ( Table 1 ).

We completed WGS and variant calling on 1,035 participants from the APIC and URECA studies (APIC = 508, URECA = 527; Table 1 ). The mean sequencing depth was 31.6x per sample ( S1A Fig ). On average, 95.3%, 90.3% and 62.6% of each genome was mapped with at least 10x, 20x and 30x sequencing read depth, respectively ( S1B Fig ). Approximately 3.8 million high-confidence autosomal variants were called per sample. Variant call concordance between replicate sample pairs (n = 3) was >99.9% for single nucleotide polymorphisms (SNPs) and was 98.9% for insertions and deletions (InDels; S1 Table ).

Discussion

Using whole-genome sequence variant calls in an asthma-enriched cohort of predominantly African-American children raised in urban environments, we identified a genotype at the TDRD9 locus associated with lower FEV 1 % predicted. This genotype effect was partially mediated by DNA methylation in airway epithelial cells, which were also correlated with smoking exposure. Data from RNA-sequencing and promoter-capture Hi-C in airway epithelial cells suggested that these FEV 1 -associated genetic and epigenetic variations influence the expression of the PPP1R13B gene through long-range interactions.

The PPP1R13B gene encodes a protein that promotes apoptosis, a form of programmed cell death, via its interaction with the tumor suppressor p53 and is often referred to by its alias ASPP1 (apoptosis-stimulating protein of p53 1) [73]. In response to oncogenic stress, PPP1R13B translocates to the nucleus, where it enhances the transcriptional activity of p53 on specific target genes relevant to apoptosis [74,75]. Exposure to smoking and fine particulate matter induces epithelial apoptosis in the lung via p53 [76–78]. PPP1R13B may also promote apoptosis in a p53-independent manner by inhibiting autophagy in response to upregulation by EGR-1 (early growth response protein 1) [79]. EGR-1 mediates stress-induced proinflammatory responses in the airway epithelium and contributes to the pathogenesis of COPD [80–85]. Within the lung, PPP1R13B is indeed predominantly expressed in epithelial cells, particularly in alveolar type 2 cells, and less so in immune cells and fibroblasts [86,87]. However, Cheng and colleagues studied PPP1R13B function in lung fibroblasts and found that it was upregulated following SiO 2 exposure, where it promoted fibroblast proliferation and migration through endoplasmic reticulum stress and autophagy pathways [88]. Overall, these studies suggest that PPP1R13B plays a key role in maintaining tissue homeostasis by regulating apoptosis and autophagy in response to environmental stimuli [74,89,90]. The specific function(s) of this gene in the airway epithelium and its potential impact on the development of airway obstruction remain to be elucidated. PPP1R13B expression in airway epithelial cells at age 11 was not associated with lung function or urine cotinine levels in the URECA children, but the cofactors of this gene [79,91] have been found previously to be upregulated in smokers with COPD [81,92]. Given its association with lung function alleles in our study, its expression in the airway epithelium, and its purported functions in autophagy and apoptosis pathways, additional study of PPP1R13B in lung and airway development is warranted, particularly in the context of adverse environmental stimuli, many of which are enriched in low-income urban environments.

In NECs, PPP1R13B gene expression was significantly associated with DNA methylation levels at the cg03306306 CpG site in TDRD9. Methylation at the TDRD9 locus was previously reported to correlate with specific environmental exposures [67,68,93] and with TDRD9 expression in blood [67,94]. TDRD9 is lowly expressed in the lung but is detected in alveolar macrophages and in monocytes [86,87]. Interestingly, the gene was among the most differentially expressed genes in alveolar macrophages in smokers relative to non-smokers [95], and its knockdown in TDRD9-expressing lung carcinomas resulted in increased apoptosis [96]. Its expression was not correlated with the rs10220464 genotype in URECA NECs or PBMCs, but rs10220464 is an eQTL for TDRD9 expression in whole blood in GTEx data [97], with the minor allele associated with lower TDRD9 expression. Although evidence from this study points to PPP1R13B in the airway epithelium, we can’t exclude the possibility that TDRD9 or other genes could contribute to the locus’ influence on lung function via other tissues.

The FEV 1 association signal at the TDRD9 locus included many variants in high LD across a 200 kb region that could be independently contributing to function. Some of the variants lie in different long-range enhancers [59]. It is also possible that one or more correlated variants were not included because they failed quality control standards. In addition, due to the limited sample size of the WGS cohort, we excluded rare variants (MAF<0.01) from consideration, which could contribute to the signal at this locus. Additional functional studies are needed to identify the causal variant(s) and full mechanism of action.

The correlations of rs10220464, FEV 1 , and smoking exposure with cg03306306 methylation in NECs were absent in PBMCs. Although global DNA methylation patterns between tissues are highly correlated [98], tissue-specific differentially methylated regions are more likely to be functional, particularly if they are positively correlated with gene expression [99]. The TDRD9 locus has not been identified in epigenome-wide association studies of lung function [44,100–104], but these measured DNA methylation from blood, which may be an insufficient proxy for methylation in the lungs [105]. Indeed, previous studies have found that DNA methylation profiles in NECs are significantly more predictive of pediatric asthma than those in PBMCs [106,107]. Furthermore, epigenetic biomarkers can change with age. For example, epigenetic markers for lung function in adults do not replicate in children [101].

We tested for interactions between smoking exposure and rs10220464 genotype effects on cg03306306 and on FEV 1 and between smoking exposure and cg03306306 methylation effects on FEV 1 . We did not detect any significant interactions, but our analyses in that regard could have been underpowered given our observed effects and sample sizes [36]. Furthermore, because this study was limited to children living in low-income urban neighborhoods, environmental risk factors are likely to be more prevalent than in the general population [55–57]. Additionally, such exposures are not necessarily ubiquitous across all the different neighborhoods and communities represented in this sample, and although environmental tobacco smoke exposure was examined and the socioeconomic range represented in this study is relatively narrow, there could be relevant environmental factors that were not considered.

To infer causality, Mendelian randomization and mediation analyses rely on assumptions that are often difficult to empirically verify. For the Mendelian randomization analysis, we identified instrumental variants associated with the intermediate cg03306306 that were not independently associated with the outcome, FEV 1 . However, because these variants were selected from the same dataset that the outcome testing was performed in, they were susceptible to bias from winner’s curse [108]. To mitigate the potential impact from this effect and from weak instruments, we performed a secondary analysis in which we combined the instrumental variants into a single, unweighted score. For the mediation analysis, unmeasured confounding can invalidate direct and indirect effect estimates [109]. To protect against such bias, we systematically tested for confounding associations with additional environmental measures available in APIC and URECA (Materials and methods). Nonetheless, there may still exist unknown confounding factors that were not measured. Ultimately the results of the Mendelian randomization and mediation analyses indicate that methylation at cg03306306 in NECs mediated the rs10220464 genotype effect on FEV 1 , but there was residual correlation between rs10220464 and FEV 1 , signifying that the genotype effect was only partially mediated by cg03306306.

Another limitation of our study was the relatively small size for a GWAS. This likely contributed to the lack of statistically significant replication for previously identified lung function loci [23], considering that the observed effects were correlated with results of prior GWAS. However, the APIC and URECA cohorts represent understudied, high-risk, pediatric populations that likely harbor distinct genetic and environmental risk factors compared to older, primarily European ancestry cohorts included in previous GWAS of lung function [14–20,23]. The findings of this study have yet to be replicated in an independent cohort, and should therefore be considered preliminary; however, it is possible that these associations would differ in populations with dissimilar ancestry, age, exposures, and/or asthma risk.

There are additional caveats to consider when interpreting our findings. First, this study integrated data from two cohorts with different recruitment criteria, asthma definitions, and ancestral compositions. Furthermore, most of the analyses beyond the GWAS were limited to subsets of the URECA participants. However, we did not observe significant genetic effect heterogeneity for rs10220464 by study, asthma status, or ancestry. To control for potential population stratification, we used the first ten PCs of ancestry to adjust lung function values and then included the ancestry PCs as fixed effects in the GWAS models (Materials and methods). The linear mixed models also included a genetic relatedness matrix as a random effect to account for residual population structure. Because children with asthma have lower lung function overall (Table 1) and their lung function may be more affected by environmental exposures [110–112], we adjusted for asthma status in the GWAS, as in previous GWAS [113–116]. The likelihood of discovering lung function variants with consistent effects in asthmatics and non-asthmatics was thereby increased, although genetic determinants of lung function may differ by asthma status [117]. Furthermore, adjusting for disease status could potentially introduce collider bias [118]. The significant genotype effect at the TDRD9 locus, however, remained the only genome-wide-significant association when asthma was excluded as a covariate, and adjustment for asthma did not substantively alter the mediation results. Second, some of the analyses used data collected at different timepoints. For example, most of the urine cotinine and spirometry measures were collected at age 10, but the samples used for the NEC DNA methylation and RNA-seq analyses were collected at age 11. Because DNA methylation and gene expression can change over time [40,119–121], their values at age 11 may not be fully representative of exposures at age 10. Finally, the promoter-capture Hi-C data were from lower airway (bronchial) epithelial cells, whereas the DNA methylation and RNA-seq data were generated from upper airway (nasal) epithelial cells. Although there are transcriptomic differences between epithelial cells from each compartment, their respective profiles are highly correlated [122–126], and the use of NECs as a proxy for the lower airway epithelium has been validated for both gene expression and epigenetic studies [124–127].

Our study identified a novel avenue through which genetic risk and environmental exposures could affect the airways of children raised in low-income urban neighborhoods. Further research into this pathway may yield mechanistic insights into the early development of impaired lung function, perhaps leading to interventions that can help reduce the high incidence and morbidity of chronic respiratory diseases in socioeconomically disadvantaged children.

[END]
---
[1] Url: https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1010594

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.

via Magical.Fish Gopher News Feeds:
gopher://magical.fish/1/feeds/news/plosone/