(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .



Polygenic signals of sex differences in selection in humans from the UK Biobank [1]

['Filip Ruzicka', 'School Of Biological Sciences', 'Monash University', 'Clayton', 'Victoria', 'Luke Holman', 'School Of Biosciences', 'University Of Melbourne', 'Parkville', 'School Of Applied Sciences']

Date: 2022-09

Sex differences in the fitness effects of genetic variants can influence the rate of adaptation and the maintenance of genetic variation. For example, “sexually antagonistic” (SA) variants, which are beneficial for one sex and harmful for the other, can both constrain adaptation and increase genetic variability for fitness components such as survival, fertility, and disease susceptibility. However, detecting variants with sex-differential fitness effects is difficult, requiring genome sequences and fitness measurements from large numbers of individuals. Here, we develop new theory for studying sex-differential selection across a complete life cycle and test our models with genotypic and reproductive success data from approximately 250,000 UK Biobank individuals. We uncover polygenic signals of sex-differential selection affecting survival, reproductive success, and overall fitness, with signals of sex-differential reproductive selection reflecting a combination of SA polymorphisms and sexually concordant polymorphisms in which the strength of selection differs between the sexes. Moreover, these signals hold up to rigorous controls that minimise the contributions of potential confounders, including sequence mapping errors, population structure, and ascertainment bias. Functional analyses reveal that sex-differentiated sites are enriched in phenotype-altering genomic regions, including coding regions and loci affecting a range of quantitative traits. Population genetic analyses show that sex-differentiated sites exhibit evolutionary histories dominated by genetic drift and/or transient balancing selection, but not long-term balancing selection, which is consistent with theoretical predictions of effectively weak SA balancing selection in historically small populations. Overall, our results are consistent with polygenic sex-differential—including SA—selection in humans. 
Evidence for sex-differential selection is particularly strong for variants affecting reproductive success, in which the potential contributions of nonrandom sampling to signals of sex differentiation can be excluded.

Funding: This work was supported by an Australian Research Council Discovery Project Grant FT170100328, to TC. ( www.arc.gov.au ) The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability: All relevant code is available in the following public GitHub repositories (/filipluca/polygenic_SA_selection_in_the_UK_biobank/ and /lukeholman/UKBB_LDSC/) and all relevant data are available within the manuscript, Supporting Information files, and at https://zenodo.org/record/6824671.

Copyright: © 2022 Ruzicka et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Introduction

Adaptation of a population to its environment requires heritable genetic variation for fitness [1]. Although many populations show substantial genetic variation for fitness components [2] (including life history traits such as maturation rate, lifespan, mating success, and fertility [2,3]), genetic trade-offs between fitness components, or between different types of individuals in a population, limit adaptive potential [4]. For example, a mutation that increases the probability of survival to adulthood might simultaneously decrease adult reproductive success (e.g., [5]), weakening the mutation’s net fitness effect [4]. In addition to slowing adaptation [6–8], genetic trade-offs can increase standing genetic variation [2,9], give rise to balancing selection [10,11], and favour evolutionary transitions between mating systems [12,13], modes of sex determination [14], and genome structures [15–18].

Sexually antagonistic (SA) genetic polymorphisms, in which the alleles that benefit one sex are harmful to the other, are a type of genetic trade-off that may be common in sexually reproducing species [19]. Theory shows that SA polymorphisms are likely to arise when mutations differentially affect trait expression in each sex or when mutations similarly affect traits under divergent directional selection between the sexes [20]. Empirical quantitative genetic studies imply that both conditions are frequently met in nature [21–24] and, accordingly, that SA polymorphisms contribute to phenotypic variation in a range of plant and animal populations (e.g., [25–27]), including humans [28–31].

Although there is now abundant evidence that SA polymorphisms contribute to phenotypic variation, efforts to identify and characterise SA alleles in genomic data face 2 formidable challenges [32]. First, methods using explicit fitness measurements to identify SA polymorphisms (e.g., genome-wide association studies (GWAS) of fitness [33]) are rarely feasible, because it is challenging to obtain fitness measurements for large numbers of genotyped individuals under natural conditions [2]. Second, methods using allele frequency differences between adult females and males as genomic signals of SA viability selection (e.g., between-sex $F_{ST}$ estimates [32,34–43]) are limited in several ways: They have low power to detect SA loci, they cannot distinguish SA selection from sex differences in the strength of selection, they are susceptible to artefacts generated by population structure and mis-mapping of sequence reads to sex chromosomes [32,40,41,44], and they neglect fitness components other than viability, such as reproductive success [32,45]. Previous studies of human genomic data [32,34–36,43,44,46] have been affected by one or more of these issues, such that we currently lack robust evidence of SA genomic variation in humans. More generally, these impediments help to explain the limited catalogue of SA polymorphisms across species [47–49], which currently comprises a handful of loci with exceptionally large phenotypic effects (e.g., [50–54]).

Despite these challenges, new datasets and analytical approaches provide opportunities to identify robust genomic signals of SA selection. First, massive “biobank” datasets, which are widely used in human genomics, sometimes include both genotype and offspring number data [29,55] that can be used to detect loci with SA effects on reproductive components of fitness [32]. Second, estimates of allele frequency differences between sexes, though ill-suited for confidently identifying individual SA loci affecting viability, may nevertheless be amenable to genome-wide tests for polygenic SA viability selection [32,34]. Third, population genomic metrics of sex-differential selection (e.g., between-sex $F_{ST}$) may include an appreciable proportion of genuine SA loci in the upper tails of their distributions, providing a set of candidate loci that can collectively yield insights into the general properties of SA polymorphisms (e.g., their functional characteristics and evolutionary dynamics), despite uncertainty about individual candidates.

Here, we extend [32,34] and develop new statistical tests based on $F_{ST}$ metrics of between-sex allele frequency differentiation to detect polygenic signals of sex-differential selection affecting viability, reproduction, and total fitness during a full generational cycle. Applying these tests to the UK Biobank [55], a dataset comprising quality-filtered genotype and offspring number data for approximately 250,000 men and women, reveals polygenic signals of sex-differential and SA polymorphism. We corroborate these results by using mixed-model statistics that explicitly control for systematic differences in the genetic ancestry of female and male individuals. We minimise potential sequencing artefacts and further show that sex-differentiated polymorphisms are preferentially situated in functional, phenotype-altering genomic sequences. Finally, we use genetic diversity data to examine modes of evolution affecting sex-differentiated sites.

Results

Genomic signals of sex differences in selection: Theoretical predictions

Previous studies have examined sex-differential effects of genetic variation during the zygote-to-adult stage by comparing allele frequencies between adult females and males [32,34,36–40,44]. By contrast, our analytical approach combines allele frequency with offspring number data to estimate sex-differential effects during a full generational life cycle (Fig 1). To illustrate the approach, consider a large, well-mixed population containing many polymorphic, biallelic, autosomal loci. At fertilisation, Mendelian inheritance equalises allele frequencies between the sexes (Fig 1, left box). In the zygote-to-adult stage, loci with sex-differential effects on survival accumulate allele frequency differences between the adults of each sex (e.g., the black allele becomes enriched in adult males and deficient in adult females because it improves zygote-to-adult survival in males but reduces it in females; Fig 1, middle box). Among the adults, alleles with sex-differential effects on reproductive success have different transmission rates to the next generation from surviving females versus surviving males (e.g., the black allele is enriched among the male gametes contributing to fertilisation but deficient among female gametes, thus increasing its transmission to offspring of males but decreasing transmission to offspring of females; Fig 1, right box).

Fig 1. Partitioning signals of sex differences in selection among fitness components. A pair of autosomal alleles is represented by white and black dots, representing female- and male-beneficial alleles, respectively; the frequency symbols depict sex-specific frequency estimates for a given allele at different stages of the life cycle (see main text for details). Autosomal allele frequencies are equalised between sexes at fertilisation (left box; females, top; males, bottom), resulting in negligible allele frequency differentiation at this stage of the life cycle. Differentiation between sexes can arise in the sample of adults (middle box) due to sex differences in viability selection among juveniles (orange arrow) and in the projected gametes (right box) due to sex differences in LRS among adults (green arrow). Data on sex-specific allele frequencies and LRS thus allow the estimation of sex-differential effects of genetic variants on each fitness component (including overall fitness; purple arrow), despite the absence of allele frequency data among zygotes (left box) and gametes (right box), which are inferred and not directly observed. LRS, lifetime reproductive success. https://doi.org/10.1371/journal.pbio.3001768.g001

Adult allele frequencies, coupled with offspring number data per individual, thus provide an opportunity to estimate sex-differential effects of genetic variation during a complete life cycle, even though zygotic and gametic allele frequencies are inferred and not directly observed. Below, we apply our approach to the UK Biobank, a dataset that includes genotypes and reported offspring numbers (hereafter “lifetime reproductive success” or LRS, following standard terminology [29]) among putatively post-reproductive adults (ages 45 to 69 after filtering; see Materials and methods).
For a biallelic autosomal locus with alleles $A_1$ and $A_2$, we denote by $\hat{p}_m$ and $\hat{p}_f$ the respective estimated frequencies of the $A_1$ allele in adult males and females of the UK Biobank. The projected frequencies of $A_1$ in paternal and maternal gametes contributing to fertilisation are:

$$\hat{p}'_m = \frac{M_{11} + \tfrac{1}{2} M_{12}}{M_{11} + M_{12} + M_{22}} \tag{1A}$$

$$\hat{p}'_f = \frac{F_{11} + \tfrac{1}{2} F_{12}}{F_{11} + F_{12} + F_{22}} \tag{1B}$$

where $M_{ij}$ and $F_{ij}$ represent the cumulative LRS of males and females, respectively, with genotype $ij$ (e.g., $M_{11}$, $M_{12}$, and $M_{22}$ correspond to genotypes $A_1A_1$, $A_1A_2$, and $A_2A_2$). Using $F_{ST}$ [56], we partition between-sex allele frequency differentiation over 1 generation into 3 components: (i) differentiation among adults, which includes effects of sex-differential survival (hereafter “adult $F_{ST}$;” see [32,34,45]); (ii) sex-differential variation in adult LRS (hereafter “reproductive $F_{ST}$”); and (iii) sex-differential variation in overall fitness (hereafter “gametic $F_{ST}$”). Single-locus estimates of adult, reproductive, and gametic $F_{ST}$ (written $\hat{F}_{ST}$) are defined, respectively, in Eqs [2A], [2B], and [2C].
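As a concrete reading of the gamete projection described above, the sketch below weights each genotype class by its cumulative LRS: A1A1 parents transmit A1 in every gamete and A1A2 parents in half of them. The function names and the toy LRS totals are hypothetical. The `between_sex_fst` helper uses the common two-group form (p_f - p_m)^2 / (4 p_bar (1 - p_bar)), which is an assumption made for illustration and not necessarily the exact estimator of Eqs [2A–2C].

```python
def projected_gamete_freq(lrs_11, lrs_12, lrs_22):
    """Projected A1 frequency among the gametes of one sex.

    lrs_11, lrs_12, lrs_22 are the cumulative LRS (total offspring) of adults
    with genotypes A1A1, A1A2 and A2A2. Hypothetical helper, not the authors' code.
    """
    total = lrs_11 + lrs_12 + lrs_22
    if total <= 0:
        raise ValueError("cumulative LRS must be positive")
    # A1A1 parents always pass on A1; A1A2 parents pass it on half the time.
    return (lrs_11 + 0.5 * lrs_12) / total


def between_sex_fst(p_f, p_m):
    """ASSUMED two-group F_ST form: squared difference over 4 * p_bar * (1 - p_bar)."""
    p_bar = (p_f + p_m) / 2
    return (p_f - p_m) ** 2 / (4 * p_bar * (1 - p_bar))


# Toy cumulative LRS totals per genotype (made up for illustration).
p_gam_m = projected_gamete_freq(lrs_11=120, lrs_12=200, lrs_22=80)
p_gam_f = projected_gamete_freq(lrs_11=80, lrs_12=200, lrs_22=120)
gametic_fst = between_sex_fst(p_gam_f, p_gam_m)
print(p_gam_m, p_gam_f, gametic_fst)
```

In this toy case the allele is over-transmitted by males and under-transmitted by females, so the projected gametic frequencies diverge between the sexes even if adult frequencies were equal.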

$F_{ST}$ distributions in the absence of sex-differential selection

In the absence of sex differences in selection (e.g., under neutrality or under sexually concordant (SC) selection of equal magnitude and direction in each sex), with large sample sizes, negligible Hardy–Weinberg deviations at birth, and excluding single-nucleotide polymorphisms (SNPs) with very low minor allele frequencies, we show that the adult, reproductive, and gametic $\hat{F}_{ST}$ metrics converge, respectively, to the null distributions given in Eqs [3A], [3B], and [3C], in which each $X$ is an independent chi-square random variable with 1 degree of freedom, $N_f$ and $N_m$ denote adult sample sizes, $\mu_f$ and $\mu_m$ denote mean LRS, $\sigma_f^2$ and $\sigma_m^2$ denote variances in LRS, and sex-specific coefficients quantify departures from Hardy–Weinberg equilibrium in the sample of adults (Section A in S1 Appendix). In datasets such as the UK Biobank, there is also between-site variation in the number of genotyped individuals and the extent of Hardy–Weinberg deviations in the adult sample. The null distributions described by Eqs [3A–3C] are easily adjusted to account for this between-site variation (see Materials and methods).

Relative to the null distributions in Eqs [3A–3C], sex differences in selection inflate each metric (Section A in S1 Appendix). These inflations may arise due to polymorphisms under sex-differential selection and neutral polymorphisms that hitchhike with selected polymorphisms. However, linkage disequilibrium (LD) alone cannot inflate genome-wide $\hat{F}_{ST}$ in the absence of genuine selected polymorphisms (Section B in S1 Appendix). As such, $\hat{F}_{ST}$ inflations represent reliable signals of sex-differentially selected polymorphism [32], provided: (i) technical artefacts are controlled (as shown below); (ii) sex-specific population structure is controlled; and (iii) males and females are sampled at random (though (iii) is not a requirement for reproductive $\hat{F}_{ST}$; see Discussion).
To simplify the presentation, we first present analyses using $F_{ST}$ metrics, but we return to non-$F_{ST}$ metrics in the section titled “Controlling for sex-specific population structure.”
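The chi-square scaling of the adult null can be checked by simulation. The sketch below is a simplified illustration, assuming a single neutral locus, Hardy-Weinberg proportions (so that each sex contributes 2N independent allele draws), small hypothetical sample sizes, and the two-group form (p_f - p_m)^2 / (4 p_bar (1 - p_bar)) for adult F_ST. Under those assumptions, F_ST divided by 1/(8 N_f) + 1/(8 N_m) should behave like a chi-square variable with 1 degree of freedom, whose mean is 1. The sketch does not include the Hardy-Weinberg deviation terms or the LRS moments that enter Eqs [3A–3C].

```python
import random


def sample_allele_freq(n_individuals, p, rng):
    """Draw 2N alleles at true frequency p and return the sample frequency."""
    n_alleles = 2 * n_individuals
    count = sum(1 for _ in range(n_alleles) if rng.random() < p)
    return count / n_alleles


def scaled_adult_fst(n_f, n_m, p, rng):
    """Adult F_ST (assumed two-group form) divided by its null scale factor."""
    p_f = sample_allele_freq(n_f, p, rng)
    p_m = sample_allele_freq(n_m, p, rng)
    p_bar = (p_f + p_m) / 2
    fst = (p_f - p_m) ** 2 / (4 * p_bar * (1 - p_bar))
    return fst / (1 / (8 * n_f) + 1 / (8 * n_m))


rng = random.Random(1)          # fixed seed so the run is repeatable
N_F, N_M, P = 200, 150, 0.3     # hypothetical sample sizes and allele frequency
draws = [scaled_adult_fst(N_F, N_M, P, rng) for _ in range(2000)]
mean_scaled = sum(draws) / len(draws)
print(round(mean_scaled, 2))    # expected to be close to 1 under the null
```

A mean noticeably above 1 across many sites is the kind of inflation that the text attributes to sex differences in selection, once artefacts and population structure are controlled.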

Genomic signals of sex differences in selection: Empirical data

UK Biobank SNP data. The sample size in the UK Biobank, after removing individuals who were closely related, had a recorded ancestry other than “White British,” or had missing LRS data, was N = 249,021 ($N_m$ = 115,531 males and $N_f$ = 133,490 females). We removed rare polymorphic sites (minor allele frequency [MAF] < 1%), sites with low genotype or imputation quality, and sites with high potential for artefactual between-sex differentiation based on criteria identified by Kasimatis and colleagues [44] (i.e., between-sex differences in missing rates, deficits of minor allele homozygotes, and heterozygosity levels exceeding what can plausibly be explained by sex differences in selection; see Section C in S1 Appendix). Reassuringly, none of the 8 sites that Kasimatis and colleagues [44] identified as false positives for sex-differential viability selection appear among the quality-filtered, LD-pruned, imputed SNPs (N = 1,051,949) that are the focus of our analyses.

Forms of sex-differential selection: Theoretical predictions

The elevations reported above indicate the presence of polygenic sex-differential selection in the UK Biobank. However, the signals could have arisen because of SA selection, because of sex differences in the strength but not the direction of selection (i.e., sex-differential SC selection), or a combination of both scenarios. To partition signals affecting LRS into SA and SC components, we examined the effects of a given allele on LRS in each sex relative to the other. Specifically, estimates of the product of an allele’s female and male effects on LRS should tend to be negative when alleles have SA effects and positive when alleles have SC effects (Fig 3A). A new metric, termed “unfolded reproductive $\hat{F}_{ST}$,” provides a standardised measure of the product of sex-specific effects on LRS (Eq [4]).

Fig 3. Partitioning signals of sex-differential selection into SA and SC components reveals their joint contributions. (A) As in Fig 1, the frequency symbols depict sex-specific frequency estimates for a given allele at different stages of the life cycle. Under SA selection (top), the white allele is female-beneficial and the black allele is male-beneficial, which tends to generate negative values of unfolded reproductive $\hat{F}_{ST}$. Under SC selection (bottom), the black allele is beneficial in both sexes, which tends to generate positive values of unfolded reproductive $\hat{F}_{ST}$. (B) Percentage of sites (turquoise: observed; grey: permuted) falling into each of 100 quantiles of the theoretical null distributions of unfolded reproductive $\hat{F}_{ST}$. Theoretical null data (x-axes) were generated by simulating values (nSNPs = 1,051,949) from the null (i.e., the product of 2 standard normal distributions). In the absence of sex-differential selection, approximately 1% of observed SNPs should fall into each quantile of the null (dashed line). LOESS curves (±SE) are presented for visual emphasis. (C) Difference, for unfolded reproductive $\hat{F}_{ST}$, between the mean observed and empirical null data (top) and between observed and theoretical null data (bottom), across 1,000 bootstrap replicates. The vertical line intersects zero, indicating no difference between the observed and null data. Differences between observed and null data were obtained separately for negative and positive values of unfolded reproductive $\hat{F}_{ST}$. This illustrates that there is enrichment of SNPs in both tails of the null. The code and data needed to generate this figure can be found at https://github.com/filipluca/polygenic_SA_selection_in_the_UK_biobank and https://zenodo.org/record/6824671. SA, sexually antagonistic; SC, sexually concordant; SNP, single-nucleotide polymorphism. https://doi.org/10.1371/journal.pbio.3001768.g003

In the absence of any selection on LRS, unfolded reproductive $\hat{F}_{ST}$ is distributed as the product of 2 independent, standard normal distributions (i.e., symmetrically distributed with a mean of zero; see Section E in S1 Appendix). SA selection generates an excess of loci in the lower quantiles of this null model, while SC selection generates an excess of loci in the upper quantiles of the null. Note that sex differences in SC selection are not required to generate an excess of positive values for unfolded reproductive $\hat{F}_{ST}$ (SC selection of equal magnitude in the sexes can generate it as well), but SA selection is required to generate an excess of negative values.
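The symmetry of this null can be illustrated by simulation. The sketch below draws products of two independent standard normal variables (the null form stated above) and summarises each tail; the number of draws and the seed are arbitrary choices for illustration. For such a product, the mean of the positive values is 2/π ≈ 0.637 and the mean of the negative values is its mirror image, so both tails can be read against the same reference value.

```python
import math
import random

rng = random.Random(42)   # fixed seed so the run is repeatable
N_DRAWS = 200_000         # arbitrary; smaller than the number of SNPs analysed

# Null model: product of two independent standard normal draws.
products = [rng.gauss(0.0, 1.0) * rng.gauss(0.0, 1.0) for _ in range(N_DRAWS)]

positive = [x for x in products if x > 0]
negative = [x for x in products if x < 0]

mean_pos = sum(positive) / len(positive)
mean_neg = sum(negative) / len(negative)
frac_neg = len(negative) / N_DRAWS

print(f"mean of positive values: {mean_pos:.3f} (theory {2 / math.pi:.3f})")
print(f"mean of negative values: {mean_neg:.3f} (theory {-2 / math.pi:.3f})")
print(f"fraction of negative values: {frac_neg:.3f} (theory 0.500)")
```

Observed tail means that are larger in magnitude than these reference values, in either direction, are what the text interprets as SC (upper tail) or SA (lower tail) enrichment.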

Forms of sex-differential selection: Empirical data

As with previous metrics, we calculated unfolded reproductive $\hat{F}_{ST}$ (Eq [4]) and contrasted it against its theoretical and empirical null distributions, the latter generated by a single random permutation of LRS among the individuals of each sex. Doing so revealed that both SC and SA sites contribute to the polygenic signal of sex-differential selection affecting LRS. As predicted under SC selection, we observed an enrichment of sites in the upper quantiles of the null distributions of unfolded reproductive $\hat{F}_{ST}$ (mean among sites with unfolded reproductive $\hat{F}_{ST}$ > 0; theoretical null: 0.637; permuted null: 0.640; observed: 0.694; Wilcoxon and Kolmogorov–Smirnov tests, p < 0.001; Fig 3B and 3C). As predicted under SA selection, we observed a smaller but significant enrichment of sites in the lower quantiles of the null (mean among sites with unfolded reproductive $\hat{F}_{ST}$ < 0; theoretical null: –0.635; permuted null: –0.638; observed: –0.651; Wilcoxon and Kolmogorov–Smirnov tests, p < 0.001; Fig 3B and 3C).
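The permuted null described above can be sketched for a single locus and one sex. This is a minimal illustration with hypothetical function names and toy data, assuming genotypes are coded as A1 allele counts (0, 1, 2) and that LRS is shuffled among individuals of the same sex, which breaks any genotype-LRS association while preserving the LRS distribution. The analysis in the text applies a single permutation across all sites; here the shuffle is repeated many times at one toy locus only to show that the permuted frequency change centres on zero.

```python
import random


def adult_freq(genotypes):
    """A1 frequency among adults; genotypes are A1 counts (0, 1 or 2)."""
    return sum(genotypes) / (2 * len(genotypes))


def projected_gamete_freq(genotypes, lrs):
    """A1 frequency among projected gametes, weighting each adult by its LRS."""
    total_lrs = sum(lrs)
    return sum(0.5 * g * n for g, n in zip(genotypes, lrs)) / total_lrs


def permuted_change(genotypes, lrs, rng):
    """Adult-to-gamete frequency change after shuffling LRS within one sex."""
    shuffled = rng.sample(lrs, len(lrs))  # random permutation of the LRS values
    return projected_gamete_freq(genotypes, shuffled) - adult_freq(genotypes)


# Toy data for one sex (made up): carriers of A1 tend to have more offspring.
genotypes = [2, 2, 1, 1, 1, 0, 0, 0]
lrs = [3, 2, 2, 2, 1, 1, 0, 1]

observed = projected_gamete_freq(genotypes, lrs) - adult_freq(genotypes)
rng = random.Random(7)  # fixed seed so the run is repeatable
null_changes = [permuted_change(genotypes, lrs, rng) for _ in range(5000)]
null_mean = sum(null_changes) / len(null_changes)
print(f"observed change: {observed:+.4f}; mean permuted change: {null_mean:+.4f}")
```

Because shuffling keeps the same set of LRS values, the total number of projected gametes is unchanged and only the association between genotype and offspring number is removed.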

Functional and phenotypic effects of sex-differentiated loci

If sex-differentiated loci reflect genuine sex-differential selection (rather than random chance, genotyping errors, or population structure), such polymorphisms should be preferentially found in functionally important regions in the genome. We therefore conducted enrichment tests, both to support our inference that sex-differential selection is occurring and to explore functional effects of sex-differentiated loci. We first used LD score regression [57] to test whether sites with high sex-differentiation tend to be found in major functional categories in the genome (coding, 3′UTR and 5′UTR regions). If a given category is enriched for genuine selected SNPs, the expected heritability tagged by these SNPs (i.e., what LD score regression measures) should exceed the fraction of SNPs present in that functional category. While functional enrichment estimates were noisy and thus not statistically distinguishable from 1 (no enrichment) after multiple-testing correction (Fig 5B), each estimate consistently exceeded 1 across functional categories and metrics, suggesting that sex-differentiated loci are more likely to have phenotype-altering effects than expected by chance. Further evidence for the phenotype-altering effects of sex-differentiated loci was sought through direct comparisons between metrics of sex-differential selection and the Neale laboratory database of UK Biobank GWAS. Specifically, we used cross-trait LD score regression [58] to estimate genetic correlations between metrics of sex-differential selection and 30 phenotypes, chosen for their medical relevance and/or relationship to phenotypic sex differences.
Though many significant associations did not survive multiple testing correction (Fig 5C), several disease-relevant and quantitative traits (age at menarche, body fat percentage, diseases of the eye and adnexa, fluid intelligence, injury, neuroticism score, SHBG [sex hormone binding globulin], standing height) represent candidates for sex-differential viability and LRS selection, while other traits (testosterone, high blood pressure) represent candidates for sex-differential viability selection.

[END]
---
[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001768

Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
