(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .



A new human embryonic cell type associated with activity of young transposable elements allows definition of the inner cell mass [1]

['Manvendra Singh', 'Max-Delbrück-Center For Molecular Medicine In The Helmholtz Society', 'Berlin', 'Max Planck Institute Of Multidisciplinary Sciences', 'City Campus', 'Göttingen', 'Aleksandra M. Kondrashkina', 'Thomas J. Widmann', 'Genyo', 'Centre For Genomics']

Date: 2023-07

There remains much that we do not understand about the earliest stages of human development. On a gross level, there is evidence for apoptosis, but the nature of the affected cell types is unknown. Perhaps most importantly, the inner cell mass (ICM), from which the foetus is derived and hence of interest in reproductive health and regenerative medicine, has proven hard to define. Here, we provide a multi-method analysis of the early human embryo to resolve these issues. Single-cell analysis (on multiple independent datasets), supported by embryo visualisation, uncovers a common previously uncharacterised class of cells lacking commitment markers that segregates after embryonic gene activation (EGA) and shortly after undergoes apoptosis. The discovery of this cell type allows us to clearly define their viable ontogenetic sisters, these being the cells of the ICM. While ICM is characterised by the activity of an Old non-transposing endogenous retrovirus (HERVH) that acts to suppress Young transposable elements, the new cell type, by contrast, expresses transpositionally competent Young elements and DNA-damage response genes. As the Young elements are RetroElements and the cells are excluded from the developmental process, we dub these REject cells. With these and ICM being characterised by differential mobile element activities, the human embryo may be a “selection arena” in which one group of cells selectively die, while other less damaged cells persist.

Funding: Z.I. was funded by European Research Council, ERC Advanced [ERC-2011-ADG 294742]. L.D.H. is funded by European Research Council, ERC Advanced [ERC-2014-ADG 669207]. J.L.G.P.'s lab is supported by CICE-FEDER-P12-CTS-2256, Plan Nacional de I+D+I 2008-2011 and 2013-2016 (FIS-FEDER-PI14/02152), PCIN-2014-115-ERA-NET NEURON II, the European Research Council (ERC-Consolidator ERC-STG-2012-309433), by an International Early Career Scientist grant from the Howard Hughes Medical Institute (IECS-55007420), by The Wellcome Trust-University of Edinburgh Institutional Strategic Support Fund (ISFF2) and by a private donation by Ms Francisca Serrano (Trading y Bolsa para Torpes, Granada, Spain). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

In broad outline, we understand early human development. From the zygote, the embryo progresses through the 2-cell stage, to 4-cell, to 8-cell (E3), to morula (E4), and thence to blastocyst (E5 onwards). Within the blastocyst is the inner cell mass (ICM) that gives rise to the epiblast and thence to the embryo. Single-cell transcriptomic analysis [1–4] has permitted a clearer definition of many cell types, both expected and unknown, and their ontogenetic derivatives. For example, recent analysis reveals a population of cells with trophectoderm (TE) and epiblast (EPI) markers at E6 that give rise to primitive endoderm (PrE) [3]. Nonetheless, there remains much that is not fully resolved. Perhaps most importantly, the key cell type from which the embryo is derived, the ICM, has for a long while remained less well resolved in both its ontogeny and its definition ([2], see also [3,5]). Indeed, for a considerable period, ICM was thought too transitory to capture and identify [2,5]. The definition is, however, a key goal, not least because it would enable a better understanding of human pluripotency and the comparative nature of laboratory model cells, human embryonic stem cells (hESCs).

While expression of certain pluripotency genes and transcription factors all but define ICM (e.g., NANOG, GATA6), recent further characterisation additionally included apoptotic genes (e.g., in [6]), which is perhaps unexpected. More generally, programmed cell death, apoptosis, has been observed during the cleavage stage of human in vitro fertilisation (IVF) embryos (reviewed in [7]) and is common at the morula and blastocyst stages in other mammals [8]. In which cell types this occurs has yet to be discerned. Understanding the biology of such cells may allow us to understand why such apoptosis is happening.

In this context, we are especially interested in the activity of transposable elements in the early human. While about half of our genome consists of remnants of transposable elements [9], transposition of some Young retroelements (REs) (<7 MY, human–chimpanzee split), mobilised via retrotransposition, is possible [10,11]. That Young REs are transcriptionally active in early human embryos [12,13] is then intriguing. Whether this implies transpositional activity, however, is another issue. Nonetheless, to propagate to the next generation, REs must transpose either in germline or pre-germline cells. Thus, transpositional activity in the early human embryo would not be surprising. That the host evolves to suppress transposing elements [14] indicates an ongoing conflict between hosts and REs, consistent with insertion events generating both intra-clone diversity and fitness variation.

Here then, we seek to define the cell types of the early human embryo more fully, with a particular focus on the activity of transposable elements. This we do via analysis of single-cell transcriptomic data, from visualisation of early human embryos and from experiments employing hESCs. We report a population of cells (replicated in independent datasets) that does not correspond to the prior “unspecified” transitory cells [2], these having heterogeneous expression of commitment markers. This new cell type is, we find, associated with the activity of Young REs, DNA damage, and apoptosis. As these Young elements are retroelements and the cells are developmentally excluded, we dub this new type REject cells. Visualisation and single-cell analysis agree that approximately 20% of the cells of the early embryo are such cells. Having defined these, we can better characterise ICM and define marker genes. We note that Radley and colleagues [4], using a state-of-the-art set of algorithms, recently acknowledged that they could confirm the same marker genes as we report (presented in the early release of the present paper [15]). We additionally find that this newly defined ICM has no apoptotic features and suppresses Young transposable elements. In accord with the view that host defence systems are often controlled by co-option of other mobile elements [16], we find that an older RE (HERVH) is associated with this suppression. We discuss what the association between Young REs and REjects and Old REs and ICM might imply.

Results

NCCs overexpress DNA damage response markers and apoptotic genes

While the above suggests that NCCs have no evident developmental future, more definitive evidence for the lack of an ontogenetic future would be evidence for cell death. Compared to committed cell types, the top marker of NCCs is the apoptosis-inducing factor BIK (BCL2-Interacting Killer) (Fig 1D and 1E), accompanied by several other genes associated with programmed cell death (e.g., BAK1, BAX, and various caspases) (S3A Fig), suggesting that NCCs are likely to be eliminated from the developmental program owing to programmed cell death. For confirmatory caspase staining in human embryos see below. The NCCs also show hallmarks of DNA damage, a likely precursor to apoptosis. As regards DNA damage, we observe, specifically in NCCs, the up-regulation of pre-apoptotic genes and multiple DNA damage response genes, including TP53I3 and TFEB [19] (S3D and S3E Fig), indicative of the activation of TP53-associated DNA damage signalling. To provide independent evidence of DNA damage, we performed γ-H2AX/NANOG co-staining of E5 blastocysts (Figs 2A and S4A). As NANOG marks pluripotent ICM/EPI cells, we expect few cells to be stained for both NANOG and γ-H2AX. As expected, pluripotent cells stained by NANOG do not show γ-H2AX staining, while a fraction of blastocyst cells that fail to express NANOG show DNA damage (Figs 2A and S4A). These results are consistent with the possibility that cells expressing high levels of DNA damage markers are those that do not fall in the pluripotency trajectory and are prone to being excluded from the developmental process [20]. In contrast, the undamaged cells express pluripotency factors and are likely to form the ICM.

Evidence for transposition activity of Young REs and for apoptosis in the human embryo, but not in ICM

While we do not discount the possibility of multiple mechanisms of RE-driven toxicity [20,22], we hypothesised that NCCs might be affected by, or correlated with, the insertion of REs [23], as seen in oogenesis [24,25] and neurogenesis [26]. In humans, transpositionally active REs include Long Interspersed Element class 1 (LINE-1 or L1/L1_Hs), and the non-autonomous SVA and Alu elements, mobilised by active L1 [10,11]. To examine this, we consider 2 approaches. First, we determine the expression and chromatin status of full-length “Hot” L1 elements. Second, we test for the presence of L1-encoded ORF1p protein in human embryos. There are 89 full-length intact L1 loci in the human genome, potentially expressing both ORF1 and ORF2, required for retrotransposition [27]. To determine whether the L1s express both ORFs, we first calculated the coverage of RNA-seq reads over the full-length L1s in E3, E4, and E5 embryos (S5B Fig) and found the alignment of RNA-seq reads spanning their entire length, suggesting that these L1 elements express both ORFs. To determine whether the expression is driven by the L1 promoter or is a read-through, we analysed the available ATAC-seq data from human 8-cell and ICM, as well as DNAse-seq data from morula and blastocyst [28]. We observed significant enrichment of ATAC-seq and DNAse-seq normalised signals over the promoter sequences of full-length L1s (S5B Fig). Overall, our analysis indicates that L1s are expressed as full-length transcripts, including both ORFs (1 and 2), and that they are likely to be driven by their canonical promoters. Six of the 89 L1 loci are activated in many cancers and are deemed the “ultra-hot” L1 elements [10]. We identified 4 of the 6 “ultra-hot” L1 elements as a potential source of L1 activity in early human embryos (S5C Fig).
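The idea of checking whether reads span the entire length of an element can be illustrated with a small calculation of coverage breadth, that is, the fraction of an element's bases covered by at least one read. The sketch below is not the authors' pipeline (their code is deposited at the Zenodo DOI given in the Fig 3 legend); the function names, the half-open interval convention, the round 6,000 bp element length, and the toy reads are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Half-open interval [start, end) in element coordinates (an assumed convention).
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching half-open intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval if this one overlaps or abuts it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage_breadth(element_length: int, reads: Iterable[Interval]) -> float:
    """Fraction of the element's bases covered by at least one read."""
    # Clip reads to the element boundaries and drop reads that fall outside it.
    clipped = [(max(0, s), min(element_length, e)) for s, e in reads]
    clipped = [(s, e) for s, e in clipped if s < e]
    covered = sum(e - s for s, e in merge_intervals(clipped))
    return covered / element_length


# Hypothetical toy data: a 6,000 bp element with two sets of aligned reads.
spanning_reads = [(0, 2500), (2000, 4000), (3900, 6000), (100, 300)]
end_only_reads = [(0, 1000), (5000, 6000)]

full = coverage_breadth(6000, spanning_reads)     # reads tile the whole element
partial = coverage_breadth(6000, end_only_reads)  # only the two ends are covered
```

A breadth close to 1 is what one would expect for a transcript covering both ORFs, whereas a low breadth would point to partial or truncated transcripts. Breadth alone does not distinguish promoter-driven expression from read-through, which is why the text turns to chromatin accessibility data for that question.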
While the above analysis suggests L1 activity in morula and blastocyst, transcripts identified above may fail to provide functional translatable RNA. It is also unclear whether L1-ORFs are preferentially expressed in particular E5 blastocyst lineages. To resolve this, we performed co-immunostaining of the L1-encoded ORF1p and the pluripotency factor POU5F1/OCT4, which is expressed relatively highly in E5 blastocyst. We detect robust expression of ORF1p in a substantial fraction of E5 blastocyst cells (Fig 3A). As predicted, there is an inverse correlation between the expression of ORF1p and POU5F1/OCT4. We find that OCT4+ stained cells are not stained for L1-ORF1p. The cells that show a high intensity of POU5F1/OCT4 staining are compacted near polar TE to form probable ICM, whereas L1-ORF1p stained cells are excluded from the developing ICM (Figs 3A and S5A). In contrast to ICM, L1-ORF1p expression is readily detectable in the trophectoderm (Fig 3A and S1 Movie, see also [29]) in which the cost to the organism is lower, TE being a transient structure (see Discussion).

Fig 3. LINE-1 expressing cells are excluded from the ICM. Code to generate these figures is at doi.org/10.5281/zenodo.7925199.

(A) Representative confocal images show immunofluorescence staining in human early (E5) blastocysts with anti-POU5F1/OCT4 (nuclear, green), L1-ORF1p (cytoplasmic granular, red), DAPI (nuclear, blue). Note: POU5F1+ cells are significantly enriched in the ICM (circled) and compacting near polar TE. A violin plot (upper left panel) visualises the density and expressional dynamics of POU5F1 in pre-TE, NCC, and ICM at E5. Solid red dots represent the median, while quartiles are represented in the default pattern of boxplots inside the violin plots. Co-staining demonstrates the mutually exclusive expression of POU5F1 and L1-ORF1p during the formation of blastocyst. The cells expressing higher POU5F1 compacting to form the ICM at the polar region of the blastocyst are less well stained for L1-ORF1p. L1-ORF1p stains scattered cells and pre-TE, not included in the compacted population of cells (arrows). L1 (LINE-1_Hs) belongs to a group of mutagenic, Young REs and supports transposition of both LINE-1 and the non-autonomous Alu and SVA elements. Magnification is 40×. See also S1 Movie. (Bottom panel) Numerical analysis of L1-ORF1p expression in POU5F1- vs. POU5F1+ cells in the E5 embryo. The graph shows the average number of L1-ORF1p cytoplasmic foci in POU5F1- and POU5F1+ cells, with standard deviation. Note: pre-TE cells were not considered for this analysis.
(B) Representative confocal immunofluorescent images of human E5 blastocysts co-stained against cleaved caspase 3 (cl_Caspase3) (red), L1-ORF1p (green), and DAPI (blue); (brightfield, black-and-white panels). The depicted stage III apoptotic cell overexpresses L1-ORF1p, is marked by cl_Caspase3, and has a disintegrated nucleus (Stage II, narrower arrow). Note that while the expression of pro-apoptotic markers is fluctuating in the embryo, the L1-ORF1p and cl_Caspase3 co-staining could unambiguously mark the cells that both overexpress L1-ORF1p and apoptose (specific to NCC). Two representative experiments are shown. Magnification is 63× (left embryo). The framed section is zoomed out to show the co-stained cells. The Venn diagram shows the quantification of overexpressed L1-ORF1p/cl_Caspase3 marked cells at effective E5 (data from 4 independent human embryos): total number of cells from 4 embryos, 380; L1-ORF1p+, 296; cl_Caspase3+, 92; L1-ORF1p+/cl_Caspase3+, 46. See also [29]. Timing of the embryo is inferred from the state of progression of the embryo, as IVF embryos can have absolute timings different from classical. With the blastocyst still being formed, we infer this to be E5 equivalent.
(C) Phylogenetically young (<7 MY) and old (>7 MY) REs are antagonistically expressed in NCCs and the ICM. The scatterplot shows the comparison of normalised mean expression in CPM of various RE families between the averaged pool of ICM (x-axis) and NCC (y-axis) cells. Read counts per RE family are normalised to total mappable reads per million. Note: The top candidates are shown for both Young REs and Old REs. Young REs include LTR5_HS, AluY, SVA, L1_Hs that are human-specific and the HERVs are either specific to Hominoid or Eutherians. Uniquely mapped reads were considered as 1 alignment per read. Multimapping reads were considered as 1 alignment only if they were mapped to multiple loci, but exclusively within an RE family. Every dot corresponds to an RE family. RE families enriched in ICM (red) vs. NCC (blue).
(D) Boxplot showing the distribution of averaged RE expression in ICM vs. NCCs. Note: 7 My distinguishes Old and Young REs (e.g., inserted before and after the split of human and chimp approximately 7 million years ago (Mya) [13,89]).
(E) Boxplots showing the expression distribution of “Hot” L1 elements in the human embryonic development stages. Every dot represents a locus of “Hot” L1.
(F) Combined boxplots and heatmaps showing distinct pattern of highly expressed transposable element families at day 3 (8-cell) and day 5 (bulk-ICM) (GSE101571) of human preimplantation embryogenesis. Note: SVA_D and HERVH-int are the most abundant REs in the transcriptome of 8-cell and bulk-ICM, respectively, and possess an opposite dynamic of expression.
(G) RNA transcript intensity and density of differentially expressed RE families across the cell types of E4 and E5, following the subtraction of NCCs markers. The dot colour shows average expression and scales from blue to red, corresponding to lower and higher expression, respectively. The size of the dot is directly proportional to the percentage of cells expressing the REs in a given cell type. Note that RE expression can be considered a highly specific lineage marker. While HERVH-int is expressed both in morula (M1) and ICM, it is specifically driven by LTR7B and LTR7, respectively.
CPM, counts per million; ICM, inner cell mass; NCC, not-characterised cell; RE, retroelement; TE, trophectoderm.
https://doi.org/10.1371/journal.pbio.3002162.g003

While large-scale transcriptional up-regulation of REs might itself trigger apoptosis [30], we asked if the DNA damage, potentially induced by L1_Hs expression (e.g., transposition of L1_Hs and SVA/Alu), might correlate with the apoptotic process in NCCs. To see if the L1-overexpressing cells are associated with apoptosis, as the single-cell transcriptomic data would suggest, we performed a co-staining with antibodies to L1-ORF1p and an early apoptotic marker, cleaved Caspase 3 (cl_Caspase3) (Fig 3B). While L1-ORF1p marks both TrEs/NCCs, and pre-apoptotic gene expression has dynamic fluctuations in the embryo, only NCCs are expected both to overexpress L1 and to have a fraction of cells that are apoptotic at any given time.
Quantifying the double-stained cells provides an estimate of the timing and the number of NCCs. Indeed, we observed double-positive cells after morula at E5 (6/6 embryos). At E5, we observed that up to approximately 20% of cells overexpress both L1-ORF1p and cl_Caspase3 (Fig 3B) and that apoptotic cells showed evident signs of nuclear fragmentation. The approximately 20% estimate from confocal analyses of E5 co-stained embryos accords with the approximately 20% to 30% estimate derived from transcriptome analyses.
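As a worked illustration of the arithmetic, the pooled Venn-diagram counts reported in the Fig 3B legend (380 cells in total, 296 L1-ORF1p+, 92 cl_Caspase3+, 46 double positive, from 4 embryos) can be converted into proportions. This is only a sketch: the variable names are ours, and because the counts are pooled across embryos, the resulting proportions need not match the per-embryo "up to approximately 20%" figure quoted in the text.

```python
# Pooled counts from 4 embryos, as listed in the Fig 3B legend.
total_cells = 380
l1_orf1p_pos = 296   # cells marked by overexpressed L1-ORF1p
cl_casp3_pos = 92    # cells marked by cleaved Caspase 3
double_pos = 46      # cells marked by both

# Proportions of double-positive cells relative to three different denominators.
frac_of_all_cells = double_pos / total_cells
frac_of_casp3_pos = double_pos / cl_casp3_pos
frac_of_l1_pos = double_pos / l1_orf1p_pos

print(f"double positive / all cells:    {frac_of_all_cells:.1%}")   # 12.1%
print(f"double positive / cl_Caspase3+: {frac_of_casp3_pos:.1%}")   # 50.0%
print(f"double positive / L1-ORF1p+:    {frac_of_l1_pos:.1%}")      # 15.5%
```

The choice of denominator matters for interpretation: the first proportion estimates how common double-positive cells are in the embryo as a whole, while the other two describe the overlap between the two markers.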

ICM expresses Old REs while NCC expresses Young ones

Our data indicate that ICM avoids the expression of L1_Hs elements. To see a global picture of RE expression in ICM and its sister lineage NCC, we analysed single-cell RNAseq data [1,2] and used averaged expression (Log2 CPM+1) of each RE family across the cell populations (Fig 3C). We observe a transcriptional pattern of REs that distinguishes the clusters of ICM from NCC shown on the DM (Fig 3C). We find that overexpressed REs in the NCC are relatively phylogenetically young (<7 MY, human–chimpanzee split) and include potentially mutagenic elements. In contrast, these same RE families are relatively quiet in ICM (Fig 3D). Activated Young REs in NCCs include potentially mutagenic retrotransposons [31], such as SVA_D/E, AluY (Ya5), L1_Hs (Fig 3C). In contrast, ICM showed the overexpression of relatively older REs, some of which have known regulatory activities [13,32]. The Old REs, none of which are transpositionally competent, are dominantly represented by their full-length versions: LTR2B/ERVL18, LTR41B/ERVE_a, LTR17/ERV17, LTR10/ERVI, MER48/(H)ERVH48, and LTR7/(H)ERVH in ascending order of average expression (Fig 3C). Transcriptional activation of HERVH-int in ICM coincides with the expression of the regulatory LTR7 that provides a binding platform for several pluripotency factors (NANOG, POU5F1, SOX2) [33–39], suggesting a potential contribution of its regulatory activity in human pluripotency.
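The per-family quantification described here and in the Fig 3C legend (one alignment per uniquely mapped read, multimapping reads kept only when all their loci fall within a single RE family, normalisation to total mappable reads per million) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the input representation (a list of RE family names per read), the function names, the toy reads, and the reading of "Log2 CPM+1" as log2(CPM + 1) are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, Iterable, List


def count_reads_per_family(read_hits: Iterable[List[str]]) -> Counter:
    """Count one alignment per read, per RE family.

    Each item lists the RE family of every locus one read aligned to.
    A read is counted if it is unique, or if it is a multimapper whose
    loci all belong to the same family; other reads are skipped.
    """
    counts: Counter = Counter()
    for families in read_hits:
        distinct = set(families)
        if len(distinct) == 1:
            counts[distinct.pop()] += 1
        # Reads hitting several families (or no RE at all) are not counted.
    return counts


def log2_cpm(counts: Dict[str, int], total_mappable_reads: int) -> Dict[str, float]:
    """Convert raw counts to log2(CPM + 1), with CPM scaled by total mappable reads."""
    return {
        family: math.log2(n * 1e6 / total_mappable_reads + 1)
        for family, n in counts.items()
    }


# Hypothetical toy input: five reads and the families of the loci they hit.
toy_hits = [
    ["L1_Hs"],             # unique read
    ["L1_Hs", "L1_Hs"],    # multimapper confined to one family: counted once
    ["SVA_D", "AluY"],     # multimapper across families: skipped
    ["HERVH-int"],         # unique read
    [],                    # read outside any RE: skipped
]
toy_counts = count_reads_per_family(toy_hits)
toy_expr = log2_cpm(toy_counts, total_mappable_reads=1_000_000)
```

Averaging such values over the cells of each population would give one point per RE family for an ICM versus NCC comparison. Keeping family-confined multimappers is what makes family-level estimates possible for young, highly similar elements, at the cost of losing locus-level resolution.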

RE-associated between-cell heterogeneity is identifiable at the morula stage

If NCCs are the products of between-cell heterogeneity in RE activities, we might expect to see such heterogeneity a little before NCC formation with a subclass of cells expressing Young REs. Owing to the nature of the data (pooled from different samples but ascribed the same approximate timings), we cannot say whether between-cell heterogeneities occur at the same time (contemporary) or one after the other (sequential). Given this constraint, we aim to make no strong statements about the origin of NCCs prior to E5, which would require targeted analysis. Nonetheless, at E4 we identify 2 distinguishable cell clusters (S5D Fig). While LEUTX1 flags the 8-cell stage, the 2 clusters of human morula are marked by GATA3 and HKDC1, respectively (S5D Fig). Known blastocyst signature genes [2,40] (e.g., TFCP2L1/LBP9, ESRRB, DNMT3L) are expressed in all morula cells, but predominantly in M1 (S6A Fig). For diagnostic markers for each lineage, see S6B Fig. Trajectory analysis including E3 to E5 allows speculation that morula-derived HKDC1 marked M2 cells may eventually accumulate in NCCs, whereas the committed M1 GATA3-positive cells are further trackable towards pre-TE (S6C Fig). If this is an early fate decision correlated with RE expression profiles, then we expect the Young REs to be seen in the M2 population or at least within morula or earlier. Consistent with activity of Young REs somewhere in morula, we observe the up-regulation of “hot” L1_Hs elements in morula (Fig 3E). We also see the antagonistic expression of the Young versus Old elements (e.g., SVA_D/F and HERVH) between 8-cell stage and ICM (Fig 3F), indicative of activity of Young REs earlier rather than later in development. The M1, M2, ICM, NCC, and TE lineages are specifically marked also by the expression of distinct RE families (Fig 3G).
While the most highly enriched REs are different in M2 and NCC, some Young REs are overexpressed in both M2 and NCC, such as SVA_D (Fig 3G). In both cell types, the most highly enriched REs are Young REs (both cell types have overexpressed representatives from AluY and SVA families). Old ones, however, also have particular expression patterns: different LTR7 variants (LTR7B and LTR7) feature in morula and ICM, respectively (Fig 3G), indicating that different subfamilies of LTR7/HERVH are expressed at different stages of early development [38]. The clusters display indistinguishable expression of G1-S-G2/M cell cycle stage markers (S6D and S6E Fig), so are not cell cycle-mediated artefacts. We conclude only that there exists a subpopulation of cells at E4 that, like NCCs, is permissive for the expression of Young REs. These observations warrant further investigation to fully decipher whether morula already contains the lineage that precedes NCC.

[END]
---
[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002162

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.

via Magical.Fish Gopher News Feeds:
gopher://magical.fish/1/feeds/news/plosone/