(C) PLOS One [1]. This unaltered content originally appeared in journals.plosone.org.
Licensed under Creative Commons Attribution (CC BY) license.
url:
https://journals.plos.org/plosone/s/licenses-and-copyright
------------
Highly conserved and cis-acting lncRNAs produced from paralogous regions in the center of HOXA and HOXB clusters in the endoderm lineage
['Neta Degani', 'Department Of Biological Regulation', 'Weizmann Institute Of Science', 'Rehovot', 'Yoav Lubelsky', 'Rotem Ben-Tov Perry', 'Elena Ainbinder', 'Department Of Life Sciences Core Facilities', 'Igor Ulitsky']
Date: 2021-10
Long noncoding RNAs (lncRNAs) have been shown to play important roles in gene regulatory networks acting in early development. There has been rapid turnover of lncRNA loci during vertebrate evolution, with few human lncRNAs conserved beyond mammals. The sequences of these rare deeply conserved lncRNAs are typically not similar to each other. Here, we characterize HOXA-AS3 and HOXB-AS3, lncRNAs produced from the central regions of the HOXA and HOXB clusters. Sequence-similar orthologs of both lncRNAs are found in multiple vertebrate species and there is evident sequence similarity between their promoters, suggesting that the production of these lncRNAs predates the duplication of the HOX clusters at the root of the vertebrate lineage. This conservation extends to similar expression patterns of the two lncRNAs, in particular in cells transiently arising during early development or in the adult colon. Functionally, the RNA products of HOXA-AS3 and HOXB-AS3 regulate the expression of their overlapping HOX5–7 genes both in HT-29 cells and during differentiation of human embryonic stem cells. Beyond production of paralogous protein-coding and microRNA genes, the regulatory program in the HOX clusters therefore also relies on paralogous lncRNAs acting in restricted spatial and temporal windows of embryonic development and cell differentiation.
Each of the four Hox clusters in vertebrate genomes encodes up to 11 transcription factors whose activity is extensively regulated spatially and temporally, and which help determine the developmental and adult transcriptome in space and time. These Hox transcription factors belong to 13 homology groups, and Hox clusters also encode various noncoding transcripts, including microRNAs and long noncoding RNAs (lncRNAs). We characterize in detail two lncRNAs, HOXA-AS3 and HOXB-AS3, which are transcribed from matching regions in the HOXA and HOXB clusters, respectively. These lncRNAs are highly conserved in vertebrate evolution and transcribed antisense to Hox protein-coding genes from groups 5–7. Beyond the matching positions, the promoters of HOXA-AS3 and HOXB-AS3 share sequence similarity, their expression patterns are correlated with each other, mostly in the endoderm lineage, and they positively regulate the expression of the Hox protein-coding genes that they overlap. Regulation by lncRNAs thus appears to be an ancestral feature of HOX clusters, likely pre-dating the duplication of the Hox clusters at the root of the vertebrate lineage.
Funding: This study was funded by grants from the US-Israel Binational Science Foundation (grant Number 2015171), Minerva Foundation, Israel Science Foundation grant 1242/14 and European Research Council grant lincSAFARI, all to IU. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Copyright: © 2021 Degani et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Introduction
Over the past decade, genome-wide transcriptome analyses have revealed a plethora of noncoding RNAs that are expressed from a large number of genomic loci. Among those noncoding genes are long noncoding RNAs (lncRNAs), RNA polymerase II (Pol II) products that are longer than 200 nt. Like mRNAs, lncRNAs begin with a 5’ cap and end with a poly(A) tail. To date, thousands of lncRNAs have been reported in different vertebrates [1,2], and it is not yet known how many of them are functional or what the full extent of their biological diversity is. Many lncRNAs display highly restricted expression profiles during development, potentially allowing them to control gene expression in specific cellular contexts [2,3]. Some lncRNAs have indeed been shown to contribute to proper embryonic development [4].
Mouse and human Hox genes are organized in four genomic clusters (HOXA to HOXD) that exhibit a unique mode of transcriptional regulation, temporal and spatial collinearity, in which the position of the genes along the chromosome roughly corresponds to the time and place of their expression during development. The sequential activation of Hox genes in the primitive streak helps determine the subsequent pattern of expression along the anterior–posterior axis of the embryo [5,6]. Despite the crucial importance of Hox genes during development [7], the molecular pathways that dictate their collinear expression are not fully understood.
Noncoding RNAs are likely to play important roles in Hox gene regulation. For example, Hox clusters encode two conserved miRNAs, miR-10 and miR-196, that target some of the Hox genes and help establish specific regulatory programs in the embryo [8,9]. One of the first lncRNAs to be studied in detail, HOTAIR, is produced from the HOXC cluster and was reported to regulate expression of HOXD genes [10]. Since this seminal discovery, numerous lncRNAs have been implicated in Hox gene regulation [11]. For example, HOTTIP, a lncRNA located at the 5’ end of the HOXA cluster, was shown to control activation of 5’ HOXA genes in cis via cooperation with an MLL histone methyltransferase complex and chromosomal looping that brings it into close proximity with 5’ HOXA gene loci [12].
The protein-coding genes in the four vertebrate Hox clusters belong to 13 groups of orthologs that can be traced to ancestral clusters that existed before the two rounds of genome duplication [13]. The two conserved microRNA families encoded in the Hox clusters, miR-10 and miR-196, are represented in multiple clusters [14]. lncRNAs have been described in each of the four clusters, but so far there have been no known cases of clear similarity between lncRNAs across clusters. Here, we focus on a pair of lncRNAs that appear to be among the most conserved lncRNAs produced from the vertebrate Hox clusters: HOXA-AS3 and HOXB-AS3. We provide evidence that production of these lncRNAs likely preceded the duplication of the ancestral Hox cluster into HOXA and HOXB. Both lncRNAs are expressed predominantly in the embryo, with expression patterns more similar to each other than to nearby protein-coding genes. In the adult, HOXA-AS3 expression is mostly restricted to tissues of endodermal lineage, and specifically to immature goblet cells and tuft cells. The similar expression of HOXA-AS3 and HOXB-AS3 is likely driven by conserved and shared binding sites for CDX transcription factors in the HOXA-AS3 and HOXB-AS3 promoters. Using human cell lines and human embryonic stem cells, we show that perturbation of HOXA-AS3 and HOXB-AS3 expression results in corresponding changes in expression of HOX-6 and HOX-7 genes. These results suggest coordinated and ancient lncRNA production from central regions of the Hox clusters that plays important cis-acting gene regulatory roles in cells of the endodermal lineage.
Results
A pair of conserved lncRNAs in the middle of HOXA and HOXB clusters
The central regions of HOX clusters give rise to a large variety of transcription products that undergo extensive alternative splicing (S1A Fig). We first focused on HOXA-AS3, whose main transcription start site lies ~700 nt downstream of the annotated 3’ end of HOXA5 and which is transcribed antisense to HOXA5 and HOXA6, terminating in the single intron of HOXA7 (Figs 1A and S1A). The region in the mouse genome that aligns to the HOXA-AS3 promoter is the promoter of Hoxaas3 (2700086A05Rik), which terminates in the intergenic region between Hoxa6 and Hoxa7 (Fig 1A). The promoter of HOXA-AS3 is highly conserved in other vertebrates, but transcripts originating from it are not consistently annotated, likely because its expression in adult tissues is very restricted and it is expressed predominantly in the embryo (see below). Using available RNA-seq data, we could identify orthologs of HOXA-AS3 in opossum and X. tropicalis (Figs 1A and S1). Transcription of these orthologs, similarly to that of the human HOXA-AS3, started ~500 nt downstream of the 3’ end of HOXA5 and ended in the intron of HOXA7. HOXA-AS3 exhibited significant sequence similarity with the orthologs from mouse, opossum, and X. tropicalis (BLAST E-value < 10^−40). Notably, homology with the X. tropicalis ortholog was restricted to the region overlapping HOXA7.
Fig 1. Orthologs of HOXA-AS3 and HOXB-AS3 in different vertebrate species. Transcript models annotated by Ensembl, RefSeq, or PLAR [61], or manually reconstructed based on RNA-seq data (see S1 Fig) for HOXA-AS3 (A) and HOXB-AS3 (B) are shown alongside the annotated protein-coding genes in the locus. The lncRNAs are transcribed from the ‘+’ strand and all other genes are transcribed from the ‘-’ strand. The regions of HOXA-AS3 and HOXB-AS3 are shaded.
https://doi.org/10.1371/journal.pgen.1009681.g001
HOXB-AS3 transcription in human starts ~900 nt downstream of the 3’ end of HOXB5 and terminates in the intergenic region between HOXB6 and HOXB7 (Figs 1B and S1A). Presumably because of its broader expression compared to HOXA-AS3, orthologs of HOXB-AS3 were readily identifiable in more species. In mouse, it is annotated as Hoxb5os (0610040B09Rik), and we could identify orthologs in opossum, X. tropicalis, coelacanth, spotted gar, medaka, and elephant shark (Figs 1B and S1). HOXB-AS3 exhibited significant sequence similarity with the orthologs from mouse and opossum (BLAST E-value < 10^−40), but not with more distant species. Comparison of the sequences with LncLOOM [15] identified four motifs conserved in mammals and in X. tropicalis, but no deeper conservation was detected (S1 Dataset). Both HOXA-AS3 and HOXB-AS3 show negative PhyloCSF [16] scores throughout the locus (S2 Fig), and so it is unlikely that they encode highly conserved proteins. Notably, a primate-specific protein has recently been found to be encoded by HOXB-AS3 [17] (see Discussion).
The corresponding positions of the two lncRNAs and the high conservation of their presence in other species prompted us to compare the sequences of their promoters. BLAST comparison of the corresponding promoters from the HOXA and HOXB clusters found significant homology in representative vertebrate species all the way to the cartilaginous fish elephant shark (E-value 6e-31 in human, 5e-32 in mouse, 1e-21 in Xenopus, 73–80% base identity). Mapping the transcription start sites of HOXA-AS3 and HOXB-AS3 transcripts based on RNA-seq data (where available) suggested that the precise position of transcription initiation varies between the clusters and, to a lesser extent, between the species (S3A Fig). Among the highly conserved sequences preserved in both classes, we note a pair of tandem binding sites for the CDX1/2 proteins, CCATAAA and CCATTAAA [18], that appear once on the sense and once on the antisense strand. When considering all the human promoters annotated in FANTOM5.5 [19], the HOXA-AS3 promoter contained 11 predicted CDX binding sites, a number larger than that in 99.95% of those promoters (only 83 of the 200K promoters had 11 predicted sites or more). The HOXB-AS3 promoter had two predicted sites, a number comparable to that of several other Hox genes (S3B Fig).
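The motif observation above can be illustrated with a minimal sketch that scans a sequence for exact matches to the two quoted CDX1/2 sites on both strands. This is not the prediction method used in the study, which is not described here; the function names and the toy sequence are hypothetical, and exact string matching is a simplifying assumption (real binding-site prediction typically uses position weight matrices). The last lines redo the percentile arithmetic from the text, taking the rounded "200K" figure at face value.

```python
# The two CDX1/2 motifs quoted in the text; everything else here is illustrative.
CDX_MOTIFS = ("CCATAAA", "CCATTAAA")

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, motifs=CDX_MOTIFS):
    """Return sorted (start, strand, motif) tuples for exact matches.

    A '-' hit means the motif's reverse complement occurs in `seq`,
    i.e. the motif itself lies on the antisense strand.
    """
    seq = seq.upper()
    hits = []
    for motif in motifs:
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, strand, motif))
                start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Toy sequence (made up, NOT a real promoter): one sense and one antisense site.
toy = "GG" + "CCATAAA" + "ACGT" + reverse_complement("CCATTAAA") + "CC"
sites = find_sites(toy)

# Percentile arithmetic from the text: 83 of ~200,000 promoters have >= 11 sites.
pct_below = 100 * (1 - 83 / 200_000)
```

With these inputs, `sites` holds one sense-strand and one antisense-strand hit, and `pct_below` comes out just under 99.96, in line with the figure quoted in the text.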
HOXA-AS3 and HOXB-AS3 RNA products are required for their cis-regulatory activity
In order to differentiate between the potential effects on chromatin caused by the use of the KRAB effectors and the effects of the transcription or the RNA products of HOXA-AS3 and HOXB-AS3, we used RNAi to target the RNA products of HOXA-AS3 and HOXB-AS3. First, we transfected siRNA pools targeting HOXA-AS3 or HOXB-AS3 into HT-29 cells. This resulted in a substantial reduction in RNA levels for both HOXA-AS3 and HOXB-AS3 and a concomitant reduction in the expression of neighboring genes that was similar to the effects observed with CRISPRi (Fig 4A and 4B). When HOXA-AS3 was reduced by 60%, HOXA5/6/7 were significantly downregulated by 20–45% (Fig 4A). Similarly, when HOXB-AS3 was reduced by ~40%, there was a significant downregulation of HOXB5 and HOXB6 (Fig 4B). As an alternative approach, a stably expressed shRNA targeting HOXA-AS3, introduced via lentiviral infection, led to a stronger effect with the same trend as that observed using CRISPRi and siRNA, where knockdown (KD) of the lncRNA was accompanied by a decrease in expression of the neighboring genes (Fig 4C). HOXA-AS3 and HOXB-AS3 RNA products are therefore important for regulation of the adjacent genes.
Fig 4. RNA products of HOXA-AS3 and HOXB-AS3 are required for regulation of their adjacent Hox genes. (A-C) qRT-PCR measurements of the indicated genes in HT-29 cells treated with the indicated reagents. Normalized to Actin. n = 4 for siHOXA-AS3 and siHOXB-AS3. n = 3 for shHOXA-AS3. *: P<0.05, **: P<0.005, ***: P<0.0005. Two-sided t-test compared to the transfection control. (D) Changes in gene expression in RNA-seq data of HT-29 cells treated with HOXA-AS3 or HOXB-AS3 siRNAs. Shown are HOXA-AS3, HOXB-AS3, and all other HOX genes with average FPKM≥1. Asterisks indicate adjusted P<0.05 as computed by DESeq2.
https://doi.org/10.1371/journal.pgen.1009681.g004
In order to characterize more broadly the consequences of down-regulation of HOXA-AS3 and HOXB-AS3, we used RNA-seq to profile transcriptome-wide gene expression in HT-29 cells treated with siRNAs targeting these lncRNAs or with a non-targeting control. RNAi resulted in reduced expression of the lncRNAs, together with reduced expression of the overlapping genes and a broad, mild reduction in expression of genes in the HOXA and HOXB clusters (HOXC and HOXD clusters are mostly silent in HT-29 cells) (Fig 4D and S2 Table), with more significant effects observed in the HOXB cluster, which is overall more highly expressed than HOXA in HT-29 cells (S7A Fig). In the case of HOXB-AS3, it was apparent that the KD had a strong effect on the levels of the overlapping HOXB5–7 genes relative to the other HOX genes. The repressive effect of HOXA-AS3 KD on HOXB genes, and of HOXB-AS3 KD on HOXA genes, was validated by qRT-PCR following siRNA KD or CRISPRi of these genes (S7A and S7B Fig). These results suggest that loss of HOXA-AS3 and HOXB-AS3 has broad effects on expression of genes from the HOXA and HOXB clusters. Beyond the effect on the expression of HOX genes, HOXB-AS3 KD had a larger effect on gene expression (S7C Fig), consistent with its higher expression levels in HT-29 cells. Analysis of the gene expression changes using GOrilla [32] (S2 Table) showed that HOXA-AS3 KD was associated with a significant reduction in genes related to cell cycle and proliferation (top down-regulated GO category “mitotic cell cycle process”, adjusted P = 1.52×10^−6), consistent with its positive effect on proliferation reported in other cell lines [27,33,34] (see Discussion). HOXB-AS3 KD led to a significant up-regulation of genes whose protein products are involved in ncRNA processing, and specifically in rRNA processing (adjusted P = 5.92×10^−5), potentially related to its reported functions in rRNA biogenesis observed in leukemia cells [35]. The changes in gene expression outside of the HOX clusters following HOXA-AS3 or HOXB-AS3 KD could result from secondary consequences of the changes in HOX gene expression or from additional trans-acting functions of these lncRNAs (see Discussion).
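The adjusted P values quoted above come from multiple-testing correction. As a minimal sketch, the Benjamini-Hochberg procedure, which DESeq2 applies by default, can be written in a few lines. Whether the authors kept that default is an assumption, as the text does not say, and DESeq2's independent filtering step is not modeled. The function name and the raw P values are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes (not data from the study).
raw = [0.001, 0.02, 0.03, 0.5]
adj = benjamini_hochberg(raw)
significant = [a < 0.05 for a in adj]
```

In this toy input the first three genes pass the adjusted P<0.05 threshold used for the asterisks in Fig 4D, and the fourth does not.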
HOXA-AS3 is localized in both the nucleus and cytoplasm of HT-29 cells
We next focused on HOXA-AS3 and characterized its precise expression pattern at higher resolution, as it is more narrowly expressed compared to HOXB-AS3 and also has a longer exonic sequence, which permits the use of the Stellaris smFISH protocol with 96 exonic probes for the human HOXA-AS3 and 94 for the mouse Hoxaas3 (S3 Table), whereas only 24 probes were possible for HOXB-AS3. We first analyzed the subcellular localization of HOXA-AS3 and HOXA5 in HT-29 cells (Fig 5A). We observed variable expression of both genes among cells: in some of the cells we could detect expression of only one of the transcripts, while others expressed both genes. The HOXA-AS3 transcript was detectable in just ~15% of the >100 imaged cells, in up to 3 foci per cell and with localization mainly in the nucleus, though it could also be detected in the cytoplasm. Interestingly, in some of the cells that express both HOXA-AS3 and HOXA5 we detected a rare yet highly specific co-localization in the perinuclear area (Fig 5C). As expected from their genomic co-location, HOXA-AS3 and HOXA5 are co-localized in what is likely their site of transcription in the nucleus (Fig 5B).
Fig 5. Single-molecule FISH detection of HOXA-AS3 and HOXA5 in HT-29 cells. (A) HOXA-AS3 (red) and HOXA5 (green) transcripts in a sample of HT-29 cells. Scale bar: 10 μm. (B) HOXA-AS3 and HOXA5 are co-localized at their presumed site of transcription. (C) HOXA-AS3 and HOXA5 are occasionally co-localized in the perinuclear area (white arrow). (D) HOXA-AS3 and HOXA5 are occasionally expressed separately.
https://doi.org/10.1371/journal.pgen.1009681.g005
HOXA-AS3 is expressed in a specific subset of colon epithelial cells
As HT-29 cells contain a mixture of cellular states from the colon epithelium [28,29], HOXA-AS3 expression in a small subset of cells may imply that it is only found in a defined subpopulation of cells. We therefore analyzed the expression pattern of HOXA-AS3 and Hoxaas3 in normal intestinal epithelial cells, using single-cell RNA sequencing (scRNA-seq) data. In scRNA-seq data from the human colon, HOXA-AS3 was expressed predominantly in epithelial cells, and within those it was detected specifically in tuft cells and immature goblet cells, the latter being deep crypt goblet cells that are part of the stem cell niche [36] (Fig 6A). Similarly, in the mouse small intestine [37], Hoxaas3 is mainly expressed in tuft cells at expression levels comparable to those of the tuft marker Dclk1 (Fig 6B). In contrast, in the mouse colon scRNA-seq data Hoxaas3 is mainly detected in goblet cells (Fig 6C). In order to examine expression in intact tissue, we performed smFISH for Hoxaas3 in the jejunum of the mouse small intestine, which contains a relatively high fraction of goblet cells, and compared it to smFISH of the goblet cell marker Gob5, the tuft cell marker Dclk1, and Atoh1, which marks intestinal secretory precursor cells, including immature goblet and tuft cells. Based on the marker expression and the positions of the cells, we conclude that Hoxaas3 is expressed in the early immature goblet cells and in the secretory precursor cells (Fig 6D). Hoxaas3 and Hoxa5 were occasionally co-localized, similar to the observations in HT-29 cells (Fig 6D).
Fig 6. Expression of HOXA-AS3 in the human and mouse gut. (A) Expression of HOXA-AS3 in single cells of the human colon (data from [62]). (B-C) Expression of the indicated genes in scRNA-seq from the mouse small intestine (B) and colon (C). Data from [37]. (D) smFISH of Hoxaas3, Hoxa5, Gob5 and Atoh1 expression in the mouse intestine. Scale bar: 10 μm. Arrows indicate a subset of RNA molecules detected in the images.
https://doi.org/10.1371/journal.pgen.1009681.g006
scRNA-seq and smFISH data from both human and mouse samples thus support the notion that HOXA-AS3 is expressed in a specific subpopulation of cells, which may explain the apparently variable expression pattern that we observed in HT-29 cells.
HOXA-AS3 and HOXB-AS3 are induced during early differentiation of human embryonic stem cells towards endoderm
As both HOXA-AS3 and HOXB-AS3 were more highly expressed in embryonic stages compared to adult tissues, we next wanted to evaluate the expression and activities of HOXA-AS3 and HOXB-AS3 during early developmental transitions. Endoderm is one of the three primary germ cell layers, and endoderm patterning is controlled by a series of reciprocal interactions with nearby mesoderm tissues. As development proceeds, broad gene expression patterns within the foregut, midgut, and hindgut become progressively refined into precise domains from which specific organs will arise. Human embryonic stem cells (hESCs) can be robustly differentiated towards endodermal cell lineages using a protocol established by Loh et al. [38], resulting, within seven days, in three different populations: anterior foregut (AFG), posterior foregut (PFG) and midgut/hindgut (MHG) (S8A–S8C Fig). During this differentiation process, a graded, spatially collinear Hox gene expression is observed after in-vitro patterning, whereby PFG cells express 3’ anterior Hox genes (e.g. HOXA1) and MHG cells express 5’ posterior Hox genes (including HOXA10) [38] (S8A Fig). Pluripotent hESCs and cells from each stage of the differentiation were validated by multiple markers (S4 Table) using qRT-PCR (S8D Fig) and by immunostaining (S8E Fig). Matching the expression patterns observed in the RNA-seq data from [38] (Fig 7A), HOXA-AS3 and HOXB-AS3 were strongly induced and expressed only in the MHG population, alongside their adjacent HOX-6 and HOX-7 genes, whereas HOXA5 and HOXB5 were also expressed in PFG cells (Fig 7A).
Fig 7. Function of HOXA-AS3 and HOXB-AS3 during endodermal differentiation of hESCs. (A) Read coverage in RNA-seq data from [38] for shown parts of the HOXA (top) and HOXB (bottom) clusters. In each cluster all the tracks are normalized together. (B-C) Expression levels estimated by qRT-PCR for the indicated genes in hESCs following CRISPRa-mediated 48h activation of HOXA-AS3 (n = 7/4) (B) and HOXB-AS3 (n = 3) (C). (D-E) Expression levels estimated by qRT-PCR in MHG cells following CRISPRi-mediated repression of HOXA-AS3 (n = 3) (D) and HOXB-AS3 (n = 3) (E). (F) Changes in expression of the indicated genes following infection of hESCs with two separate HOXA-AS3 shRNAs, followed by differentiation to MHG. n = 6. *: P<0.05; **: P<0.005; ***: P<0.005. Two-sided t-test. Error bars: SEM.
https://doi.org/10.1371/journal.pgen.1009681.g007
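The qRT-PCR panels in Figs 4 and 7 report expression relative to a control, with SEM error bars and, in Fig 4, normalization to Actin. The exact quantification procedure is not given in this text. A common convention is the 2^-ΔΔCt method, and the sketch below assumes it purely for illustration. All Ct values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def delta_ct(ct_target, ct_reference):
    """Per-replicate ΔCt: target gene Ct minus reference gene (e.g. Actin) Ct."""
    return [t - r for t, r in zip(ct_target, ct_reference)]


def fold_changes(treated_dct, control_dct):
    """2^-ΔΔCt for each treated replicate, relative to the mean control ΔCt."""
    baseline = mean(control_dct)
    return [2 ** -(d - baseline) for d in treated_dct]


def sem(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


# Hypothetical Ct values for three replicates (not data from the study).
control = delta_ct([25.0, 25.0, 25.0], [18.0, 18.0, 18.0])  # ΔCt = 7, 7, 7
treated = delta_ct([26.0, 27.0, 26.0], [18.0, 18.0, 18.0])  # ΔCt = 8, 9, 8

fc = fold_changes(treated, control)  # relative expression per replicate
fc_mean, fc_sem = mean(fc), sem(fc)
```

A higher Ct in the treated samples corresponds to lower expression, so every value in `fc` falls below 1 here, which is the direction expected for a knockdown.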
[1] Url:
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1009681