(C) PLOS One
This story was originally published by PLOS One and is unaltered.
Single-nucleus RNA-sequencing in pre-cellularization Drosophila melanogaster embryos [1]
Authors: Ashley R. Albright (Department of Biochemistry and Biophysics, University of California, San Francisco, California, United States of America); Michael R. Stadler (Department of Molecular and Cell Biology)
Date: 2022-07
Abstract
Our current understanding of the regulation of gene expression in the early Drosophila melanogaster embryo comes from observations of a few genes at a time, as with in situ hybridizations, or from observations of gene expression levels without regard to patterning, as with RNA-sequencing. Single-nucleus RNA-sequencing, however, has the potential to provide new insights into the regulation of gene expression for many genes at once while retaining information about the position of each nucleus prior to dissociation, based on patterned gene expression. To establish the use of single-nucleus RNA-sequencing in Drosophila embryos prior to cellularization, we examine gene expression in control embryos and in embryos maternally null for the insulator protein dCTCF during zygotic genome activation at nuclear cycle 14. We find that early embryonic nuclei can be grouped into distinct clusters according to gene expression. From both virtual and published in situ hybridizations, we also find that these clusters correspond to spatial regions of the embryo. Lastly, we provide a resource of candidate differentially expressed genes that might show local changes in gene expression between control and maternal dCTCF null nuclei with no detectable differential expression in bulk. These results highlight the potential for single-nucleus RNA-sequencing to reveal new insights into the regulation of gene expression in the early Drosophila melanogaster embryo.
Citation: Albright AR, Stadler MR, Eisen MB (2022) Single-nucleus RNA-sequencing in pre-cellularization Drosophila melanogaster embryos. PLoS ONE 17(6): e0270471. https://doi.org/10.1371/journal.pone.0270471
Editor: Anton Wutz, Eidgenossische Technische Hochschule Zurich, SWITZERLAND
Received: January 12, 2022; Accepted: June 10, 2022; Published: June 24, 2022
Copyright: © 2022 Albright et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: Raw sequencing files are available on Data Dryad: https://doi.org/10.6078/D13D9R. All code is available here: https://github.com/aralbright/2021_AAMSME.
Funding: ARA was supported by an NIH Training Grant (T21 GM 007127) and the National Science Foundation Graduate Research Fellowship Program. MRS was supported by an American Cancer Society postdoctoral fellowship (126730-PF-14-256-01-DDC). The work was also supported by a Howard Hughes Medical Institute Investigator award to MBE. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: MBE is a founder and former member of the board of directors of PLOS. This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Introduction
Early animal development is largely driven by maternally deposited RNAs and proteins. In Drosophila melanogaster, zygotic gene expression is detected as early as the 10th nuclear cycle; however, zygotic genome activation primarily occurs during the 14th nuclear cycle, while maternal RNAs are degraded and cellularization begins [1, 2]. Much of the difficulty in understanding the regulation of early embryonic gene expression lies in the challenge of simultaneously capturing expression level and patterning. Classic examples of patterned gene expression and regulation originate from in situ hybridizations [3–6]; however, in situ hybridizations do not allow the study of many genes at once. To fully understand the regulation of gene expression across the genome, it is imperative that we establish new methods to examine changes in spatially patterned genes. Prior work from our lab demonstrated the use of RNA sequencing in patterning mutants following cryosectioning of embryos along the anterior-posterior axis [7]. This approach benefits from knowing the origin of each slice during analysis; however, it cannot resolve where the RNAs originated, because many nuclei contribute to expression within each slice. Recent work from Karaiskos, Wahle, et al (2017) demonstrated the use of single-cell RNA-sequencing in the early Drosophila embryo and the ability to construct virtual in situ hybridizations from prior knowledge of patterned gene expression [8]. Others have used single-cell RNA-sequencing in dorsoventral mutant embryos and showed depletion of an entire subset of cells [9]. These studies demonstrate the potential for single-cell RNA-sequencing to answer questions relating to pattern and body axis formation in the early Drosophila embryo; however, whether single-cell RNA-sequencing is sensitive enough to detect subtle changes in gene expression in mutant embryos lacking major defects remains unclear.
To establish the use of single-nucleus RNA-sequencing in the early Drosophila embryo, we examined gene expression in nuclei from control embryos and from maternal null dCTCF embryos, subsequently referred to as dCTCFmat-/-. Insulator elements were first described for their enhancer-blocking activity [10–12], and have since been shown to affect genome and chromosome structure as well [13–18]. Interestingly, CTCF serves as the only insulator protein in mammals; however, Drosophila and other arthropods have evolved several insulator proteins [19, 20]. The redundancy of Drosophila insulator proteins allows us to study the many functions of insulators without causing cell lethality. Intriguingly, however, dCTCF is not actually required for embryonic viability [21]. Previous reports indicate that loss of individual Drosophila insulator proteins yields minimal changes in gene expression [19, 22–25], but others show that dCTCF is required for correct expression of certain genes observed by in situ hybridizations in embryos and larvae [26, 27]. The observed changes are slight, however, which may explain why large-scale defects in transcription are not observed with RNA-sequencing in flies lacking dCTCF. Using 10x Genomics, we assayed gene expression across over 8,000 nuclei from control and dCTCFmat-/- embryos. Overall, the nuclei tend to cluster according to expression of spatially patterned genes, indicating that the nuclei retain information regarding their position in the embryo prior to dissociation. This allows us to use sequencing to examine genome-wide expression in spatial regions of embryos prior to cellularization, which was previously possible only by slicing embryos [7, 13]. As expected considering the viability of dCTCFmat-/- embryos, we found fewer differentially expressed genes in bulk than in individual clusters. We also found several candidate patterned genes that may be differentially expressed in certain clusters but not in bulk.
Our analyses are available in a reproducible and usable format (see Data and code availability), allowing others to explore our data analysis as well as analyze other genes of interest not explored here. Altogether, this work establishes the use of single-nucleus RNA-sequencing in the early Drosophila embryo to detect subtle changes in gene expression and provides a resource to explore candidate locally differentially expressed genes upon loss of maternal dCTCF.
Methods
CRISPR
Maternal dCTCF nulls were created by using CRISPR mutagenesis to insert a dsRed protein followed by two consecutive stop codons immediately upstream of the dCTCF open reading frame. The homologous replacement template plasmid was constructed using a pUC19 backbone and ~1 kb homology arms generated by PCR (5’ homology arm primers: CCACAAAGAAACGTTAGCTAGTTCC and TCCTATGGACAAATTGGATTTGTTTTGG, 3’ homology arm primers: CCAAGGAGGACAAAAAAGGACGAG and CGTGAGTGGCGCGTGATC). The repair template was coinjected into Cas9-expressing embryos (Rainbow Transgenic Flies, Camarillo, California), along with two guide RNAs (ATTTGTCCATAGGAATGCCA, TGTCCATAGGAATGCCAAGG) expressed from a U6:3 promoter on a modified version of the pCFD3 plasmid [28]. Resulting flies were crossed to flies carrying chromosome 3 balancer chromosomes and screened by genotyping PCR. Putative hits were further screened by PCR and sequencing of the entire locus using primers outside the homology arms (CATTAGAATTCAAGGGCCATCAG and CACTTGAAGGATGGCTCG). A successful insertion line was recombined with an FRT site on chromosome 3L at cytosite 80B1 (Bloomington stock # BL1997).
Fly husbandry
All stocks were fed standard Bloomington food from LabExpress and maintained at room temperature unless otherwise noted. We used the FLP-DFS (dominant female sterile) technique [29] to generate dCTCFmat-/- embryos. First, we crossed virgin hsFLP, w*;; Gl*/TM3 females to w*;; ovoD, FRT2A(mw)/TM3 males (Bloomington Drosophila Stock Center ID: 2139). From this cross, we selected hsFLP,w*;; ovoD, FRT2A(mw)/TM3 males and crossed them to virgin CTCF*,FRT2A/TM3 females. Larvae from this cross were heat-shocked on days 4, 5, and 6 for at least two hours in a water bath at 37°C. Upon hatching, virgin hsFLP, w*/+; CTCF*, FRT2A(mw)/ovoD*, FRT2A(mw) females were placed into a small cage with their male siblings. Flies were fed every day with yeast paste (dry yeast pellets and water) spread onto apple juice agar plates.
These crosses were conducted simultaneously with crosses for another insulator protein, and control embryos were collected from the ovoD line used to generate those germline clones (Bloomington Drosophila Stock Center ID: 2149).
Western blots
Flies laid eggs on grape-agar plates for two hours, and embryos were either aged two hours at room temperature or taken directly after collection. Embryos were dechorionated with bleach, rinsed, and frozen in aliquots of ~25 embryos at -80°C. Embryos were homogenized in 25 μl RIPA buffer (Sigma cat # R0278) supplemented with 1 mM DTT and protease inhibitors (Sigma cat # 4693116001) using a plastic pestle. After homogenization, samples were mixed with 25 μl 2x Laemmli buffer (Bio-Rad # 1610737EDU), boiled for 3 minutes, and spun at 21,000 x g for 1 minute. Samples were loaded onto Bio-Rad mini Protean TGX 4–20% gels (# 4561096) and run at 200 V for 30 minutes. Protein was transferred at 350 mA for one hour to Immobilon PVDF membrane (Millipore-Sigma # IPVH00010). Blots were blocked for one hour in PBST (1x PBS with 0.1% Tween) with 5% nonfat milk, and stained with primary antibodies (courtesy of Maria Cristina Gambetta [27], 1:1000 in PBST with 3% BSA) for one hour. Blots were then washed 3 times for 3 minutes with rotation in PBST and probed with an HRP-conjugated anti-Rabbit secondary antibody (Rockland Trueblot, # 18-8816-33, 1:1000 in PBST with 5% milk) for one hour. After extensive washing with PBST, blots were developed with Clarity ECL reagents (Bio-Rad # 1705060) and imaged. Validation of the loss of maternal dCTCF is shown in S1 Fig.
Nuclear isolation and sequencing
Nuclei were isolated from early to mid-nuclear cycle 14 embryos (stage 5) according to several previously published works [30–32]. First, the cages were cleared for 30 minutes to 1 hour to remove embryos retained by the mothers overnight, followed by a 2 hour collection and 2 hour aging.
Then, the embryos were dechorionated in 100% bleach for 1 minute, or until most of the embryos were floating, with regular agitation by a paintbrush. The embryos were transferred to a collection basket made of a 50 mL conical tube and mesh. After being rinsed with water, the embryos were transferred into an Eppendorf tube containing 0.5% PBS-Tween. From this point forward, samples were kept on ice to prevent further aging of embryos. A minimum of 9 early to mid-nuclear cycle 14 embryos were sorted using an inverted compound light microscope and transferred to a 2 mL dounce containing 600 μL of lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 1% Bovine Serum Albumin, 1% RNase Inhibitor (Enzymatics, Part Num. Y9240L)) + 0.1% IGEPAL. The embryos were homogenized 20 times with a loose pestle and 10 times with a tight pestle. Pestles were rinsed with 100 μL lysis buffer + 0.1% IGEPAL after use. The resulting 800 μL of buffer and nuclei were transferred into an Eppendorf tube and filtered with a 40 μm filter. Nuclei were pelleted by spinning for 5 minutes at 900 g and 4°C, washed in 500 μL lysis buffer (without 0.1% IGEPAL), and pelleted again before being resuspended in 20 μL lysis buffer (without 0.1% IGEPAL). Nuclei concentration was then adjusted to 1000 nuclei per μL, and the nuclei were barcoded with the 10X Chromium Single Cell 3’ Gene Expression Kit (v3). Control and dCTCFmat-/- nuclei were processed on separate days, then sequenced together on the Illumina NovaSeq (SP flow cell).
Data processing and analysis
We used kallisto-bustools [33] to generate a custom reference index and a nucleus x gene matrix. The data were analyzed in both Python and R, primarily using scVI via scvi-tools [34, 35], scanpy [36], and custom scripts.
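To make the nucleus x gene matrix and the filtering criteria described in this section more concrete, the following is a minimal sketch in plain Python. It is not the authors' code: the toy counts, the barcodes, the `mt:` prefix used to flag mitochondrial genes, and all function names are assumptions made for illustration, and the actual analysis relied on kallisto-bustools output processed with scanpy and scvi-tools. Only the threshold values come from the text.

```python
# Toy nucleus x gene count matrix: one row per nucleus barcode, one column per gene.
# All counts, barcodes, and gene choices are made up for illustration.
GENES = ["eve", "ftz", "sna", "mt:CoI"]
COUNTS = {
    "AAAC": [120, 80, 40, 5],
    "AAAG": [0, 1, 0, 0],
    "AACT": [30, 10, 20, 40],
}


def qc_metrics(row, genes, mito_prefix="mt:"):
    """Return (total UMIs, genes detected, mitochondrial fraction) for one nucleus."""
    total = sum(row)
    n_genes = sum(1 for c in row if c > 0)
    mito = sum(c for c, g in zip(row, genes) if g.startswith(mito_prefix))
    mito_frac = mito / total if total else 0.0
    return total, n_genes, mito_frac


def top_by_umis(counts, n_expected=10_000):
    """Step 1: rank nuclei by UMI count and keep at most the expected number."""
    ranked = sorted(counts, key=lambda b: sum(counts[b]), reverse=True)
    return {b: counts[b] for b in ranked[:n_expected]}


def filter_nuclei(counts, genes, min_genes=200, max_mito_frac=0.05, max_umis=50_000):
    """Steps 2-4: apply gene-count, mitochondrial, and UMI-count thresholds."""
    kept = {}
    for barcode, row in counts.items():
        total, n_genes, mito_frac = qc_metrics(row, genes)
        if n_genes >= min_genes and mito_frac <= max_mito_frac and total <= max_umis:
            kept[barcode] = row
    return kept


def filter_genes(counts, genes, min_nuclei=3):
    """Step 5: drop genes detected in fewer than `min_nuclei` nuclei."""
    keep_idx = [
        j for j in range(len(genes))
        if sum(1 for row in counts.values() if row[j] > 0) >= min_nuclei
    ]
    kept_genes = [genes[j] for j in keep_idx]
    kept_counts = {b: [row[j] for j in keep_idx] for b, row in counts.items()}
    return kept_counts, kept_genes


# The toy matrix has only four genes, so min_genes is lowered here;
# the threshold stated in the text is 200 expressed genes.
passing = filter_nuclei(top_by_umis(COUNTS), GENES, min_genes=3)
print(sorted(passing))
```

In this toy example only the first barcode passes: the second has too few detected genes and the third exceeds the 5% mitochondrial fraction. Keeping each criterion in its own function mirrors the numbered steps in the text and makes it easy to change one threshold at a time.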
Control and dCTCFmat-/- nuclei were filtered separately as follows: (1) nuclei were ranked by the number of UMIs detected, and those ranked below the expected number of nuclei (10,000) were removed; (2) nuclei with fewer than 200 expressed genes were removed; (3) nuclei with greater than 5% mitochondrial expression were removed; (4) nuclei with greater than 50,000 UMI counts were removed; (5) genes expressed in fewer than 3 nuclei were removed. Prior to batch correction, the data were subset to the 6000 most highly variable genes using scanpy’s highly variable gene selection, based on log1p normalized expression. We ran scVI with gene_likelihood = 'nb' to correct for batch effects. The nuclei were clustered using the Leiden algorithm [37] within scanpy and visualized on a 2D UMAP [38]. Prior to batch correction, nuclei were clustered on log1p normalized gene expression. After batch correction, nuclei were clustered on the latent space derived from the scVI model. Marker genes representing each cluster were found using the sc.tl.rank_genes_groups function from scanpy with the Wilcoxon rank-sum test. In situ hybridizations of representative marker genes were obtained from the Berkeley Drosophila Genome Project [39–41]. Colors representing Leiden clusters were projected onto a virtual embryo using novoSpaRc [42, 43]. Log2 fold change and associated p-values were obtained for each gene using diffxpy (
https://diffxpy.readthedocs.io/). Statistically significant differential expression was determined following Bonferroni correction of the p-values, filtering for an adjusted p-value less than 0.05 and an absolute log2 fold change greater than or equal to 1.5. Intersecting sets of differentially expressed genes were found and visualized with an UpSet plot [44, 45], following correction of adjusted p-values for the number of comparisons (multiplied by 11; 10 for the total number of clusters + 1 to include bulk differential expression).
Data and code availability
Raw sequencing data and .h5ad files are available on Data Dryad:
https://doi.org/10.6078/D13D9R. Much of our analysis originated from work by Booeshaghi and Pachter (2020) [46] and Chari et al (2021) [47], with the addition of custom scripts. All of the code used in the analysis and in generating the figures is available here:
https://github.com/aralbright/2021_AAMSME. Single-nucleus data pre-processing, batch correction and clustering, virtual in situ hybridization, and differential expression analyses are available in this GitHub repository as Google Colab notebooks. These notebooks are available for anyone to run from a web browser with the option to enter any genes of interest not discussed in this manuscript.
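The significance criteria described under Data processing and analysis (Bonferroni-adjusted p-value below 0.05, absolute log2 fold change of at least 1.5, and a further multiplication by 11 when comparing across clusters and bulk) can be sketched as follows. This is an illustrative reading of the text, not the authors' implementation, which used diffxpy. The gene names and values are hypothetical, and taking the Bonferroni factor to be the number of genes tested in one comparison is an assumption.

```python
def bonferroni(pvals):
    """Bonferroni-adjust p-values: multiply by the number of tests, cap at 1."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]


def significant_genes(results, n_comparisons=1, alpha=0.05, min_abs_log2fc=1.5):
    """Call differentially expressed genes for one comparison.

    results: list of (gene, log2_fold_change, raw_p_value) tuples.
    n_comparisons: extra multiplier for the number of comparisons
        (11 in the text: 10 clusters + 1 bulk comparison).
    """
    adjusted = bonferroni([p for _, _, p in results])
    hits = []
    for (gene, log2fc, _), p_adj in zip(results, adjusted):
        p_final = min(1.0, p_adj * n_comparisons)
        if p_final < alpha and abs(log2fc) >= min_abs_log2fc:
            hits.append(gene)
    return hits


# Hypothetical results for a single cluster (names and numbers are made up).
toy_results = [
    ("geneA", 2.0, 1e-6),   # strong effect, very small p-value
    ("geneB", -1.8, 2e-3),  # passes alone, fails after the x11 correction
    ("geneC", 0.4, 1e-9),   # significant p-value but small fold change
    ("geneD", 1.5, 0.02),   # large enough fold change but p-value too high
]

print(significant_genes(toy_results))
print(significant_genes(toy_results, n_comparisons=11))
```

With these toy values, two genes pass when the cluster is considered on its own, and only one remains after the additional correction for 11 comparisons, which shows how the second correction tightens the set of genes entering the intersection analysis.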
Discussion
We conducted the above analyses to determine whether single-nucleus RNA-sequencing can be used to understand the regulation of gene expression in the early Drosophila embryo. First, we show that nuclei can be grouped into clusters with distinct gene expression. Then, we show that representative marker genes from the majority of the clusters recapitulate known patterns of expression. Importantly, we also present examples of potential differential expression of patterning genes upon loss of maternal dCTCF that appear in individual clusters but not in bulk. Prior to this work, studies of the regulation of patterned gene expression in a spatial context included cytoplasmic RNAs in measures of expression. We must acknowledge the caveat that we do not know the extent to which maternal RNAs enter the nucleus, and some of our results may reflect the presence of both maternal and zygotic RNAs. Nonetheless, we believe that single-nucleus RNA-sequencing is better suited than bulk RNA-sequencing to understanding changes in gene expression in pre-cellularization embryos upon the loss of important developmental factors, because it can resolve local changes in expression. Supporting this notion, single-cell RNA-sequencing has already been shown to resolve the loss of an entire cell fate in cellularized dorsoventral mutant embryos [9]. Whether the changes in gene expression that we observed have implications for embryonic development related to the loss of dCTCF is unclear without further investigation, such as single-molecule FISH to validate the observed changes in expression of particular RNAs. Ultimately, using single-nucleus RNA-sequencing to examine changes in gene expression upon the loss of important developmental factors has the potential to uncover perturbation responses previously undetected by bulk RNA-sequencing.
Acknowledgments
We are grateful to Maria Cristina Gambetta for generously sharing Cp190 and dCTCF antibodies. We thank Dr. Justin Choi and the UC Berkeley Functional Genomics Laboratory, as well as the Vincent Coates Sequencing Laboratory, for processing our samples. We would also like to thank Sina Booeshaghi, Tara Chari, and Lior Pachter of the Pachter Lab at Caltech for their feedback and discussions on single-nucleus RNA-sequencing analysis. We thank Colleen Hannon, Marc Singleton, and all members of the Eisen lab for helpful discussions and advice throughout preparation of the manuscript.
---
[1] Url:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0270471
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.