(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .



Histologically resolved multiomics enables precise molecular profiling of human intratumor heterogeneity [1]

Authors and affiliations (as listed): Tao Chen, Chen Cao, Jianyun Zhang; Biomedical Pioneering Innovation Center (Biopic), School of Life Sciences, Peking University, Beijing; College of Engineering; Department of Oral Pathology

Date: 2022-07

We then quantified the similarity between the morphological information retained in SRS and HE images using the histogram of oriented gradients (HOG) [31]. The HOG describes the texture of an image based on intensity variation, which primarily reflects cellular packing in our images. Unsupervised clustering of the HOG features from all 8 images presented in Fig 1B successfully grouped images by tissue type, regardless of imaging modality (Fig 1C). This analysis indicates that SRS images recovered cellular organization patterns similar to those in HE images. In addition to distinguishing different types of tissue, the ability to distinguish cancerous epithelium from normal epithelium has potential clinical implications. The cellular organization patterns of these tissue states are distinct and can be decoded by HOG features. Unsupervised clustering of the HOG features of 32 SRS images, including 16 cancerous and 16 normal epithelia (S3 Fig), stratified tissue images into 2 groups, accurately reflecting disease state with only 1 misidentification (Fig 1D). Furthermore, the correlation matrix of all 32 samples’ HOG features revealed a weaker correlation among cancers than among normal epithelia (S4 Fig), implying greater variation in cellular organization patterns between individual cancer nests. This heterogeneity could be related to the distinct microenvironments of the cancer nests. For example, the cancer nests in P4 invaded heavily into muscle tissue, while in other patient samples the cancer nests were typically located in connective tissue (S5 Fig). We examined morphology restoration at different scales on a stitched image of a whole cryo-section (9 × 4 mm²). Compared with the corresponding HE images, SRS revealed identical features at all scales. Selected ROIs from SRS images can be localized in HE images through correlation of their respective HOG features (Figs 1E and S6). The key morphological features were identified in the large-scale SRS image depicted in Fig 1E, including normal epithelium, dysplasia, cancer, and the keratin pearls that correlate with OSCC differentiation.
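
As a rough illustration of the HOG comparison described above, the following Python sketch computes a simplified, whole-image HOG descriptor and groups images by the correlation of their descriptors. It is not the pipeline used in this study: the function names (hog_descriptor, pearson, agglomerative_clusters, stripes), the 9-bin histogram, the average-linkage rule, and the synthetic striped test images are all assumptions made for illustration. A standard HOG additionally uses local cells and block normalization, which are omitted here.

```python
import math

def hog_descriptor(image, n_bins=9):
    """Return an L2-normalized, magnitude-weighted histogram of unsigned
    gradient orientations for a 2-D grayscale image given as a list of rows."""
    hist = [0.0] * n_bins
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            gx = image[y][x + 1] - image[y][x - 1]  # central differences
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold the direction into [0, 180) degrees; rounding guards against
            # floating-point values that land just below 180.
            angle = round(math.degrees(math.atan2(gy, gx)), 6) % 180.0
            hist[min(int(angle / 180.0 * n_bins), n_bins - 1)] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist

def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def agglomerative_clusters(descriptors, n_clusters=2):
    """Average-linkage clustering on the distance 1 - Pearson correlation.
    Returns lists of descriptor indices, one list per cluster."""
    clusters = [[i] for i in range(len(descriptors))]

    def distance(a, b):
        pairs = [(i, j) for i in a for j in b]
        total = sum(1.0 - pearson(descriptors[i], descriptors[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        _, p, q = min((distance(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]

def stripes(size, vertical, phase=0):
    """Synthetic 2-pixel-wide stripe pattern, a stand-in for real tissue images."""
    return [[(((x if vertical else y) + phase) // 2) % 2 for x in range(size)]
            for y in range(size)]

images = [stripes(8, True), stripes(8, False), stripes(8, True, 1), stripes(8, False, 1)]
descriptors = [hog_descriptor(image) for image in images]
groups = agglomerative_clusters(descriptors, n_clusters=2)
```

In this toy case the two vertically striped images and the two horizontally striped images fall into separate groups, mirroring how images with similar cellular packing would be expected to share similar orientation histograms.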

The recent and rapid development of SRS-based histology [14–17] has proven this technique to be a powerful supplement to traditional histological staining for many tissues. We first examined the ability of SRS imaging to reconstruct, in unstained cryo-sections of OSCC, histological features that corroborate information obtained with hematoxylin–eosin (HE) staining of OSCC sections. We generated 2-color SRS composite images of 30 μm cryo-sections from biopsies of various tissues; these images provide contrast between lipid-rich and protein-rich structures by exciting the CH₂ symmetric vibration and the CH₃ vibrations, respectively (Materials and methods, S1 Fig). To validate the efficacy of SRS-based histology for OSCC, we then compared these reconstructed images to adjacent 5 μm cryo-sections that were sampled from the same biopsies and conventionally stained with HE. The lipid–protein contrast in the SRS images recovered the characteristic morphological features of each tissue, including the variation of cell shape and the basement membrane in the epithelium, the epithelial cells forming the duct wall in the gland tissue [28], and the nerve bundles (Fig 1B). In addition to recapitulating morphological information, SRS images provide information about the chemical composition of the tissue samples. For example, both the muscles and the perineurium, a collagen-rich structure [29,30] around the nerve, revealed high protein content in SRS images (Fig 1B). Variability in the chemical composition of ROIs can be further evaluated by the histogram of the protein-to-lipid ratio (PLR) from different tissue types, reflecting a distinct molecular signature of the cells within the tissue. To confirm that the histopathological utility of SRS was not limited to OSCC, we also identified characteristic features of 2 other oral cavity diseases, Warthin’s tumor and mucoepidermoid carcinoma (S2 Fig). These results demonstrate the generalizability of SRS imaging for oral cavity histology.
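
To make the PLR histogram concrete, the sketch below computes a per-pixel ratio from two channel images and bins the result. The exact PLR definition is given in the Materials and methods rather than here, so this example assumes the simplest form: protein-channel intensity divided by lipid-channel intensity, skipping pixels without lipid signal. The function names, the input layout (lists of rows), and the bin range are illustrative assumptions.

```python
def protein_to_lipid_ratios(protein, lipid):
    """Per-pixel protein/lipid intensity ratio for two equally sized channel
    images (lists of rows). Pixels with no lipid signal are skipped.
    Assumption: PLR is a plain intensity ratio of the two channels."""
    ratios = []
    for protein_row, lipid_row in zip(protein, lipid):
        for p, l in zip(protein_row, lipid_row):
            if l > 0:
                ratios.append(p / l)
    return ratios

def histogram(values, n_bins, lo, hi):
    """Count values in n_bins equal-width bins spanning [lo, hi].
    Values outside the range are ignored; hi itself goes in the last bin."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if v == hi:
            counts[-1] += 1
        elif lo <= v < hi:
            counts[int((v - lo) / width)] += 1
    return counts

# Toy 2 x 2 channels (arbitrary intensities, not measured data).
protein_channel = [[2, 4], [1, 0]]
lipid_channel = [[1, 1], [2, 0]]
plr = protein_to_lipid_ratios(protein_channel, lipid_channel)
plr_histogram = histogram(plr, n_bins=4, lo=0, hi=4)
```

Comparing such histograms between ROIs from different tissue types is one simple way to summarize differences in chemical composition.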

Principal component analysis (PCA) separated microsamples into 4 groups of similar gene expression profiles that recapitulated the known tissue types (Fig 2D). Unsupervised hierarchical clustering of microsamples using the top 217 differentially expressed genes (Materials and methods) resolved these groups as distinct tissue types, 3 of which were then annotated using a panel of known marker genes for epithelium, gland, and muscle, in agreement with the corresponding tissue types identified by SRS histology (Figs 2D and S10). The fourth group, which was consistently identified as cancer in SRS images, showed high expression of OSCC marker genes including GSTP1, AKR1B10, FTH1, and FTL (S11 Fig). GSTP1 expression is known to be associated with high malignancy and poor survival rate [32,33]. Expression levels of GSTP1 were further validated with immunofluorescence staining, confirming the confinement of GSTP1 expression to cancer nests (S12 Fig). In contrast, epithelial-to-mesenchymal transition (EMT)-related genes, such as KRT13 (Fig 3A) [34], were expressed at lower levels, indicating an enhanced migratory capacity and invasiveness [35]. Although these OSCC samples originated from the epithelium, they showed an expression profile distinct from that of the healthy epithelium samples. Unexpectedly, P4S2E, which was identified as epithelium by SRS imaging, clustered with the cancer samples based on gene expression profiles (Fig 2E, red arrow). PCA also indicated that P4S2E was more similar to the cancer samples than to the epithelium samples (Fig 2D). This observation indicates that microsample P4S2E may have come from a region of the biopsy in which the epithelial cells were developing into carcinoma. This level of molecular characterization is enabled by joint histological profiling with SRS and gene expression analysis.

We applied SMD-seq to 13 cryo-sections from 4 patients (3 males, 1 female) of different ages with OSCC at various stages (Materials and methods). We collected 28 in situ microdissection samples for sequencing, of which 27 passed quality control based on mRNA recovery and were retained for further analysis (S1 to S3 Tables, Materials and methods). These microsamples included 12 normal (muscle, n = 5; gland, n = 2; and epithelium, n = 5) and 9 cancer samples for RNA-seq, and 13 normal (muscle, n = 5; gland, n = 2; and epithelium, n = 6) and 8 cancer samples for DNA-seq. RNA-seq recovered around 9,000 genes per microsample on average (FPKM > 0.1, Fig 2C) and showed highly correlated expression among samples of the same tissue, indicating technical reproducibility (S9 Fig).
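
The detection criterion above (FPKM > 0.1) can be illustrated with a short Python sketch. FPKM is the standard quantity: fragments per kilobase of transcript per million mapped fragments. The gene names, counts, and lengths below are invented for illustration, and the helper names fpkm and detected_genes are hypothetical rather than part of the study's pipeline.

```python
def fpkm(fragment_counts, gene_lengths_bp):
    """FPKM per gene: count * 1e9 / (gene length in bp * total mapped fragments)."""
    total = sum(fragment_counts.values())
    if total == 0:
        raise ValueError("no mapped fragments")
    return {gene: count * 1e9 / (gene_lengths_bp[gene] * total)
            for gene, count in fragment_counts.items()}

def detected_genes(fpkm_values, threshold=0.1):
    """Genes whose FPKM exceeds the threshold (0.1 in the text above)."""
    return sorted(gene for gene, value in fpkm_values.items() if value > threshold)

# Invented toy data: three hypothetical genes in one microsample.
counts = {"geneA": 500, "geneB": 500, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 2000, "geneC": 1500}
values = fpkm(counts, lengths)
n_detected = len(detected_genes(values))
```

Counting the genes that pass the threshold in each microsample gives the per-sample gene recovery figure of the kind reported in Fig 2C.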

After recovery, each microsample was lysed and divided into 2 equal aliquots for parallel DNA and mRNA extraction. One of the primary advantages of SMD-seq is the efficient recovery of high-quality mRNA, which is possible because SRS histology avoids fixation, staining, and other chemical perturbations that may degrade RNA or inhibit its recovery. Quantification of housekeeping genes showed that unstained cryo-sections can preserve over 20-fold more mRNA than HE-stained sections (S8 Fig). Another advantage of SMD-seq is that whole-genome and whole-transcriptome amplification enables multiomic analysis from a single ROI. Finally, SRS histology and LCM enable molecular profiling of subregions of a tissue section, so that the ensemble measurement averages over a smaller ROI and cellular heterogeneity is reduced. Compared with HNSCC samples collected for The Cancer Genome Atlas (TCGA), the microsamples measured with SMD-seq presented significantly lower cell-type diversity (P value = 0.0098, S7C–S7E Fig, Materials and methods). The roughly 100-μm resolution of ROI subselection in SMD-seq allows for profiling of intrasection heterogeneity; however, the lack of single-cell resolution within the microsamples means that some heterogeneity is still obscured.

In the SMD-seq workflow, after an ROI is identified with SRS histology, that region is immediately excised and recovered from the bulk tissue section for downstream genomic analysis. This is achieved by increasing the power of the pump line of the SRS excitation source and scanning the focal spot around a user-defined ROI boundary (Materials and methods). To validate the accuracy of ROI dissection, we performed HE staining on the remaining tissue section after dissection, as well as on adjacent tissue sections (Fig 2A). Because microdissection is performed with the scanning laser, the ROI is defined with high resolution (S7A Fig) and can be recovered immediately after imaging. Although the cell number varied with ROI size and tissue type, the dissected samples contained about 230 cells on average (S7B Fig). SRS imaging and dissection can be performed in rapid succession, allowing the dissection of multiple ROIs from a single tissue section (Fig 2B). Therefore, when profiling cancer regions, we paired each cancer microtissue with a normal epithelial microtissue from the same tissue section to control for intersample variability.

SMD-seq profiles molecular heterogeneity in OSCC

SMD-seq provides the ability to compare the genome and transcriptome of specific tissue types and disease states within a single tissue section, allowing for a more precise deconvolution of genetic and gene expression profiles than traditional bulk assays. While healthy tissue showed consistent gene expression across samples, OSCC microsamples revealed large patient-to-patient variability in gene expression. KLK8 and KRTDAP exhibited high expression only in patient P3 (Fig 3A). KLK8 is implicated in malignant progression of OSCC [36], and KRTDAP strongly correlates with the differentiation and maintenance of stratified epithelium. Correspondingly, “keratin pearls,” structures indicating a high degree of differentiation, were present in SRS images of P3 (S13 Fig) [37]. In addition to variability between patients, we found significant variability in gene expression between samples identified as cancer nests from the same patient. Specifically, in P4, the adjacent P4S1C and P4S2C microsamples exhibited similar gene expression; however, P4S3C, which was recovered from a location more distant from the previous 2 (S14 Fig), revealed a distinct gene expression profile including expression of CST1, which is associated with increased cell proliferation (Fig 3B). The intratumor molecular variation may indicate a branched evolution of tumorigenesis that could provide insight for tumor diagnosis and treatment.

Fig 3. Heterogeneity of gene expression. (a) Expression levels of genes in cancer samples among different patients, in comparison with epithelium. (b) The interpatient and intrapatient cancer heterogeneity. The bottom figure is a projection of 3 slides, showing the relative positions of different cancer nests. (c) Interpatient heterogeneity of gene fusion events. Source data for panel a in S2 Data Sheet 2.4. https://doi.org/10.1371/journal.pbio.3001699.g003

In order to characterize the cellular heterogeneity within microsamples, we aimed to deconvolve their composition using a published single-cell dataset that profiled transcriptomes of 6,000 single cells from 18 HNSCC patients [38]. We first embedded the RNA-seq data from our tumor microsamples into 20 principal components that capture gene expression variation in the single-cell data from Puram and colleagues (S15A Fig). Most of the cancer microsamples colocalized with the malignant epithelium cluster in the single-cell data, indicating similar gene expression profiles between our sections and individual malignant cells. However, 1 sample colocalized with the fibroblast and endothelial cell clusters. We investigated the cellular composition of these samples, which collectively contribute to their gene expression profile, by identifying the top 1,000 nearest neighbor cells from the Puram data. Then, for each sample, we calculated the percent of these top 1,000 cells that fell into each of the annotated cell clusters (S15B Fig). Five of our microsamples were composed primarily of malignant cells; however, the remaining 3 showed a more diverse composition including tumor resident fibroblasts and T cells. Interestingly, most of the nearest neighbor cells to sample P4S3C were indeed fibroblasts and endothelial cells.
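
The nearest-neighbor composition estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions: samples and single cells are already embedded in the same principal component space, Euclidean distance is used to rank neighbors, and the toy coordinates and labels are invented. The function name nearest_neighbor_composition is hypothetical; the study used 20 principal components and the top 1,000 neighbors, whereas the toy example uses 2 dimensions and a handful of cells.

```python
import math
from collections import Counter

def nearest_neighbor_composition(sample, cells, labels, k):
    """Percentage of each annotated cell type among the k single cells
    closest to a bulk sample in a shared embedding (Euclidean distance)."""
    if k > len(cells):
        raise ValueError("k exceeds the number of reference cells")
    order = sorted(range(len(cells)), key=lambda i: math.dist(sample, cells[i]))
    counts = Counter(labels[i] for i in order[:k])
    return {label: 100.0 * n / k for label, n in counts.items()}

# Invented toy embedding: three malignant cells and two fibroblasts.
reference_cells = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
reference_labels = ["malignant", "malignant", "malignant", "fibroblast", "fibroblast"]

pure_sample = nearest_neighbor_composition((0.2, 0.2), reference_cells, reference_labels, k=3)
mixed_sample = nearest_neighbor_composition((4.0, 4.0), reference_cells, reference_labels, k=4)
```

A sample whose neighbors are dominated by one cluster is interpreted as relatively pure, while a spread across clusters suggests a mixed cellular composition.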

Displacement and recombination of genes, especially oncogenes, have been recognized to drive neoplasia [39] and have become the focus of numerous cancer studies as potential therapeutic targets [40–42]. We therefore explored de novo identification of gene fusion events by sequencing these samples. Early fusion events are usually sporadic and hence hard to identify with bulk sequencing. Furthermore, DNA sequencing of small, low-input samples relies heavily on whole-genome amplification, which is prone to chimera formation and causes false identification of fusion events. We used RNA-seq of microsamples for de novo identification of gene fusion transcripts to investigate gene fusion events between patients and within tumor samples. Some fusion-related genes were shared between microsamples from separate patients, including KRT6A, which was shared between P1S2C and P3S5C, and FAM102A, which was shared between P1S3C, P3S5C, and P4S2C. Some samples, however, contained unique gene fusion patterns (Fig 3C, S1 Data 1.3 to 1.8). Most of these fusions might be passenger events that arose during cancer development, and their actual consequences remain unknown. Among them, we found several that are consistent with previous observations, including 1 recorded in TCGA (MYH9 and KRT14, from P2) and 2 involving the oncogenes AKT3 (fused with LRRC45, from P3) and MAFB (fused with SAC3D1, from P3) (S16 Fig, S2 Table, S3 Data). Sanger sequencing of amplified fragments harboring the junctions of the fused genes confirmed the fusions between MYH9 (5′ fusion partner: exon 20) and KRT14 (3′ fusion partner: exon 8) and between AKT3 (5′ fusion partner: UTR) and LRRC45 (3′ fusion partner: intron), supporting the capability of SMD-seq to discover recurrent fusion events. Fusion events also revealed substantial intrasample heterogeneity (S17 Fig). For example, more fused genes were found in P4S3C than in P4S1C and P4S2C, which were dissected from distant regions of the tissue section, indicating spatially dependent genome instability during tumorigenesis. Moreover, we also observed an oncogene-involved fusion (RAB3D and MTMR14, verified by Sanger sequencing, S3 Data) in sample P4S2E (S18 Fig), which was identified as healthy epithelium by HE and SRS analyses, again indicating the possibility that this microsample was progressing through the early stages of tumorigenesis.

In parallel to RNA-seq, DNA-seq was performed using half of the lysate from each microsample, enabling the analysis of heterogeneity at the whole-genome level. Copy number alteration (CNA) analysis demonstrated unique patterns of CNAs between cancer and normal samples, and between patients (Figs 4A and S19), indicating the high complexity of OSCC, including significant genetic mosaicism and genetic heterogeneity. We subsampled reads to characterize library complexity and read depth, demonstrating that 0.1× depth was sufficient for accurate and reproducible CNA detection (S20 Fig). A few large-scale ploidy shifts, such as the losses of 3p and 8p and the gains of 3q and 8q, which are also found in other squamous cell carcinomas [43,44], were observed in our OSCC samples. This analysis also revealed patient-specific CNAs; for example, chromosome 6 showed high instability in P2’s cancer sample, but this instability was not present in the others (S21 and S22 Figs). Unsupervised clustering of the samples based on CNAs further demonstrated the unique CNA patterns between patients (Figs 4B and S21) [45]. Microsamples dissected from different locations within a patient sample displayed subtle discrepancies in copy number profiles (Fig 4B). The CNA patterns of chromosome 1 in P1S1C and P1S3C were different from that of P1S4C, with an obvious gain of 1q in the first 2 ROIs (S22 Fig).
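
For readers unfamiliar with read-depth CNA analysis, the sketch below shows one simple way to turn binned read counts from a paired cancer and normal microsample into per-bin copy number estimates. It is an illustrative simplification, not the method used here: Fig 4 reports segmentation with the CBS algorithm, which this example does not implement. The median-ratio normalization, the gain and loss cutoffs (2.5 and 1.5), the function names, and the toy counts are all assumptions.

```python
import statistics

def copy_number_per_bin(tumor_counts, normal_counts, ploidy=2):
    """Estimate copy number per genomic bin from paired read counts.
    The tumor/normal ratio of each bin is scaled by the median ratio, which
    assumes that most bins are copy-neutral. Bins with no normal reads
    return None."""
    ratios = [t / n if n > 0 else None for t, n in zip(tumor_counts, normal_counts)]
    valid = [r for r in ratios if r is not None]
    baseline = statistics.median(valid)
    if baseline == 0:
        raise ValueError("median tumor/normal ratio is zero")
    return [None if r is None else ploidy * r / baseline for r in ratios]

def call_state(copy_number, gain_cutoff=2.5, loss_cutoff=1.5):
    """Label a bin as gain, loss, or neutral using assumed cutoffs."""
    if copy_number is None:
        return "no data"
    if copy_number >= gain_cutoff:
        return "gain"
    if copy_number <= loss_cutoff:
        return "loss"
    return "neutral"

# Invented toy counts for five bins; bin 2 is amplified and bin 3 is lost.
tumor = [100, 100, 200, 50, 100]
normal = [100, 100, 100, 100, 100]
estimates = copy_number_per_bin(tumor, normal)
states = [call_state(cn) for cn in estimates]
```

At low sequencing depth the bins must be wide enough to collect sufficient reads, which is why shallow coverage can still support the detection of large CNAs.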

Fig 4. Genomic and transcriptomic analysis of laser dissected tissue samples. (a) Raw read counts across the whole genome of 2 paired cancer-normal tissues dissected from the same slice. Chromosome numbers are labeled at the bottom. (b) Unsupervised clustering of normalized read counts of all the dissected cancer samples. (c, d) Top panels show the normalized read counts (gray dots) and copy numbers (red lines) identified by the CBS algorithm in chromosomes 2 and 11 of samples P1S3C and P1S1C, respectively. Mean expression levels within the same segment are shown as black lines in the bottom panels. (e) Averaged read counts of all the cancer samples (top) and their corresponding gene expression values (bottom) for chromosome 18. Genome and transcriptome variations of other chromosomes and samples are shown in S16, S18, and S19 Figs. (f) Averaged gene expression fold changes were computed per 1 M bin across the whole genome and plotted against copy numbers. (g) The distribution of CNAs and fusion genes across the genome of all the cancer samples. Orange and blue bars indicate copy number gains and losses, respectively. Orange lines in the inner circle indicate fusion genes with at least 10 span pair reads, and the green and purple lines represent oncogene-involved gene fusions. The positions of fusion events (magenta) and CNAs (light blue) of all the cancer samples are shown in the 2 outermost circles. Source data for panels b–g in S2 Data Sheet 2.5–2.20. CNA, copy number alteration. https://doi.org/10.1371/journal.pbio.3001699.g004

As both genomes and transcriptomes were sequenced from each cancer nest, we were able to perform joint analysis of genomic and transcriptomic variation. The mean expression level of genes within each genomic segment [46] was compared with the copy number in the same regions (Figs 4C–4E, S22, and S23) and summarized across the whole genome (Figs 4F and S24). The average gene expression levels showed a positive correlation with the copy number within the same segment (Fig 4F). Among all the OSCC samples, 8 regions of recurrent copy number gain and 5 regions of recurrent copy number loss were identified (q < 0.25, S25 Fig, S1 Data 1.9 to 1.10) [43]. Among these regions of recurrent gain or loss, 11q13.3, 8q24.3, 11p15.4, and 11q24.2 colocalized with differentially expressed genes in cancer samples. GSTP1 is located within the recurrent focal amplification of 11q13.3 [44,47], which implies that the high expression level of GSTP1 may be the result of increased copy number. FAM83H also colocalized with a focal amplification region, 8q24.3, and was specifically expressed at higher levels in patient P1. TP53AIP1 and PKP3 were both expressed at lower levels in all the patients and are located in the regions of recurrent copy number loss 11q24.2 and 11p15.4, respectively. The expression levels of GSTP1, FAM83H, TP53AIP1, and PKP3 have all been reported to be involved in the development of cancer or to affect patients’ survival rate [44,48]. We also analyzed the global correlation between gene fusion events and copy number variations and found a high degree of overlap between them (Fig 4G). Fusion genes CTSB and PPP1CA colocalized with focal amplification regions 8p23.1 and 11q13.3, respectively. CTSB has been shown to be related to cancer progression and metastasis [49,50], and PPP1CA was reported to contribute to ras/p53-induced senescence [51]. Of the 24 pairs of detected fused genes, 17 pairs (approximately 71%) had at least 1 gene that intersected with a focal CNA (S1 Data 1.11). The parallel observation of genomic rearrangement and gene expression fold change suggests that the instability of the cancer genome led to gene fusion events that were more likely to occur within amplification and deletion regions (Fig 4G) [52].
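
The overlap statistic above (17 of 24 fusion pairs, approximately 71%) amounts to an interval-intersection count. The sketch below shows one way such a count could be computed, assuming each gene and each focal CNA is represented as a (chromosome, start, end) tuple with half-open coordinates. The coordinates in the example are invented and the helper names are hypothetical; only the final arithmetic check uses numbers from the text.

```python
def overlaps(gene, region):
    """True if two (chromosome, start, end) half-open intervals intersect."""
    return gene[0] == region[0] and gene[1] < region[2] and region[1] < gene[2]

def fusion_pairs_in_cna(fusion_pairs, cna_regions):
    """Count fusion pairs with at least 1 partner gene intersecting a CNA.
    Returns (number of such pairs, fraction of all pairs)."""
    hits = sum(
        1 for pair in fusion_pairs
        if any(overlaps(gene, region) for gene in pair for region in cna_regions)
    )
    return hits, hits / len(fusion_pairs)

# Invented coordinates: one pair touches the CNA on chr8, the other does not.
toy_pairs = [
    (("chr8", 100, 200), ("chr1", 0, 50)),
    (("chr2", 0, 10), ("chr3", 0, 10)),
]
toy_cnas = [("chr8", 150, 400)]
toy_hits, toy_fraction = fusion_pairs_in_cna(toy_pairs, toy_cnas)

# Arithmetic check of the proportion reported in the text.
reported_percent = round(100 * 17 / 24)
```

The same counting applied to the 24 detected fusion pairs and the focal CNA regions would give the proportion reported above.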

[END]
---
[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001699

Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
