(C) PLOS One [1]. This unaltered content originally appeared in journals.plosone.org.
Licensed under Creative Commons Attribution (CC BY) license.
url:https://journals.plos.org/plosone/s/licenses-and-copyright

------------

Blind demixing methods for recovering dense neuronal morphology from barcode imaging data

['Shuonan Chen', 'Mortimer B. Zuckerman Mind Brain Behavior Institute', 'Columbia University', 'New York', 'United States Of America', 'Department Of Statistics', 'Center For Theoretical Neuroscience', 'Grossman Center For The Statistics Of Mind', 'Department Of Neuroscience', 'Department Of Systems Biology']

Date: 2022-05

Abstract Cellular barcoding methods offer the exciting possibility of ‘infinite-pseudocolor’ anatomical reconstruction—i.e., assigning each neuron its own random unique barcoded ‘pseudocolor,’ and then using these pseudocolors to trace the microanatomy of each neuron. Here we use simulations, based on densely-reconstructed electron microscopy microanatomy, with signal structure matched to real barcoding data, to quantify the feasibility of this procedure. We develop a new blind demixing approach to recover the barcodes that label each neuron, and validate this method on real data with known barcodes. We also develop a neural network which uses the recovered barcodes to reconstruct the neuronal morphology from the observed fluorescence imaging data, ‘connecting the dots’ between discontiguous barcode amplicon signals. We find that accurate recovery should be feasible, provided that the barcode signal density is sufficiently high. This study suggests the possibility of mapping the morphology and projection pattern of many individual neurons simultaneously, at high resolution and at large scale, via conventional light microscopy.

Author summary In situ barcode sequencing allows us to simultaneously locate many neurons in intact brain tissues, albeit at modest spatial resolution. By increasing the barcode density, high-resolution neuronal morphology reconstruction from such data might be possible. Here we use simulations to study this possibility, while addressing the computational challenges in analyzing such data. We developed a novel blind demixing method that uses fluorescent images and identifies the unknown barcodes used to label the neurons with high accuracy. Further, we developed a neural network which can reconstruct the morphology for these labeled neurons from the observed ‘pointillistic’ imaging data. We show that under both high- and low-resolution optical settings, our methods can successfully extract the morphologies for many labeled neurons. The results from this theoretical study suggest that it may be feasible to map the morphology and projection pattern of many individual neurons simultaneously, at high resolution and at large scale, via conventional light microscopy.

Citation: Chen S, Loper J, Zhou P, Paninski L (2022) Blind demixing methods for recovering dense neuronal morphology from barcode imaging data. PLoS Comput Biol 18(4): e1009991. https://doi.org/10.1371/journal.pcbi.1009991 Editor: Hermann Cuntz, Ernst-Strungmann-Institut, GERMANY Received: August 18, 2021; Accepted: March 7, 2022; Published: April 8, 2022 Copyright: © 2022 Chen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Data Availability: The data underlying the results presented in the study is available from https://github.com/jacksonloper/bardensr. Funding: This work was supported by the National Institutes of Health (1U19NS107613 to L.P.), IARPA MICrONS (D16PC0003 to L.P.), and the Chan Zuckerberg Initiative (2018-183188 to L.P.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing interests: The authors have declared that no competing interests exist.

1 Introduction Neuroscientists have long dreamed of obtaining simultaneous maps of the morphology of every neuron in a mammalian brain [1, 2]. The ability to perform this very high-throughput neuron-tracing would enable better understanding of brain development, neural circuit structure, and the diversity of morphologically-defined cell types [3, 4]. In order to map the morphology at the scale of a mammalian brain, an ideal experiment would trace a large number of neurons simultaneously over a large brain region with high imaging resolution. Current experimental methods for neuronal tracing can be placed along a spectrum spanning two disparate spatial scales. On the one hand, techniques based on Electron Microscopy (EM) obtain nanometer-level resolution within small regions. On the other hand, light microscopy and molecular techniques can map morphology over whole-brain scales, albeit with relatively low spatial resolution and/or limitations on the number of neurons that can be mapped simultaneously. Fig 1 sketches the current landscape, and highlights a gap in the state of the art: mapping many neurons simultaneously, with high resolution, over large fields of view (or potentially even the whole brain).

Fig 1. Some existing methods for neuronal morphology mapping. Here we non-exhaustively summarize the landscape of existing neuronal tracing methods. The horizontal axis indicates the typical scale of these neuronal tracing methods, ranging from local tracing to mapping axonal projections across the whole brain; the vertical axis indicates optical resolution in the typical images obtained from these methods. The ellipse color represents the number of neurons that can be traced and distinguished; red indicates that the method cannot readily distinguish between spatially nearby neurons, yellow indicates that the method can be used to trace several individual neurons, and green indicates the possibility of simultaneously mapping thousands or millions of neurons. Here ‘standard tracing methods’ refer to approaches such as Golgi staining and viral tracers; cf. Section 2 for more details. Note that many of these methods rely on conventional optical microscopes, and their resolution can potentially be increased via expansion microscopy and/or super resolution microscopy, though this may incur additional experimental costs. In this paper we focus on the possibility of extending molecular barcoding methods by increasing the signal density to allow for more fine-grained morphological reconstructions; we call such experiments Spatial Transcriptomics-based Infinite-color Brainbow Experiments (STIBEs). https://doi.org/10.1371/journal.pcbi.1009991.g001

Molecular barcoding methods may be able to fill the gap [5, 6] with ‘infinite-color Brainbow’ experiments: each cell is labeled with a unique random barcode which can be visualized through conventional light microscopy.
If we think of each barcode as providing a unique ‘pseudocolor’ for each cell, then conceptually this method ‘colors’ the brain with a potentially unbounded number of random colors, generalizing the original Brainbow approach [7], which can trace long-range neuronal morphology but was limited to a relatively small number of random colors (though see e.g. [8] for more recent developments). Here we use simulations to investigate a class of methods we dub Spatial Transcriptomics-based Infinite-color Brainbow Experiments (STIBEs). Can such experiments obtain precise morphological reconstruction? Can we potentially obtain long-range tracing for many neurons simultaneously, if the experiment is done at a sufficiently large scale? This class of experiments uses techniques from the spatial transcriptomics community to label cells with unique fluorescent barcodes, in situ; the BAR-seq method is a representative example [5]. STIBEs work by creating a collection of transcripts containing barcode sequences; these transcripts are amplified into target molecules known as ‘amplicons,’ and the barcode associated with each amplicon can be identified over a series of rounds of imaging. In an ideal STIBE, all amplicons within a given cell carry the same barcode, no two cells carry amplicons with the same barcode, and amplicons are only present within cells. By identifying collections of amplicons which carry the same barcode, one can thus trace many neurons across the entire brain. Previous experiments have already demonstrated that STIBEs can be used to recover the locations of somas and projection sites of many cells (at lower spatial resolution) [5, 9]. In contrast, here we focus on STIBEs that could achieve micron-resolution morphological reconstruction of many cells simultaneously, potentially across the whole brain. This would be a valuable advance for several reasons. 
First, cell morphology measured at this scale provides a great deal of information about cell type in different layers and regions of the brain [10, 11]. Second, although micron-level resolution does not permit exact connectome reconstruction, it does yield a matrix of potential connections between neurons (i.e., which cells might be connected to each other). Such matrices would be invaluable for downstream applications such as functional connectivity mapping (by providing constraints on possible connections) [12] or cell type inference [13–15]. To accurately simulate the imagestacks that would be generated by STIBEs, we use two sources of existing data. First, we use densely-reconstructed EM data to give us the shapes and spatial relationships between axons and dendrites in a small cortical volume. Second, we replicate the noise and signal structure we might expect, based on results from current cellular barcoding experiments [5, 6]. We investigate how varying parameters of STIBEs would affect the accuracy of neuronal tracing; for example, we consider various imaging resolutions and various densities of the amplicons present within cells. We find that selecting an appropriate density of amplicons is a balancing act: we cannot recover precise morphological reconstructions without sufficient density, but very high density can lead to incorrect amplicon identification along thin processes due to limited optical resolution. Our main technical contribution here is to develop improved demixing algorithms that enable improved barcode recovery in the high-density regime, leading in turn to improved morphological recovery. These algorithms are based on the BarDensr model of spatial transcriptomics data [16] (see also [17], as well as [18, 19]), which was originally developed to detect barcode signals from spatial transcriptomics images when the barcode library is known. 
We have extended this method to a more challenging ‘blind demixing’ setting, where we iteratively learn the barcodes directly from the available image data, so that it is applicable to STIBEs where the barcode library is unknown or intractably large. We further developed a Convolutional Neural Network (CNN) model to reconstruct morphology for each demixed barcode, ‘connecting the dots’ between the discontiguous fluorescent signals produced by amplicons. In what follows, we contextualize STIBEs among the family of other neuronal tracing methods, describe our methods for simulation and image analysis, and finally report our findings on the feasibility of STIBEs for the morphological reconstruction of neurons.

2 Related work The history of neuronal tracing techniques is long and complex [20]; here we give a brief overview of some techniques most directly relevant to our approach. Fig 1 presents a visual overview. At one extreme of the precision-scale spectrum, Electron Microscopy (EM) is the state-of-the-art tool for visualizing a small region of brain with very high precision [14, 15, 21–25]. However, it is not yet suitable for dense mapping of long-range neuronal projections: the imaging (and computational segmentation) process for these experiments is challenging, and it is not yet feasible to use EM to densely trace many neurons across the mammalian brain. It may be possible to refine EM techniques to overcome these challenges, but this is not our focus in this paper (cf. [2] for further discussion). At the opposite extreme of the precision-scale spectrum, we have standard tracing approaches utilizing light microscopy. Despite significant effort [26], it remains infeasible to segment and trace thousands of densely-packed individual neurons visualized with a single tracing marker. The ‘MouseLight’ project [27] is a modern exemplar of this ‘classical’ approach. This is a recently developed platform for tracing the axonal arbor structure of individual neurons, built upon two-photon microscopy and viral labeling of a sparse set of neurons, allowing long-range neuronal tracing. Unfortunately, the number of neurons that can be traced at once is currently limited to a hundred per brain sample [28] (though more advanced microscopy technology, such as super-resolution microscopy [29–33], and/or Expansion Microscopy [34–37], could potentially be deployed to simultaneously trace more individual neurons with higher resolution). ‘Brainbow’ methods [7, 38] introduce random combinations of fluorescent markers to facilitate tracing of multiple cells. 
The number of neurons that can be uniquely labeled and identified by the original method was limited to several hundreds, because of limitations on the number of distinct colors that can be generated and distinguished. Recent studies have focused on reducing these bottlenecks [8, 39, 40], and developing improved computational methods [41, 42]. More recently, molecular barcoding methods have been developed to effectively remove this bottleneck on the number of distinct unique ‘color’ labels that can be assigned and read from different cells. Multiplexed Analysis of Projections by Sequencing (MAPseq) is one early example of this approach, developed to study long-range axonal projection patterns [9]; however, the spatial precision of this approach was not sufficient to capture neuronal morphology. More recently, “barcoded anatomy resolved by sequencing” (BARseq) [5, 6] combines MAPseq and in situ sequencing technology to obtain higher spatial resolution. Instead of micro-dissecting the brain tissue, this method uses fluorescent microscopy to detect the barcode signal from the intact tissue, similar to recent spatial transcriptomics methods [43, 44]. In theory, BARseq should be capable of satisfying all the desirable features of neuronal tracing: uniquely labelling large numbers of individual neurons, representing them with high spatial precision, and tracing them across long distances through the brain. Spatial transcriptomics technology is advancing quickly; for example, [45] recently achieved single-cell-resolution transcriptomic assays of an entire embryo. As this technology advances, techniques like BARseq can only become more effective. However, to date, this technology has only been used with a relatively small number of amplicons per cell, prohibiting accurate morphological reconstruction (cf. Brainbow images, where signal within cells tends to be much more uniform and less ‘pointillistic,’ leading to a different class of segmentation problems). 
In theory, there is nothing preventing new experiments with higher density, to achieve higher-resolution morphology mapping; this paper will explore this idea systematically. Finally, we note the related paper [46]; this previous simulation work focused on a different experimental context, specifically expansion microscopy and cell membrane staining with fluorescent markers. In the current work we are interested in exploring the frontier of morphology mapping using only molecular barcoding images and conventional light microscopy. Introducing expansion microscopy and additional fluorescent labels could potentially improve the resolution of the methods studied here, at the expense of additional experimental complexity.

3 Methods summary We here summarize the three main procedures used in this work; full details can be found in S1A Appendix. 3.1 Data simulation In order to investigate feasible experimental conditions for densely reconstructing neuronal morphology, we simulated a variety of STIBEs. The simulation process is illustrated in Fig 2. We started with densely segmented EM data [14, 15, 47]. Next we generated a random barcode library, assigning random barcodes to each cell in the field of view (FOV). We simulated amplicon locations according to a homogeneous Poisson process within the voxelized support of each cell. The final imagestack was generated by assigning a colored spot to each barcode location in each imaging frame, and then pushing the resulting high-resolution 3D ‘clean’ data through an imaging model that includes blurring with a point spread function, sampling with a lower-resolution voxelization, and adding imaging noise, to obtain the simulated observed imagestack.
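The simulation pipeline described above can be illustrated with a minimal NumPy sketch. This is not the authors' simulator (see S1 Appendix and the bardensr repository for that). It makes several assumptions for illustration: the point spread function is a separable Gaussian, the lower-resolution voxelization is a block average, the noise is additive Gaussian, and each barcode lights up exactly one channel per round. All function names and parameter values (`simulate_imagestack`, `density`, `sigma`, `factor`, `noise_sd`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel truncated at about 3 sigma, normalized to sum to 1."""
    radius = int(3 * sigma) + 1
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur3d(vol, sigma):
    """Separable Gaussian blur, an assumed stand-in for the point spread function.
    Assumes every axis of vol is at least as long as the kernel."""
    k = gaussian_kernel(sigma)
    for axis in range(3):
        vol = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), axis, vol)
    return vol

def downsample(vol, factor):
    """Average over factor^3 blocks; assumes each dimension is divisible by factor."""
    X, Y, Z = vol.shape
    blocks = vol.reshape(X // factor, factor, Y // factor, factor, Z // factor, factor)
    return blocks.mean(axis=(1, 3, 5))

def simulate_imagestack(labels, barcodes, n_channels, density, sigma, factor, noise_sd, seed=0):
    """labels: (X, Y, Z) int array, 0 = background, k >= 1 = cell k.
    barcodes: (n_cells, n_rounds) int array; barcodes[k-1, r] is the channel of cell k in round r.
    Returns the noisy stack (n_rounds, n_channels, X/f, Y/f, Z/f) and per-voxel amplicon counts."""
    rng = np.random.default_rng(seed)
    n_rounds = barcodes.shape[1]
    # Homogeneous Poisson process on the voxelized support: i.i.d. Poisson
    # counts with a common rate in every voxel that belongs to some cell.
    counts = rng.poisson(density, size=labels.shape) * (labels > 0)
    X, Y, Z = labels.shape
    stack = np.zeros((n_rounds, n_channels, X // factor, Y // factor, Z // factor))
    for r in range(n_rounds):
        # Channel of the cell occupying each voxel; background voxels index
        # cell -1, which is harmless because their counts are zero.
        chan = barcodes[labels - 1, r]
        for c in range(n_channels):
            clean = np.where(chan == c, counts, 0).astype(float)
            stack[r, c] = downsample(blur3d(clean, sigma), factor)
    stack += rng.normal(0.0, noise_sd, size=stack.shape)  # imaging noise
    return stack, counts
```

The amplicon counts are drawn once and reused in every round, so the same physical amplicons appear in each imaging frame and only their channel changes according to the barcode.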

Fig 2. Simulation process illustration. Top left: Three neuronal segments in a small region (40 × 40 × 40 voxels with 100 × 100 × 100 nm voxel size). The original EM data is much more densely packed than what is shown here; for illustration purposes, we only show three neurons among many. Top middle: A 2D slice with each neuronal segment uniquely colored (left) and the simulated imagestack at the same plane for the first sequencing round (right). The imagestack colors do not correspond to neural segment colors, but rather to the corresponding fluorescent barcodes (top right). A video corresponding to this plot, showing multiple z-planes, can be found at this link. Bottom: Amplicons are uniformly simulated within neuronal segment volumes; simulated amplicons are shown as red dots. https://doi.org/10.1371/journal.pcbi.1009991.g002

3.2 Barcode estimation To use a STIBE for morphological reconstruction, one must estimate the barcodes present in a FOV (each barcode corresponding to one cell) and the locations of each amplicon corresponding to each barcode (these amplicon locations pointillistically trace the morphology of each cell). The barcode library used to infect the neurons is unknown in such data. In theory, one could sequence the viral library, but this may yield a set of barcodes which is too large to be useful [48]. For the purpose of morphology reconstruction, where only a small number of cellular barcodes exist in a region of interest, we must develop an approach to ‘learn’ the local barcode library from the images themselves. We found that there are effectively two distinct regimes for barcode and amplicon recovery. In the ‘sparse’ regime, where imaging resolution is high and the amplicon density is low, the signal in most imaging voxels is dominated by at most one amplicon. 
Thus we can estimate the barcode library simply by searching for bright, ‘clean’ imaging voxels displaying a single amplicon signal, and then estimate the amplicon locations using the resulting barcode library. In the ‘dense’ regime (with high amplicon density and/or low imaging resolution) it is harder to find voxels dominated by a single amplicon, and the simple approach described above breaks down. Instead, we have found that an iterative constrained non-negative matrix factorization approach is more effective: given an initial (incomplete) estimate of the barcode library, we estimate the corresponding amplicon locations, then subtract away the estimated signal corresponding to these amplicons. This sparsens the remaining image, making it easier to detect more barcodes that might have been obscured in previous iterations. After augmenting the barcode library with these previously undetected barcodes, we can re-estimate the amplicon locations and iterate between these two steps (updating the barcode library and estimating amplicon locations in alternating fashion) until a convergence criterion is satisfied. This approach takes full advantage of our knowledge of the special structure of this blind-demixing problem; classic blind-demixing approaches (e.g. regular non-negative matrix factorization) that do not exploit this problem structure do not perform well here. 3.3 Amplicon estimation and morphological reconstruction Given an estimated barcode library, our primary goal is to reconstruct the morphology of each neuron labeled by a barcode in this library. Our starting point in this endeavor is the ‘evidence tensor’ (see S1 Appendix for a precise definition), which summarizes our confidence about the presence of each barcode at each voxel location. This evidence tensor is used as the input to two algorithms: alphashape [49] and Convolutional Neural Networks (CNNs), which are trained to estimate the shape of each neuron from the evidence tensor.
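The alternation described in Section 3.2 (detect barcodes at ‘clean’ voxels, fit amplicon amplitudes, subtract, and search the residual) can be sketched in a few lines of NumPy. This is a toy illustration under strong assumptions, not the BarDensr-based method used in the paper. Each barcode is assumed to activate exactly one channel per round. The amplitude step uses plain multiplicative non-negative updates rather than the authors' constrained formulation. The ‘clean voxel’ test, with its hypothetical `min_brightness` and `purity` thresholds, is an invented heuristic.

```python
import numpy as np

def one_hot(code, n_channels):
    """Flattened (rounds x channels) signature with one active channel per round."""
    v = np.zeros(len(code) * n_channels)
    v[np.arange(len(code)) * n_channels + np.asarray(code)] = 1.0
    return v

def fit_amplitudes(X, B, n_iter=200, eps=1e-9):
    """Non-negative amplitudes A (voxels x barcodes) approximately minimizing
    ||X - A B||^2, via multiplicative updates with the library B held fixed."""
    X = np.clip(X, 0.0, None)
    A = np.ones((X.shape[0], B.shape[0]))
    XBt, BBt = X @ B.T, B @ B.T
    for _ in range(n_iter):
        A *= XBt / (A @ BBt + eps)
    return A

def find_clean_codes(residual, n_rounds, n_channels, min_brightness, purity):
    """Read barcodes off voxels whose signal is bright in every round and
    concentrated in a single channel per round (assumed heuristic)."""
    r = residual.reshape(-1, n_rounds, n_channels)
    peak = r.max(axis=2)
    frac = peak.sum(axis=1) / (r.sum(axis=(1, 2)) + 1e-9)
    clean = (peak.min(axis=1) > min_brightness) & (frac > purity)
    return {tuple(int(c) for c in code) for code in r.argmax(axis=2)[clean]}

def blind_demix(X, n_rounds, n_channels, min_brightness=0.5, purity=0.8, max_iter=10):
    """X: (voxels, n_rounds * n_channels) non-negative intensities.
    Returns the learned barcode library and the fitted amplitudes."""
    library = []
    A = np.zeros((X.shape[0], 0))
    residual = np.clip(X, 0.0, None)
    for _ in range(max_iter):
        new = find_clean_codes(residual, n_rounds, n_channels, min_brightness, purity) - set(library)
        if not new:  # convergence criterion: no new barcodes in the residual
            break
        library.extend(sorted(new))
        B = np.stack([one_hot(c, n_channels) for c in library])
        A = fit_amplitudes(X, B)                   # re-estimate all amplitudes
        residual = np.clip(X - A @ B, 0.0, None)   # subtract explained signal
    return library, A
```

In a toy input where one barcode only ever appears in voxels shared with a brighter barcode, the first pass finds only the brighter barcode; the second barcode becomes detectable once the explained signal is subtracted, which is the ‘sparsening’ effect described above.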

5 Conclusion In this work we developed detailed simulations to test the feasibility of using Spatial Transcriptomics-based Infinite-color Brainbow Experiments (STIBEs) for morphological reconstruction of many neurons simultaneously. We developed a novel blind-demixing algorithm, followed by a neural network based reconstruction approach, which together achieve high barcode detection rates and accurate morphological reconstructions, even in relatively low-resolution settings. There may be room to improve further on the methods proposed here, on at least two fronts. In the first stage, we use an iterative approach to detect barcodes: we model the presence of barcodes that are already in our library, and search the residual images for new barcodes to add to our library. We found that the proposed linear programming approach (building on previous work described in [16]) was effective for decomposing the observed signal into the sum of these two parts (i.e., the signal that can be explained by barcodes we already know and the left over residual), but in future work it may be possible to improve further, using e.g. neural networks trained to decompose images using a simulator similar to the one used here [50]. Second, given an estimated barcode library, we need methods for recovering the morphology corresponding to each barcode from the imagestack. One promising direction would be to use more sophisticated approaches for 3D neuronal recovery, adapting architectures and loss functions that have proven useful in the electron microscopy image processing literature [51, 52]. We hope to explore these directions in future work. Finally, we should emphasize that this study focused on a small region from a mouse brain, densely reconstructed with electron microscopy. Extending these methods to a whole-brain scale seems feasible but will require significant effort, involving stitching and registering data across many spatial subvolumes analyzed in parallel. 
As experimental methods continue to march forward and expand in scale [36] we hope to tackle these computational scaling issues in parallel.

Supporting information

S1 Appendix. Detailed methods. https://doi.org/10.1371/journal.pcbi.1009991.s001 (PDF)

S1 Fig. Finding the threshold for the evidence tensor, for high (top) and low (bottom) resolution data. In order to create input data for the morphology prediction in Figs 8–11, we needed to binarize the evidence tensors. Here we show the test cases (one row shows one example neuron) where we used various thresholds to binarize the evidence tensors (the third and following columns) and compared the result with both the original ground-truth voxels (first column) and the original (continuous) evidence tensor (second column). For the high resolution simulation, we used 0.7 as the threshold since the result was robust. This threshold was used to visualize Figs 8 and 9. For the low resolution simulation, we used BarDensr to estimate the evidence tensor and the results were much sparser than in the high resolution case, and therefore the binarization was relatively robust to the choice of the threshold. Here we used 0.1 for the low resolution case. This threshold was used to visualize Figs 10 and 11. https://doi.org/10.1371/journal.pcbi.1009991.s002 (TIFF)

S2 Fig. Missed barcodes in the real data (Fig 6). Here we show two examples of amplicons from barcodes that were not found in our amplicon detection process in Fig 6. For both panels, the red frames indicate where the correct barcode signals are expected. The left panel shows the missed barcode which had the most abundant amplicons. We see that the signal intensity varies significantly (e.g., round 2 has much higher signal compared to rounds 4 and 5). The right panel shows the missed barcode with the third most abundant amplicons. In rounds 2, 4 and 7 we see relatively strong phasing and/or color-mixing in channel 1, which might have made the barcode discovery difficult. https://doi.org/10.1371/journal.pcbi.1009991.s003 (TIFF)
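The threshold comparison described for S1 Fig can be mimicked numerically with a short sketch. The figure itself compares binarized evidence tensors with the ground truth visually. The intersection-over-union score used here is an assumption added for illustration, as are the function names and the premise that the evidence tensor is scaled to the range [0, 1].

```python
import numpy as np

def binarize(evidence, threshold):
    """Voxels whose evidence exceeds the threshold are marked as part of the neuron."""
    return evidence > threshold

def iou(pred, truth):
    """Intersection over union of two boolean masks (1.0 if both are empty)."""
    union = np.logical_or(pred, truth).sum()
    return np.logical_and(pred, truth).sum() / union if union else 1.0

def sweep_thresholds(evidence, truth, thresholds):
    """Score each candidate threshold against the ground-truth voxel mask."""
    return {t: iou(binarize(evidence, t), truth) for t in thresholds}

# Tiny hypothetical example: one barcode's evidence on a 2 x 2 grid.
evidence = np.array([[0.9, 0.8], [0.2, 0.05]])
truth = np.array([[True, True], [False, False]])
scores = sweep_thresholds(evidence, truth, thresholds=[0.1, 0.7])
```

In this toy case the lower threshold admits one false-positive voxel, so the higher threshold scores better; on real evidence tensors the preferred value would depend on how sparse the tensor is, as the caption notes.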

Acknowledgments We thank Xiaoyin Chen for sharing the data and helping with the analysis. We thank Li Yuan, Tony Zador, and Abbas Rizvi for many helpful discussions.

[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009991

(C) Plos One. "Accelerating the publication of peer-reviewed science."
Licensed under Creative Commons Attribution (CC BY 4.0)
URL: https://creativecommons.org/licenses/by/4.0/
