(C) PLOS Computational Biology [1]. This unaltered content originally appeared in journals.plos.org.
Licensed under a Creative Commons Attribution (CC BY) license.
URL: https://journals.plos.org/plosone/s/licenses-and-copyright

------------



Deep learning tools and modeling to estimate the temporal expression of cell cycle proteins from 2D still images

Authors: Thierry Pécot (Rennes University, SFR Biosit, UMS - US, Rennes) and Maria C. Cuitiño (Department of Radiation Oncology, Arthur G. James Hospital, Ohio State Comprehensive Cancer Center, Columbus, Ohio)

Date: 2022-04

Automatic characterization of fluorescent labeling in intact mammalian tissues remains a challenge due to the lack of quantitative techniques capable of segregating densely packed nuclei and intricate tissue patterns. Here, we describe a powerful deep learning-based approach that couples remarkably precise nuclear segmentation with quantitation of fluorescent labeling intensity within segmented nuclei, and then apply it to the analysis of cell cycle-dependent protein concentration in mouse tissues using 2D fluorescent still images. First, several existing deep learning-based methods were evaluated to accurately segment nuclei using different imaging modalities with a small training dataset. Next, we developed a deep learning-based approach to identify and measure fluorescent labels within segmented nuclei, and created an ImageJ plugin to allow for efficient manual correction of nuclear segmentation and label identification. Lastly, using fluorescence intensity as a readout for protein concentration, a three-step global estimation method was applied to the characterization of the cell cycle-dependent expression of E2F proteins in the developing mouse intestine.

Estimating the evolution of protein concentration over the cell cycle is an important step towards a better understanding of this key biological process. Unfortunately, experimental designs to monitor proteins in individual living cells are expensive and difficult to set up. We propose instead to consider 2D images from tissue biopsies as snapshots of cell populations and to reconstruct from them the actual evolution of protein concentration over the cell cycle. This requires accurately localizing cell nuclei and identifying nuclear fluorescent proteins. We take advantage of deep learning, a machine learning approach that has revolutionized computer vision, to achieve these challenging tasks. Additionally, we have created an ImageJ plugin to quickly and efficiently annotate images or correct annotations, which is required to build the training datasets that feed the deep convolutional neural networks.

Data Availability: All the images used in this study are available at https://data.mendeley.com/datasets/5r6kf37zd4/1. The training datasets for nuclei segmentation are available at https://github.com/tpecot/NucleiSegmentationAndMarkerIDentification/tree/master/UNet/datasets/nucleiSegmentation_E2Fs for the U-Net architecture and at https://github.com/tpecot/NucleiSegmentationAndMarkerIDentification/tree/master/MaskRCNN/datasets/nucleiSegmentation_E2Fs for the Mask R-CNN architecture. The training datasets for nuclei segmentation and marker identification are available at https://github.com/tpecot/NucleiSegmentationAndMarkerIDentification/tree/master/InceptionV3/trainingData for the Inception-V3 architecture. The images used for the evaluation and the ground truth are available at the same locations as the training datasets. The 2D intensity histograms used to estimate the E2F accumulation over the cell cycle are available at https://github.com/tpecot/EstimationOfProteinConcentrationOverTime/tree/master/data.

Software Availability: The code used to train and run the deep learning approaches for nuclei segmentation and marker identification is available at https://github.com/tpecot/NucleiSegmentationAndMarkerIDentification (archived code at time of publication: https://doi.org/10.5281/zenodo.4619243 [50]; license: GPL3). The Octave code used to estimate the E2F accumulation over the cell cycle is available at https://github.com/tpecot/EstimationOfProteinConcentrationOverTime (archived code at time of publication: https://doi.org/10.5281/zenodo.4639800 [51]; license: GPL3). The Java code of the Annotater plugin, the plugin itself, and video tutorials showing how to use the Annotater are available at https://github.com/tpecot/Annotater (archived code at time of publication: https://doi.org/10.5281/zenodo.4639802 [52]; license: GPL3).

In this manuscript, we propose and evaluate alternative methods to quantify nuclear protein levels, resulting in an improved automated pipeline with a greatly reduced requirement for interactive manual corrections. Using a small training dataset composed of 2D still images (with and without various forms of data augmentation), we first evaluate five different deep learning strategies to segment nuclei in microscopic images of embryonic mouse intestinal epithelium. We also design post-processing methods to improve nuclear segmentation. We then propose another deep learning-based approach for identifying nuclear markers in the epithelial cells and demonstrate the superiority of this method over the usual threshold-based method [ 20 , 21 ]. Additionally, we create an ImageJ plugin [ 22 , 23 ] named Annotater to specifically and efficiently correct nuclear segmentation and marker identification, ensuring that nuclear features are accurately quantified. Next, these image features extracted from 2D still images are used to perform a temporal analysis of E2F protein concentration over the cell cycle. Based on three mathematical assumptions grounded in cell biology, we initialize the temporal evolution of E2F concentrations using a graph optimization method. Cell cycle markers are then used to temporally register the E2F protein concentrations with respect to cell cycle phase. The global estimation of the protein concentration of E2F3A, E2F4 and E2F8 through the cell cycle is defined as an assignment problem and solved with the Hungarian algorithm [ 24 , 25 ]. This approach is extensively evaluated with simulated data. Finally, we directly estimate the temporal evolution of E2F concentrations without using marker identification. In addition, we evaluate the impact of using different numbers of images in the training datasets for nuclei segmentation on the estimation of E2F concentrations over the cell cycle.
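For readers less familiar with the assignment formulation, the sketch below shows how a generic assignment problem can be solved with the Hungarian algorithm in Python via scipy.optimize.linear_sum_assignment. The cost matrix here is a random placeholder, not the cost actually derived from the 2D intensity histograms and cell cycle assumptions described in the Materials and methods; it is a minimal illustration of the optimization step only.

```python
# Minimal sketch of solving an assignment problem with the Hungarian algorithm,
# as used conceptually in the global estimation step. The cost matrix below is
# a hypothetical example; the actual costs in the paper are derived from the
# 2D intensity histograms and the cell cycle model described in the Methods.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)

# Hypothetical cost matrix: rows = candidate cell cycle time bins,
# columns = observed concentration levels; entry (i, j) is the cost of
# assigning level j to time bin i (e.g., a deviation from a continuity prior).
cost = rng.random((50, 50))

# The Hungarian algorithm returns the row/column indices of the
# minimum-total-cost one-to-one assignment.
row_ind, col_ind = linear_sum_assignment(cost)
print("total assignment cost:", cost[row_ind, col_ind].sum())

# col_ind[i] gives the concentration level assigned to time bin i,
# i.e., one candidate temporal ordering of the measured levels.
```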

E2Fs are major regulators of the cell cycle. The members of this family of transcription factors are categorized into three subclasses in mammals: canonical activators (E2F1–3A/B), canonical repressors (E2F4–6), and atypical repressors (E2F7–8) [ 10 – 13 ]. Adding to the body of literature on E2F-dependent transcriptional activity in vivo, our lab previously provided quantitative evidence on the temporal expression of representative activator (E2F3A), canonical repressor (E2F4) and atypical repressor (E2F8) family members during embryonic development [ 14 ], all of which have been shown to be of major importance [ 15 – 18 ]. To establish the temporal expression profiles of the three sentinel E2Fs, we used an E2F3A specific antibody and generated MYC-tagged E2F4/E2F8 knock-in mice. In addition to fluorescence labeling of E2F3A, E2F4 and E2F8, 5-ethynyl-2’-deoxyuridine (EdU) and Histone H3 S10 phosphorylation (pH3) were used to identify S, G2 and M phases. Images of eight different combinations of markers were acquired from sections of the developing mouse intestine using confocal and widefield microscopy (see S1 Fig ). The data analysis pipeline consisted of i) nuclear segmentation with a deep learning approach [ 19 ], ii) nuclear marker identification by thresholding, and iii) estimation of E2F concentrations over the cell cycle from 2D intensity histograms.
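As an illustration of the last step of that pipeline, the sketch below shows one plausible way to assemble a per-nucleus 2D intensity histogram with NumPy. The per-nucleus intensity arrays, their distributions and the binning are hypothetical placeholders for illustration, not the quantities or parameters used in [14].

```python
# Minimal sketch of building a per-nucleus 2D intensity histogram, the input
# to the cell cycle estimation step. The arrays `dapi_intensity` and
# `e2f_intensity` are hypothetical per-nucleus mean intensities; the paper's
# actual binning and normalization are described in its Methods section.
import numpy as np

rng = np.random.default_rng(1)
n_nuclei = 5000
dapi_intensity = rng.normal(loc=1.0, scale=0.2, size=n_nuclei)  # proxy for DNA content
e2f_intensity = rng.gamma(shape=2.0, scale=0.5, size=n_nuclei)  # proxy for E2F level

hist2d, dapi_edges, e2f_edges = np.histogram2d(
    dapi_intensity, e2f_intensity, bins=(64, 64)
)
# hist2d[i, j] counts nuclei whose DAPI intensity falls in bin i and whose
# E2F intensity falls in bin j; such population snapshots are what the
# temporal estimation method orders over the cell cycle.
print(hist2d.shape, hist2d.sum())
```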

Automatic image analysis is at the core of human and animal tissue-based research. However, quantitation of morphological features or fluorescent labeling in intact mammalian tissues remains a challenge. The densely packed nuclear aggregates that characterize many of these tissues, the extensive variability across different tissue types, and the continuously increasing number of imaging modalities are some of the many variables that make biological quantification in tissue an extremely difficult task. Over the last decade [ 1 – 4 ], deep learning has brought artificial intelligence to the forefront of image-based decision making. In particular, deep convolutional neural networks have demonstrated their superiority for image segmentation [ 1 , 2 ]. These approaches have also outperformed the traditional approaches used in microscopy, such as watershed for nucleus or cell segmentation [ 5 – 9 ]. However, this machine learning-based approach requires large amounts of annotated data, and new strategies have to be developed to process highly complex biological objects, acquired with different modalities, from small training datasets. In this paper, we propose a series of deep learning-based approaches to precisely segment nuclei and to identify fluorescently labelled cells in order to analyze the evolution of cell cycle-dependent E2F protein concentration in mouse tissues.

Results

Deep learning improves identification of fluorescent nuclear markers

After completion of nuclei segmentation with the DAPI channel, the fluorescence in the other channels can be used to extract information of interest about the cells. E2F positive/negative status can be evaluated this way, as can EdU and pH3 patterns. EdU and pH3 are cell cycle markers whose patterns evolve over the cell cycle [14]. EdU is diffuse during the first half of S phase and becomes punctate during the second half of S phase. pH3 is punctate during the second half of S phase and in G2, and becomes diffuse during mitosis. Typically, a thresholding procedure is applied to identify nuclear markers [20, 21], but this approach is not always accurate, especially when different patterns of fluorescence over a wide range of intensities are involved, as is the case for EdU and pH3 (see diffuse and punctate patterns in Fig 2). To improve accuracy, we tested a deep learning approach for nuclear marker identification. As the goal is not to identify regions but to make a decision for each nucleus regarding the presence/absence of an E2F or the diffuse/punctate/absent status of EdU or pH3, the instance segmentation and U-Net approaches are not suitable. In contrast, with an input defined as an image patch centered on each nucleus, the Inception-V3 architecture is appropriate for deciding whether a marker is absent or present, and whether it is present in a diffuse or punctate state. In addition to data augmentation, we also define a so-called pixel-based training dataset (as opposed to a nucleus-based training dataset) that includes the input patches centered at each pixel belonging to the nuclei (see Materials and methods). This strategy has a similar effect to data augmentation, as it drastically increases the size of the training dataset. Although DAPI staining is different from the nuclear markers used to identify the E2Fs, EdU and pH3, the images are acquired simultaneously, so the image features captured by the Inception-V3 method for nuclear segmentation are potentially meaningful for identifying nuclear markers. Consequently, we also perform transfer learning from the nuclear segmentation (see Materials and methods). To easily set the threshold for marker identification, we designed an interface in the Annotater that we used to obtain the results shown in Fig 2 for manual thresholding.
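To make this classification setup concrete, the following sketch shows one plausible Keras implementation of an Inception-V3 patch classifier of the kind described above. The patch size, channel count, class labels and optimizer settings are illustrative assumptions rather than the configuration used in this work, which is detailed in the Materials and methods and in the associated repository.

```python
# Minimal Keras sketch of the patch-classification idea: an Inception-V3
# backbone deciding, for a patch centered on one nucleus, whether a marker is
# absent, diffuse, or punctate. Patch size, channel layout, and class count
# are illustrative assumptions; the paper's exact training configuration
# (pixel-based patch sampling, transfer learning from the segmentation
# network) is described in its Materials and methods.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

NUM_CLASSES = 3              # e.g., absent / diffuse / punctate (assumed)
PATCH_SHAPE = (128, 128, 3)  # patch centered on a nucleus (assumed size)

backbone = InceptionV3(include_top=False, weights=None, input_shape=PATCH_SHAPE)

model = models.Sequential([
    backbone,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Training would then feed (patch, one-hot label) pairs; in the pixel-based
# variant described in the text, one patch is extracted per nuclear pixel
# rather than one per nucleus, which greatly enlarges the training set.
```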


Fig 2. Comparison between Inception-V3 and manual thresholding for marker identification. (a-b) Top rows (images): examples of E2F3A-, E2F8- and E2F4-positive and -negative nuclei, as well as EdU- and pH3-negative, diffuse and punctate nuclei in (a) confocal and (b) widefield images. Bottom rows (bar graphs): accuracy obtained with the Inception-V3 and manual thresholding approaches for marker identification of E2F3A, E2F8, E2F4, EdU and pH3 in (a) confocal and (b) widefield images. https://doi.org/10.1371/journal.pcbi.1009949.g002

As shown in Fig 2, compared to manual thresholding, the Inception-V3 approach provides better performance for each marker in both modalities. Manual thresholding achieves a relatively good performance for E2F3A and E2F8 identification in both modalities, and for pH3 in confocal images. The latter might appear surprising, but the two pH3 patterns in confocal images are distinct enough for a strategy based on the thresholded area within nuclei (see Materials and methods) to give satisfying results (Fig 2a). The results for E2F4 (Fig 2b) are not as good: because E2F4 is also cytoplasmic, the extra-nuclear fluorescence confounds the thresholding decision. Finally, thresholding clearly fails to distinguish the two patterns (diffuse, punctate) of EdU in images from both modalities, as well as the patterns of pH3 in widefield images. In contrast, the Inception-V3 approach yields accuracies greater than 90% for all markers except E2F4 (89%) (Fig 2). As shown in S7 Table, the use of the pixel-based training datasets does not significantly improve marker identification for confocal images, but it does improve performance on the E2F3A, E2F4 and pH3 markers in widefield images (see S8 Table). Additionally, transfer learning from the nuclear segmentation slightly improves the results for all markers in both modalities. Computation (after training) is fast because only one decision is made per nucleus (S9 Table). Overall, this study demonstrates the remarkable accuracy of the deep learning-based approach in identifying cells that are positive for the tested nuclear markers, not only when the marker is both nuclear and cytoplasmic, but also when it exhibits different labeling patterns, such as diffuse and punctate. Given such a high level of accuracy, correcting the results with the Annotater plugin takes only a short time.
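For comparison, the manual-thresholding baseline can be summarized as an area-based decision rule of the kind sketched below. The threshold values and area fractions are placeholders chosen for illustration, not the settings used to produce Fig 2; the actual baseline procedure is described in the Materials and methods.

```python
# Hedged sketch of a manual-thresholding baseline of the kind the Inception-V3
# model is compared against: a nucleus is called positive when enough of its
# area lies above an intensity threshold, and (for EdU/pH3) a coarse area
# criterion separates diffuse from punctate staining. All numeric values are
# placeholders, not the values used in the paper.
import numpy as np

def classify_nucleus(marker_pixels: np.ndarray,
                     intensity_threshold: float,
                     positive_area_fraction: float = 0.1,
                     diffuse_area_fraction: float = 0.5) -> str:
    """Classify one nucleus from the marker intensities of its pixels."""
    above = marker_pixels > intensity_threshold
    fraction_above = above.mean()
    if fraction_above < positive_area_fraction:
        return "negative"
    # Large thresholded area -> staining covers most of the nucleus (diffuse);
    # small thresholded area -> staining restricted to foci (punctate).
    return "diffuse" if fraction_above >= diffuse_area_fraction else "punctate"

# Example with synthetic pixel intensities for one nucleus:
rng = np.random.default_rng(2)
pixels = rng.normal(loc=50, scale=10, size=400)
print(classify_nucleus(pixels, intensity_threshold=70.0))
```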


[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009949


