(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .
An interactive deep learning-based approach reveals mitochondrial cristae topologies [1]
['Shogo Suga', 'Department Of Chemistry', 'Biotechnology', 'School Of Engineering', 'The University Of Tokyo', 'Tokyo', 'Koki Nakamura', 'Yu Nakanishi', 'Bruno M. Humbel', 'Imaging Section']
Date: 2023-09
The convolution of membranes called cristae is a critical structural and functional feature of mitochondria. Crista structure is highly diverse between different cell types, reflecting its role in metabolic adaptation. However, resolving the precise three-dimensional (3D) arrangement of cristae requires volumetric analysis of serial electron microscopy images, which has limited unbiased quantitative assessment. Here, we developed a novel, publicly available, deep learning (DL)-based image analysis platform called Python-based human-in-the-loop workflow (PHILOW), implemented with a human-in-the-loop (HITL) algorithm. Analysis of dense, large, and isotropic volumes of focused ion beam-scanning electron microscopy (FIB-SEM) using PHILOW reveals the complex 3D nanostructure of both inner and outer mitochondrial membranes and provides deep, quantitative, structural features of cristae in a large number of individual mitochondria. This nanometer-scale analysis in micrometer-scale cellular contexts uncovers fundamental parameters of cristae, such as total surface area, orientation, tubular/lamellar cristae ratio, and crista junction density in individual mitochondria. Unbiased clustering analysis of our structural data unraveled a new function for the dynamin-related GTPase Optic Atrophy 1 (OPA1) in regulating the balance between lamellar versus tubular cristae subdomains.
Funding: This work was supported by the following financial sources: the Japan Society for the Promotion of Science (https://www.jsps.go.jp/english/index.html) KAKENHI under Grant Number 20H04898 (Y.H.), the Japan Agency for Medical Research and Development (https://www.amed.go.jp/index.html) under Grant Number JP19dm0207082 (Y.H.), the Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS) from AMED under Grant Numbers 19am0101116j0003 (B.M.H.) and 20am0101116j0004 (B.M.H.), the Japan Society for the Promotion of Science KAKENHI under Grant Numbers 22J23099 (K.N.), 22J23115 (S.S.), 21K19253 (Y.H.), and 20K22622 (H.K.), a SECOM Science and Technology Foundation Research Grant (https://www.secomzaidan.jp/) (Y.H.), an Uehara Memorial Foundation Research Grant (https://www.ueharazaidan.or.jp/) (Y.H.), and Chan Zuckerberg Initiative napari Ecosystem Grants (https://chanzuckerberg.com/science/programs-resources/imaging/napari/) (H.K.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Data Availability: All raw data for quantification summary of the electron microscopy images are within the paper and its Supporting Information files. All the original electron microscopy images are available from the following deposit site: https://www.ebi.ac.uk/empiar/EMPIAR-11449/. All original code has been deposited at https://github.com/neurobiology-ut/PHILOW and is publicly available as of the date of publication. Plasmids encoding generated in this study have been deposited to Addgene (plasmid numbers 206316 and 206319).
Optimizing the metabolic capacity of mitochondria requires convolution of the inner mitochondrial membrane (IMM) into cristae [1]. Cristae are specialized IMM infoldings where the entire set of 5 macromolecular complexes underlying the electron transport chain (ETC) and oxidative phosphorylation (OxPhos) is localized [2,3]. Since the narrow compartment of cristae represents a significant bottleneck for the diffusion of ions and metabolites, proper control of crista structure is important for metabolic homeostasis [4–7]. Despite their critical importance, quantitative and unbiased measurements of the structural features of IMM and cristae still represent a major roadblock for the study of mitochondrial biology.
Transmission electron microscopy (TEM), especially electron tomography (ET), has played pivotal roles in describing the ultrastructural features of IMM organization because of its unparalleled resolution. It revealed that cristae can adopt either a flat structure (lamellar cristae) or a tube-like structure (tubular cristae) [8,9]. For example, mitochondria in cardiomyocytes have densely packed lamellar cristae, while neuronal mitochondria are largely composed of tubular cristae. However, the nanometer-size volume of ET was not sufficient for resolving the entire crista structure in micrometer-scale individual mitochondria. To fill this gap in scale between crista structure and the organelle itself, techniques combining serial sectioning and scanning EM (serial scanning EM, sSEM; SEM tomography) have been developed. In particular, focused ion beam-scanning electron microscopy (FIB-SEM) lends itself well to the investigation of ultrastructural features in 3D because of its isotropic X-Y-Z resolution in the low nanometer range [10–14]. In addition, FIB-SEM covers volumes tens of micrometers in scale.
The increased throughput of image acquisition in FIB-SEM and related 3D EM approaches brings an unsolved challenge: the limited throughput and accuracy of image segmentation. Because EM visualizes virtually all membrane structures in grayscale, segmenting structures of interest is an indispensable and highly challenging process. In most image analyses, defining thresholds for differentiating an object of interest from background or other structures is an essential step. Although computer-based image analyses have expanded the parameters available for defining objects, it has been challenging to determine proper thresholding values for extracting an object of interest. Even with traditional machine learning, it remains challenging to define thresholds for segmenting objects from images with diverse textures, especially those with complex 3D structures. Recent advances in the application of deep learning (DL), which does not require thresholding for image classification, object detection, and pixel segmentation, are expected to transform image analysis in biological studies with versatility and efficiency [15]. However, while image processing speed has increased, laborious manual processing, such as generating training data, proofreading prediction results, and converting files for processing images across multiple software programs, hampers the application of DL to actual biological questions.
To provide a general solution to this problem, we developed a new integrated platform called Python-based human-in-the-loop workflow (PHILOW), equipped with active learning for efficient iterative training data generation and a new 3D structure prediction algorithm using 2D training datasets. Implementation of this platform drastically reduced the amount of human labor required for segmentation while increasing the precision of segmentation. This allowed high-throughput cristae analysis at nanometer resolution at the micrometer scale. We successfully reconstructed a comprehensive structure of mitochondria and cristae from 135 control and 324 OPA1-deficient mitochondria. This unprecedented nanoscale ultrastructural analysis in a cellular context determined the total surface area, orientation, and spatial arrangements of individual mitochondria. It also revealed a previously unappreciated abundance of tubular crista structures in a mouse fibroblast cell line. Using this comprehensive structural analysis, we also revealed novel roles for OPA1, a protein best characterized for its role in IMM fusion, in the regulation of crista ultrastructural features.
Results
Three-axes prediction (TAP) reduces section-to-section inconsistency of inference
When mitochondrial structures were predicted by a conventional 2D UNet++ model [17], the prediction yielded an F1 score mostly under 0.94. To improve the F1 score, we first aimed to reduce section-to-section inconsistency, which is ascribed to inaccurate predictions at the boundaries of mitochondria (S1 Fig). We reasoned that inference from a side view of the image volume would decrease the inconsistency and thus developed a three-axes prediction (TAP) method (Fig 2A). With this method, a prediction model trained with an xy-plane dataset is applied to virtual yz- and zx-planes. In particular, when an image volume consists of isotropic voxels, a model generated from xy-planes is directly applied to the 2 additional planes, producing intermediate predictions from 3 axes. The final prediction at each voxel is determined by majority vote of the intermediate predictions. When TAP was applied to segment mitochondria from a 10 × 10 × 10 nm isotropic volume (309 × 732 × 1,554 voxels; 10 of the 309 slices were used for training and all 309 slices were used for testing), the variance of intersection over union (IoU) between neighboring xy-slices of inferences was, strikingly, 12 times lower than that without TAP (0.0080 with TAP and 0.0955 without) and 3 times lower than that of manual segmentation (0.0233) (Fig 2B and S2 Data). This showed that boundaries determined by the TAP method are more consistent along serial images and even surpassed those drawn by the human expert (Fig 2C–2E). As a result of introducing TAP, the F1 score was increased to the level we aimed for (0.96 with TAP, 0.94 without TAP).
Fig 2. TAP method enables a precise segmentation at the periphery of mitochondria. (A) A diagram explaining the TAP method. A model trained with sparse annotation on xy-plane images was applied to the virtual yz (red)- and zx (green)-plane images as well as xy (blue)-plane images. Therefore, each voxel has 3 prediction results from 3 axes. An example of 3 predictions (green, red, and blue) visualized from 3 axes is shown in the middle. A majority vote was taken from the 3 predictions, and the result was adopted as the true value. (B) The IoU overlap with the next slice was measured for mitochondria segmentation of each slice from manual segmentation (black) or in the prediction either using the TAP + DL baseline (red) or only DL baseline (pink). The variances of the IoU are shown in the bottom table. Source data can be found in S2 Data. (C, D) 3D reconstructions of a part of mitochondria segmented by a human (C) and TAP method (D). (E) An overlay of cross-sectional views at the zx-plane (blue plane in C, D). Note that the human annotations missed the right-top area and were anomalous at the left. Scale bar, 100 nm. The raw EM data are deposited in the EMPIAR (EMPIAR-11449). EMPIAR, Electron Microscopy Public Image Archive; EM, electron microscopy; TAP, three-axes prediction; 3D, three-dimensional; IoU, intersection over union; DL, deep learning.
https://doi.org/10.1371/journal.pbio.3002246.g002
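The majority vote described for TAP can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the PHILOW implementation: `predict_2d` stands for any trained 2D model that returns a binary mask for one image, and the function names `predict_slices`, `three_axes_prediction`, and `toy_predictor` are hypothetical. The sketch also assumes isotropic voxels, so the same 2D model can be applied along every axis.

```python
import numpy as np

def predict_slices(volume, predict_2d, axis):
    """Apply a 2D predictor to every slice taken along `axis`, then restack."""
    moved = np.moveaxis(volume, axis, 0)              # slicing axis goes first
    pred = np.stack([predict_2d(s) for s in moved], axis=0)
    return np.moveaxis(pred, 0, axis)                 # restore the original layout

def three_axes_prediction(volume, predict_2d):
    """Binary prediction per voxel by majority vote over the three axes."""
    votes = sum(
        predict_slices(volume, predict_2d, axis).astype(np.uint8)
        for axis in range(3)
    )
    return votes >= 2                                 # at least 2 of 3 agree

def toy_predictor(image):
    """Hypothetical stand-in for a trained 2D model: a plain threshold."""
    return image > 0.5

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.random((4, 5, 6))
    mask = three_axes_prediction(volume, toy_predictor)
    print(mask.shape, int(mask.sum()))
```

Because each voxel receives exactly 3 votes, a tie is impossible, and a boundary voxel mislabeled in only one viewing direction is overruled by the other two.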
Efficient choice of training areas by human-in-the-loop iterative method
Among current limitations in generating high-performance DL models is the difficulty of obtaining training datasets that cover the highly diverse features representing objects of interest. Although drawing support tools have accelerated manual segmentation [18], the choice of areas for training datasets has been random, which leads to redundant annotations of similar structures and neglect of rare structures. To collect training data covering comprehensive features efficiently, we developed a DL scheme employing human-in-the-loop (HITL) and pixel-level active learning (Fig 3A). To test the HITL scheme, we segmented mitochondria from 248 × 1,147 × 960 voxel FIB-SEM images of NIH3T3 cells at 10 × 10 × 10 nm per voxel. After the first round of learning with 3 manually segmented training datasets, the DL model successfully segmented structures with major mitochondrial features but failed to segment rare features (first prediction, Fig 3B). Subsequently, 3 sections with suboptimal prediction accuracy were identified and their predictions were manually corrected. The corrected sections were then added to the initial 3 training sections for retraining the model (second prediction, Fig 3B). Another round of iteration with 4 more areas (a total of 10 out of 248) generated a DL model covering mitochondrial features highly comprehensively. The F1 score against the ground truth (GT) was calculated for the predicted results for entire slices, including the slices used as the training data. It increased steadily over the 3 iterative cycles (0.906 after the first cycle and 0.924 after the second cycle) and reached 0.97, whereas a model generated from 10 random training areas exhibited a much lower F1 score (0.909).
Among other advantages, this high F1 score reflects the appropriate separation of 2 mitochondria that are often too closely apposed to be separated without specialized post-processing in previously reported automated mitochondrial segmentation, or with few HITL cycles (Slice #4 in S1B Fig). These results suggest that the HITL-assisted choice of image areas for generating training datasets significantly increases the performance of models without increasing the volume of training data.
Fig 3. The HITL-TAP method on PHILOW improved the segmentation efficiency. (A) A diagram showing the HITL iterative workflow. The green rectangle indicates processes performed iteratively with human intervention. (B) Comparison of F1 scores between the HITL-mediated iterative learning and conventional DL. Crops of raw image only (left) or overlaid with predictions for mitochondria (magenta) are shown. The F1 score of each crop is shown at the right bottom. In the HITL-mediated iterative learning, annotations of 3 randomly picked areas were used as an initial training dataset. After the first prediction, the annotations on 3 image crops, including image (i), were corrected and combined with the initial training dataset. F1 scores of the second prediction with this new training dataset were improved not only in image (i) but also in images (ii)–(iv). Four image crops including image (ii) were corrected after the second prediction and combined with the training dataset for the second prediction. After these cycles, the F1 scores were above 0.98 in (i)–(iv). In contrast, without HITL, even with the same number of training datasets, the F1 scores reached only 0.81–0.97. Magnifications of the areas marked with orange rectangles are shown in the bottom. Low confidence areas were highlighted in green to draw the attention of the annotators for the manual correction. (C) Times required for correcting the mitochondrial prediction results obtained either inside (0–100 Mvoxel) or outside (100–150 Mvoxel) of the volume used for generating the training data by the indicated methods. Magenta: HITL + TAP + DL baseline learning. Green: HITL + DL baseline learning. Dark blue: DL baseline (2D UNet++) only. Gray: without DL (Manual). The speed was calculated from the actual time required for correcting 150 Mvoxel (HITL + TAP + DL baseline, HITL + DL baseline, and DL baseline) or the time estimated from 0.3 Mvoxel of manual correction (Manual). Bars below the graph show the time required for making the training datasets. Numbers below each line indicate voxels visually inspected and corrected in 1 min (Mvoxel/minute). Source data can be found in S3 Data. (D) Representative 3D mitochondrial structures reconstructed from segmentations generated using HITL-TAP method on PHILOW. Scale bar, 500 nm. The raw EM data are deposited in the EMPIAR (EMPIAR-11449). EMPIAR, Electron Microscopy Public Image Archive; EM, electron microscopy; HITL, human-in-the-loop; TAP, three-axes prediction; 3D, three-dimensional; DL, deep learning; 2D, two-dimensional; PHILOW, Python-based human-in-the-loop workflow.
https://doi.org/10.1371/journal.pbio.3002246.g003
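One way to picture the selection step in the HITL loop is to rank slices by how much of their area the model is unsure about, then hand the top-ranked slices to the annotator. The sketch below is an assumption-laden illustration rather than the authors' procedure: the text does not specify how confidence is computed, so the rule used here (a foreground probability within `margin` of 0.5 counts as low confidence), the default values, and the function names `low_confidence_mask` and `pick_slices_to_correct` are all hypothetical.

```python
import numpy as np

def low_confidence_mask(prob, margin=0.2):
    """Mark pixels whose foreground probability is close to 0.5.

    `margin` is an illustrative choice, not a value from the paper.
    """
    return np.abs(prob - 0.5) < margin

def pick_slices_to_correct(prob, k=3, margin=0.2):
    """Return indices of the k slices with the largest low-confidence fraction.

    `prob` is a (z, y, x) volume of per-pixel foreground probabilities.
    """
    uncertain = low_confidence_mask(prob, margin)
    fraction = uncertain.reshape(prob.shape[0], -1).mean(axis=1)
    order = np.argsort(-fraction, kind="stable")      # most uncertain first
    return order[:k].tolist()

if __name__ == "__main__":
    prob = np.full((5, 4, 4), 0.95)    # confident everywhere
    prob[3] = 0.5                      # one fully ambiguous slice
    prob[1, :2] = 0.5                  # one half-ambiguous slice
    print(pick_slices_to_correct(prob, k=2))
```

The same mask could drive a highlight overlay, similar in spirit to the green low-confidence areas shown to annotators in Fig 3B.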
Improvement of mitochondrial segmentation by the HITL approach on PHILOW
A roadblock for implementing the HITL-iterative scheme was a complicated set of file format conversions, export and import cycles of data across different applications, and the accompanying file management. To circumvent this, we developed an open-source integrated analysis platform called PHILOW, equipped with a seamless graphical user interface (GUI) environment for annotation assistance, visualization, data management, model training, inference, and manual correction functions (S2 Fig). In addition, through cloud-based model training and prediction, PHILOW allows DL-based analyses to be introduced easily without purchasing GPUs and associated software. PHILOW is available on GitHub (https://github.com/neurobiology-ut/PHILOW) and is seamlessly incorporated into napari as a plugin. To evaluate the effect of seamless HITL implementation using PHILOW, we measured the total work time required for a complete 3D reconstruction of mitochondrial structures from 100 Mvoxel of 10 nm isotropic sSEM images (Fig 3C and S3 Data). Based on the time required for manual tracing of 0.9 Mvoxels, an estimated 3,546 min of human work would be required for a 100 Mvoxel manual reconstruction (Fig 3C and S3 Data). Application of a conventional DL algorithm, a 2D UNet++ model trained with randomly selected areas (DL baseline), reduced the work time to 358 min (54 min for tracing of training data and 304 min of final correction). Strikingly, 3 iterative cycles of HITL (HITL + DL baseline) reduced the work time to 149 min (38 min for tracing of training data and 111 min of final correction). When the analysis was extended to neighboring image blocks, the efficiency was still better using HITL (0.84 Mvoxel/minute, compared to 0.25 Mvoxel/minute with the DL baseline only). A combination of TAP and HITL on the DL baseline (HITL-TAP) further reduced the work time (34 min for tracing of training data and 56 min of final correction) (Fig 3C and S3 Data). The same reduction of time required for segmentation with HITL and TAP was observed for another human annotator (S3 Fig and S14 Data). Therefore, as the database becomes larger, the total time required is 9 to 10 times shorter. These results demonstrate that PHILOW-mediated seamless 3D prediction and correction cycles significantly increased the efficiency of 3D reconstructions of mitochondrial structures from sSEM images by reducing the time spent on training data generation and proofreading (Fig 3D).
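The work times quoted above can be tallied with a few lines of arithmetic. The input numbers are taken from the text; the 90 min total for HITL-TAP and the ratios relative to manual reconstruction are derived here and are not stated by the authors.

```python
# Work times in minutes as reported in the text:
# (tracing of training data, final correction) for 100 Mvoxel.
reported = {
    "DL baseline": (54, 304),
    "HITL + DL baseline": (38, 111),
    "HITL-TAP": (34, 56),
}
manual_total = 3546  # estimated manual reconstruction time for 100 Mvoxel

# Total human work time per method.
totals = {name: tracing + correction
          for name, (tracing, correction) in reported.items()}

# Ratio of the manual estimate to each method's total (derived, not reported).
speedup = {name: manual_total / total for name, total in totals.items()}

for name in reported:
    print(f"{name}: {totals[name]} min, {speedup[name]:.1f}x less than manual")
```

The sums reproduce the totals given in the text for the DL baseline (358 min) and HITL + DL baseline (149 min).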
The HITL-TAP method enables precise segmentation of lamellar and tubular cristae
Although crista structures have been proposed to dictate functions of the mitochondria, quantitative investigation of crista structure has been mostly limited to 2D or thin sections for ET. Several previous studies manually reconstructed three-dimensional (3D) crista structures, but these were mostly limited to lamellar structures or simple cristae of algae and trypanosomes from small numbers of mitochondria, since manual reconstruction is highly laborious, especially for tortuous and thin tubular structures [13,14,19–21]. Therefore, we tested whether HITL-TAP on PHILOW can efficiently segment crista structures, including tubular structures, from serial FIB-SEM images. We segmented the inner structures of 24 mitochondria in a total of 1,500 μm³ of 10 nm isotropic sSEM images by a single iterative HITL-TAP cycle that took a total of 81 min of human work time (Fig 4A and S1 Movie). After removing objects smaller than 20 voxels as misannotations, HITL-TAP provided us with highly accurate segmentations of cristae. To examine whether further proofreading by a human annotator was required, similarity between the predicted data and manual segmentations by 2 independent experts was examined by calculating F1 scores (Fig 4B). HITL iterations were performed on a 93 × 109 × 115 voxel volume, and a total of 7 of the 93 slices were used to train the model to predict all 93 slices. For lamellar cristae, F1 scores between the prediction and the manual segmentations were comparable to that between the 2 manual segmentations (Fig 4B and 4C). Further, the precision score was even higher than those of the manual segmentations (Fig 4C). These data suggest that the variances between the prediction and the manual segmentations are within the range of annotator-to-annotator variance.
A post hoc visual inspection in 3D (from yz- and zx-planes) found that the low precision score of manual segmentations was due to false-positive segmentations in marginal areas (Figs 4D, 4E, and S4A–S4E). Together with the smoother surface of the prediction, as represented by smaller IoU variances in xy-planes (Fig 4F and S4 Data), we conclude that lamellar cristae segmentation by the HITL-TAP method has superior reliability compared to human experts. Of note, the prediction by the HITL-TAP method included a significant portion of tubular cristae that both human experts missed, as represented by the low recall scores (0.399 and 0.539) of the manual segmentations against the prediction (Fig 4C). The post hoc 3D visual inspection confirmed that segmentation of tubular cristae was indeed more efficient than manual segmentation (Figs 4G–4J and S4B and S4C, arrowheads and S2 Movie). These results show that our HITL-TAP method achieved superhuman accuracy in segmenting both lamellar and tubular crista structures and did not require further human proofreading.
Fig 4. Prediction of crista structures with superhuman accuracy. (A) Representative crista structures in the mitochondria shown in Fig 3D. Yellow: lamellar structure, Cyan: tubular structure. Scale bar, 500 nm. Three random images of 24 mitochondria were prepared as initial training data for both lamellar and tubular cristae prediction. (B) Comparison of F1 scores among an HITL-TAP prediction and annotations by 2 human experts. Corresponding 3D reconstructed lamellar images are shown. The F1 scores between the HITL-TAP prediction and the annotations by human experts #1 and #2 are 0.787 and 0.817, respectively. These values are higher than the score between human experts (0.780). (C) IoU, F1 score, and precision/recall on tubular structures or lamellar structures between the prediction by the HITL-TAP algorithm and one of the 2 annotations by human experts. (D, E) Segmentation of lamellar structures by HITL-TAP algorithm and human annotators shown in xy-planes (D) and yz-planes (E). The lower panels show areas indicated by rectangles in the upper panels. Yellow: segmentations of lamellar structures. Note that segmentations by human annotators are broader. The narrower HITL-TAP segmentation is more accurate at the voxels highlighted with cyan, judging from the continuity in the z-axis (E). (F) Z-axis continuities of lamellar structures segmented by HITL-TAP algorithm (red), human annotator #1 (black), and human annotator #2 (gray) are indicated by IoU between neighboring xy-planes. Source data can be found in S4 Data. (G) Segmentation of crista structures by HITL-TAP algorithm (transparent magenta) and a human annotator (green). The rectangle shows the area shown in (H). (H) Reconstruction of tubular cristae is shown with a slice of serial EM images. Note that the human annotation missed the structures that the HITL-TAP algorithm segmented (arrowheads). (I, J) yz-view EM images corresponding to the slice in (H) annotated by a human annotator (I) and HITL-TAP algorithm (J). Arrowheads correspond to those pointing to the tubular structures in (H). The raw EM data are deposited in the EMPIAR (EMPIAR-11449). EMPIAR, Electron Microscopy Public Image Archive; HITL, human-in-the-loop; TAP, three-axes prediction; 3D, three-dimensional; IoU, intersection over union; EM, electron microscopy.
https://doi.org/10.1371/journal.pbio.3002246.g004
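The overlap scores used throughout this comparison follow standard definitions, which can be written out for binary masks as below. This is a generic sketch, not code from PHILOW: the function names `overlap_scores` and `neighbor_slice_iou` are hypothetical, the handling of empty masks is an assumption made here, and the text does not say whether the reported IoU variances are population or sample variances.

```python
import numpy as np

def overlap_scores(pred, ref):
    """Voxel-wise precision, recall, F1, and IoU of `pred` scored against `ref`."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    tp = int(np.sum(pred & ref))     # labeled in both masks
    fp = int(np.sum(pred & ~ref))    # labeled only in pred
    fn = int(np.sum(~pred & ref))    # labeled only in ref
    union = tp + fp + fn
    # Assumption: two empty masks count as a perfect match for F1 and IoU.
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
        "iou": tp / union if union else 1.0,
    }

def neighbor_slice_iou(mask):
    """IoU between each xy-slice and the next one along the first (z) axis."""
    mask = np.asarray(mask, dtype=bool)
    return np.array([
        overlap_scores(mask[z], mask[z + 1])["iou"]
        for z in range(mask.shape[0] - 1)
    ])

if __name__ == "__main__":
    scores = overlap_scores([1, 1, 0, 0], [1, 0, 1, 0])
    print(scores)
    stack = np.ones((3, 2, 2), dtype=bool)
    print(np.var(neighbor_slice_iou(stack)))   # population variance by default
```

Which mask is passed as `ref` matters for precision and recall: scoring a manual segmentation against the prediction, as in the recall values quoted above, treats the prediction as the reference.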
3D reconstruction of crista junctions from FIB-SEM images
The regulation of the crista junction (CJ), a narrow attachment point of a crista to the inner boundary membrane (IBM), is critical for crista morphogenesis (Fig 5A). To investigate the structure of the CJ, methods for visualizing and quantifying it in 3D are required. First, we examined whether the terminus of cristae contiguous with the mitochondrial surface in the FIB-SEM images could be defined as a CJ. Since it was reported that the outer mitochondrial membrane (OMM) and IBM come closer together in the area juxtaposed with a CJ, we tested whether this is the case for the putative CJs defined as above. Although the resolution of FIB-SEM in this study was not sufficient for differentiating OMM and IBM, we defined the distance between them as the thickness of the membrane structure surrounding the mitochondria (S5A Fig and S15 Data). Strikingly, among 100 randomly selected putative CJs, 86% displayed a narrowing in the distance between the OMM and IBM on at least 1 side (S5B Fig). This suggests that contiguity with the surface of the mitochondria is a reliable indicator for defining the CJ. Therefore, we next tested whether CJs can be defined based on the HITL-TAP-based cristae segmentation. First, the termini of cristae within 30 nm of the mitochondrial surface were detected as CJ candidates. Then, CJs were selected from the candidates by examining their contiguity with the corresponding mitochondrial surfaces (see Methods). By repeating this for CJ candidates within 40 and 50 nm of the mitochondrial surface, we determined CJs (Fig 5B and 5C and S3 Movie). Visual inspection of those CJs mapped on the reconstructed cristae confirmed that this method precisely and comprehensively detects the CJs that a human annotator assigned as CJs. The magnified reconstructions reveal that CJs of lamellar cristae are elongated and linked, encompassing a considerable portion of the lamellar cristae boundary (Fig 5C). In contrast, the CJs of tubular cristae are situated in discrete clusters. These results indicate that the 3D reconstruction of FIB-SEM images by the HITL-TAP method is applicable for efficient detection of CJs.
Fig 5. 3D structure of crista junctions reconstructed from FIB-SEM images. (A) Diagram of mitochondrial subdomains. (B) 3D reconstructions of CJs (magenta) overlaid with tubular (cyan) or lamellar (yellow) cristae. Scale bars, 100 nm. (C) Magnified images of 3D reconstructed CJs and cristae. Scale bars, 100 nm. The raw EM data are deposited in the EMPIAR (EMPIAR-11449). CJ, crista junction; EMPIAR, Electron Microscopy Public Image Archive; EM, electron microscopy; FIB-SEM, focused ion beam-scanning electron microscopy.
https://doi.org/10.1371/journal.pbio.3002246.g005
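The first step of the CJ detection, finding crista voxels that lie within a given distance of the mitochondrial surface, can be illustrated with a simple shell computation. This is a rough sketch under several assumptions and is not the procedure from the Methods, which is not reproduced here: distance is approximated by repeated 6-connected erosion (a city-block distance in whole voxels), the 10 nm voxel size is taken from the imaging description, the contiguity-based selection of CJs from the candidates is omitted, and the function names `erode_once`, `surface_shell`, and `cj_candidates` are hypothetical.

```python
import numpy as np

def erode_once(mask):
    """6-connected binary erosion by one voxel (outside the array is background)."""
    padded = np.pad(mask, 1, constant_values=False)
    out = padded.copy()
    for axis in range(3):
        # A voxel survives only if both neighbors along this axis are foreground.
        out &= np.roll(padded, 1, axis) & np.roll(padded, -1, axis)
    return out[1:-1, 1:-1, 1:-1]

def surface_shell(mito, depth_voxels):
    """Voxels of `mito` lying within `depth_voxels` erosion steps of its surface."""
    core = mito.copy()
    for _ in range(depth_voxels):
        core = erode_once(core)
    return mito & ~core

def cj_candidates(mito, cristae, max_dist_nm=30, voxel_nm=10):
    """Crista voxels inside the surface shell: candidates only, not final CJs."""
    depth = int(round(max_dist_nm / voxel_nm))
    return cristae & surface_shell(mito, depth)

if __name__ == "__main__":
    mito = np.zeros((11, 11, 11), dtype=bool)
    mito[1:10, 1:10, 1:10] = True          # toy cubic "mitochondrion"
    cristae = np.zeros_like(mito)
    cristae[1:10, 5, 5] = True             # toy crista spanning the cube
    for limit_nm in (30, 40, 50):          # widening search, as in the text
        print(limit_nm, int(cj_candidates(mito, cristae, limit_nm).sum()))
```

A Euclidean distance transform would give a more faithful distance than the erosion used here; the erosion keeps the sketch dependent on NumPy alone.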
[END]
---
[1] Url:
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002246
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
via Magical.Fish Gopher News Feeds:
gopher://magical.fish/1/feeds/news/plosone/