


The hair cell analysis toolbox is a precise and fully automated pipeline for whole cochlea hair cell quantification [1]

Christopher J. Buswinka; Mass Eye and Ear, Harvard Medical School, Boston, Massachusetts, United States of America; Speech and Hearing Bioscience and Technology Program

Date: 2023-03

Abstract
Our sense of hearing is mediated by sensory hair cells, precisely arranged and highly specialized cells subdivided into outer hair cells (OHCs) and inner hair cells (IHCs). Light microscopy tools allow for imaging of auditory hair cells along the full length of the cochlea, often yielding more data than is feasible to analyze manually. Currently, there are no widely applicable tools for fast, unsupervised, unbiased, and comprehensive image analysis of auditory hair cells that work well with either imaging datasets containing an entire cochlea or smaller sampled regions. Here, we present a highly accurate machine learning-based hair cell analysis toolbox (HCAT) for the comprehensive analysis of whole cochleae (or smaller regions of interest) across light microscopy imaging modalities and species. HCAT is software that automates common image analysis tasks such as counting hair cells, classifying them by subtype (IHCs versus OHCs), determining their best frequency based on their location along the cochlea, and generating cochleograms. These automated tools remove a considerable barrier in cochlear image analysis, allowing for faster, unbiased, and more comprehensive data analysis practices. Furthermore, HCAT can serve as a template for deep learning-based detection tasks in other types of biological tissue: with some training data, HCAT's core codebase can be trained to develop a custom deep learning detection model for any object in an image.

Citation: Buswinka CJ, Osgood RT, Simikyan RG, Rosenberg DB, Indzhykulian AA (2023) The hair cell analysis toolbox is a precise and fully automated pipeline for whole cochlea hair cell quantification. PLoS Biol 21(3): e3002041. https://doi.org/10.1371/journal.pbio.3002041

Academic Editor: Andy Groves, Baylor College of Medicine, UNITED STATES

Received: November 15, 2022; Accepted: February 17, 2023; Published: March 22, 2023

Copyright: © 2023 Buswinka et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The authors confirm that all data underlying the findings are fully available without restriction. All relevant data are within the paper and its Supporting Information files. All code is hosted on GitHub and available for download at https://github.com/indzhykulianlab/hcat, along with accompanying documentation at https://hcat.readthedocs.io/. The EPL cochlea frequency ImageJ plugin is available for download at https://www.masseyeandear.org/research/otolaryngology/eaton-peabody-laboratories/histology-core.

Funding: This work was supported by NIH R01DC020190 (NIDCD), R01DC017166 (NIDCD), and R01DC017166-04S1 "Administrative Supplement to Support Collaborations to Improve the AI/ML-Readiness of NIH-Supported Data" (Office of the Director, NIH) to A.A.I., and by the Speech and Hearing Bioscience and Technology Program Training grant T32 DC000038 (NIDCD). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Abbreviations: FP, false positive; GUI, graphical user interface; HCAT, hair cell analysis toolbox; IHC, inner hair cell; NMS, non-maximum suppression; OHC, outer hair cell; TN, true negative; TP, true positive

Introduction

The cochlea is the organ in the inner ear responsible for the detection of sound. It is tonotopically organized in an ascending spiral, with mechanosensitive sensory cells responding to high-frequency sounds at its base and low-frequency sounds at the apex. These mechanically sensitive cells of the cochlea, known as hair cells, are classified into 2 functional subtypes: outer hair cells (OHCs) that amplify sound vibrations and inner hair cells (IHCs) that convert these vibrations into neural signals [1]. Each hair cell carries a bundle of actin-rich microvillus-like protrusions called stereocilia. Hair cells are regularly organized into 1 row of IHCs and 3 (rarely 4) rows of OHCs within a sensory organ known as the organ of Corti [2]. The OHC stereocilia bundles are arranged in a characteristic V-shape and are composed of thinner stereocilia than those of IHCs. Hair cells are essential for hearing, and deafness phenotypes are often characterized by their histopathology using high-magnification microscopy.

The cochlea contains thousands of hair cells, organized over a large spatial area along the length of the organ of Corti. During histological analysis, each of these thousands of cells represents a datum that must be parsed from the image by hand ad nauseam. To make manual analysis tractable, it is common to disregard all but a small subset of cells, sampling large datasets at representative tonotopic locations (often referred to as the base, middle, and apex of the cochlea). To our knowledge, there are 2 existing automated hair cell counting algorithms to date, both developed for specific use cases, which largely limits their application for the wider hearing research community. One such algorithm, by Urata and colleagues [3], relies on the homogeneity of structure in the organ of Corti and fails when irregularities, such as 4 rows of OHCs, are present. It is worth noting, however, that their algorithm enables hair cell detection in 3D space, which may be critical for some applications [4]. Another algorithm, developed by Cortada and colleagues [5], does not differentiate between IHCs and OHCs. Thus, each was limited in its application, likely impeding widespread use [3,5]. The slow speed and tedium of manual analysis pose a significant barrier when faced with large datasets, be they whole cochleae analyzed in full rather than sampled at 3 regions, or datasets generated through high-throughput screening studies [6,7]. Furthermore, manual analyses can be fraught with user error, biases, sample-to-sample inconsistencies, and variability between individuals performing the analysis. These challenges highlight a need for unbiased, automated image analysis at the single-cell level across the entire frequency spectrum of hearing.

Over the past decade, considerable advancements have been made in deep learning approaches for object detection [8]. The predominant approach is Faster R-CNN [9], a deep learning algorithm that quickly recognizes the location and position of objects in an image. While originally designed for images collected by conventional means (a camera), the same architecture has been applied successfully to biomedical image analysis tasks [10–12]. This algorithm can be adapted and trained to perform such tasks orders of magnitude faster than manual analysis.
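For orientation, the sketch below shows how a generic torchvision Faster R-CNN might be configured for a two-class hair cell detection task (IHC versus OHC). It is an illustrative example under stated assumptions (torchvision ≥ 0.13, illustrative class indices), not the HCAT implementation.

    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    NUM_CLASSES = 3  # illustrative label scheme: 0 = background, 1 = OHC, 2 = IHC

    # Start from a generic pretrained detector and swap in a new predictor head
    # sized for the desired number of classes.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)
    model.eval()

    # A dummy 512 x 512 RGB-like tensor stands in for a projected confocal image.
    with torch.no_grad():
        prediction = model([torch.rand(3, 512, 512)])[0]

    # prediction["boxes"], prediction["labels"], and prediction["scores"] hold
    # the raw candidate detections before any confidence or NMS filtering.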
We have created machine learning-based analysis software that quickly and automatically detects each hair cell, determines its type (IHC versus OHC), and estimates each cell's best frequency based on its location along the cochlear coil. Here, we present a suite of tools for cochlear hair cell image analysis, the hair cell analysis toolbox (HCAT), consolidated software that enables fully unsupervised hair cell detection and cochleogram generation.
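As an illustration of best-frequency estimation from position, the sketch below evaluates a Greenwood-type place-frequency function. The constants shown are the commonly cited human values and are used only for illustration; HCAT relies on a species-appropriate (mouse) place-frequency map rather than these parameters.

    def best_frequency_hz(fraction_from_apex: float,
                          A: float = 165.4, a: float = 2.1, k: float = 0.88) -> float:
        """Greenwood-type map from relative cochlear position (0 = apex, 1 = base)
        to an estimated best frequency in Hz; the constants here are the human
        values, shown for illustration only."""
        return A * (10 ** (a * fraction_from_apex) - k)

    # Example: a cell located 30% of the distance from the apex toward the base.
    print(f"{best_frequency_hz(0.3):.0f} Hz")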

Discussion

Here, we present the first fully automated cochlear hair cell analysis pipeline for analyzing multiple micrographs of cochleae, quickly detecting and classifying hair cells. HCAT can analyze whole cochleae or individual regions and can be easily integrated into existing experimental workflows. While there have been previous attempts at automating this analysis, each was too limited in scope to achieve widespread application [3,5]. HCAT allows for unbiased, automated hair cell analysis with detection accuracy approaching that of human experts, at a speed so much faster that it remains desirable even with rare errors. Furthermore, we validated HCAT on data from various laboratories and found it to be accurate across different imaging modalities, staining protocols, ages, and species.

Deep learning-based detection infers information from the pixels of an image to make decisions about what objects are and where they are located. To this end, the information is devoid of any context. HCAT's deep learning detection model was trained largely using anti-MYO7A and phalloidin labels; however, the model can perform on specimens labeled with other markers, as long as they are visually similar to examples in our training data. For example, some of the validation images of cochlear hair cells sampled from published figures contained cell body labels other than MYO7A, such as Calbindin [32], Calcineurin [33], and p-AMPKα [34], while in other images, phalloidin staining of the stereocilia bundle was substituted with anti-espin labeling [35]. Although no images containing hair cell-specific nuclear markers, such as pou4f3 [36], were included in the pool of training data, HCAT performed reasonably well when tested on such images, especially when they also contained a bundle stain. Of higher importance is the quality of the imaging data: proper focus adjustment, high signal-to-noise ratio, adequate image resolution, and properly adjusted brightness and contrast settings. Furthermore, the quality of the training dataset greatly affects model performance; upon validation, HCAT performed slightly worse when evaluated on community-provided datasets because fewer representative examples were present in our training data. We will strive to periodically update our published model as new data arise, further improving performance over time. At present, HCAT has proven sufficiently accurate to consistently replicate major findings despite occasional discrepancies with manual analysis, even when used on datasets that were collected without any optimization for automated analysis. The strength of this software is in automation, allowing thousands of hair cells over the entire cochlear coil to be processed without human input.

Recent advancements in tissue-clearing techniques, which enable acquisition of the intact 3D architecture of the cochlear coil using confocal or two-photon laser scanning microscopy, will allow for further development of the HCAT tool as more such imaging data are made publicly available. Although no tissue-cleared data were used to develop HCAT, we tested it on a few published examples of tissue-cleared mouse and pig cochlear imaging data [3,4]. While HCAT showed reasonable hair cell detection rates, the tool did not perform as accurately as we report for high-resolution confocal imaging data, most likely because the tissue-cleared datasets were collected at lower resolution (0.65 to 0.99 μm/pix) and contained only anti-MYO7A fluorescence.
It is common for the population of missing cells, rather than absolute counts, to be reported in cell survival studies. We were unable to support missing cell detection or quantification in HCAT, as we found that images lack sufficiently robust information on the locations of missing cells to automate their detection consistently and accurately. In some cases, a distinctive "X-shaped" phalangeal scar may be seen in the sensory epithelium following hair cell loss [37,38], which may be sufficient to infer the presence of a missing cell; however, this scar is typically visible only with an actin stain or on scanning electron microscopy images, and not in the other pathologic cases HCAT attempts to support.

While the detection model was trained, and the cochlear path estimation designed, specifically for cochlear tissue, HCAT can serve as a template for deep learning-based detection tasks in other types of biological tissue in the future. While developing HCAT, we employed best practices in model training, data annotation, and augmentation. With minimal adjustment and a small amount of training data, one could adapt the core codebase of HCAT to train and apply a custom deep learning detection model for any object in an image. To our knowledge, this is the first whole-cochlea analysis pipeline capable of accurately and quickly detecting and classifying cochlear hair cells. HCAT enables expedited cochlear imaging data analysis while maintaining high accuracy. This highly accurate and unsupervised data analysis approach will both facilitate research and improve experimental rigor in the field.
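As a rough illustration of that adaptability, the sketch below fine-tunes a generic torchvision Faster R-CNN on a synthetic stand-in dataset, using the AdamW optimizer and cosine annealing with warm restarts described in Materials and methods. The dataset, class count, and epoch count are placeholders; this is a minimal sketch, not the HCAT training code.

    import torch
    import torchvision
    from torch.optim import AdamW
    from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts
    from torch.utils.data import Dataset, DataLoader

    class ToyBoxDataset(Dataset):
        """Synthetic stand-in for real annotations: random images, one box each."""
        def __len__(self):
            return 4
        def __getitem__(self, idx):
            image = torch.rand(3, 256, 256)
            target = {"boxes": torch.tensor([[40.0, 40.0, 120.0, 120.0]]),
                      "labels": torch.tensor([1])}
            return image, target

    # One foreground class plus background; no pretrained weights are downloaded.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights=None, weights_backbone=None, num_classes=2)
    optimizer = AdamW(model.parameters(), lr=1e-4)
    scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10_000)  # warm-restart period

    loader = DataLoader(ToyBoxDataset(), batch_size=2,
                        collate_fn=lambda batch: tuple(zip(*batch)))

    model.train()
    for epoch in range(2):  # a real run would train for many more epochs
        for images, targets in loader:
            loss_dict = model(list(images), list(targets))  # detection losses
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()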

Materials and methods

Preparation and imaging of in-house training data
Organs of Corti were dissected in 1 contiguous piece at P5 in Leibovitz's L-15 culture medium (21083–027, Thermo Fisher Scientific) and fixed in 4% formaldehyde for 1 h. The samples were permeabilized with 0.2% Triton-X for 30 min and blocked with 10% goat serum in calcium-free HBSS for 2 h. To visualize the hair cells, samples were labeled with an anti-Myosin 7A antibody (#25–6790, Proteus Biosciences, 1:400) and a goat anti-rabbit CF568 (Biotium) secondary antibody. Additionally, samples were labeled with phalloidin to visualize actin filaments (Biotium CF640R Phalloidin). Samples were then flattened into 1 turn, mounted on slides using the ProLong Diamond Antifade Mounting kit (P36965, Thermo Fisher Scientific), and imaged with a Leica SP8 confocal microscope (Leica Microsystems) using a 63×, 1.3 NA objective. Confocal Z-stacks of 512 × 512 pixel images with an effective pixel size of 288 nm were collected using the tiling functionality of the Leica LASX acquisition software and maximum intensity projected to form 2D images. All experiments were carried out in compliance with ethical regulations and approved by the Animal Care Committee of Massachusetts Eye and Ear.

Training data
Varied data are required for training generalizable deep learning models. In addition to imaging data collected in our lab, we sourced generous contributions from the larger hearing research community, drawn from previously reported [7,31,39–46] and, in some cases, unpublished studies. Bounding boxes for hair cells seen in maximum intensity projected z-stacks were manually annotated using the labelImg [47] software and saved as XML files. For whole-cochlea cell annotation, a "human in the loop" approach was taken: first evaluating the deep learning model on the entire cochlea, visually inspecting the result, then manually correcting errors. Our dataset contained examples from 3 different species, multiple ages, microscopy types, and experimental conditions. Only the images generated in-house contain an entire, intact cochlea. A summary of our training data is presented in Table 1.

Training procedure
The deep learning architectures were trained with the AdamW [48] optimizer, with a learning rate starting at 1 × 10−4 and decaying based on cosine annealing with warm restarts with a period of 10,000 epochs. When the number of training images is small, deep learning models tend to fail to generalize and instead "memorize" the training data. To avoid this, we made heavy use of image transformations that randomly add variability to the original set of training images and synthetically increase the variety of our training datasets [49] (S2 Fig).

Hyperparameter optimization
Eight manually annotated cochleae were evaluated with the Faster R-CNN detection algorithm without either rejection method (detection confidence thresholding or non-maximum suppression). A grid search was performed by breaking each threshold value into 100 steps from 0 to 1; each combination was applied to the resulting cell detections, reducing their number, and the true positive (TP), true negative (TN), and false positive (FP) rates were then calculated (S1D and S1E Fig). An accuracy metric, defined as TP minus both TN and FP, was calculated and averaged for each cochlea. The combination of values that produced the highest accuracy metric was then chosen as the default for the HCAT algorithm.
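To make the grid search concrete, the sketch below applies every combination of confidence and NMS thresholds to a set of raw detections using torchvision's nms operator. The scoring function here is a trivial stand-in; the actual metric compares detections against manually annotated ground truth as described above.

    import torch
    from torchvision.ops import nms

    def filter_detections(boxes, scores, conf_thr, nms_thr):
        """Drop low-confidence boxes, then apply non-maximum suppression."""
        keep = scores >= conf_thr
        boxes, scores = boxes[keep], scores[keep]
        kept = nms(boxes, scores, iou_threshold=nms_thr)
        return boxes[kept], scores[kept]

    # Illustrative raw detections for a single image.
    raw_boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.], [50., 50., 60., 60.]])
    raw_scores = torch.tensor([0.9, 0.6, 0.3])

    best = None
    for conf_thr in torch.linspace(0, 1, steps=100):
        for nms_thr in torch.linspace(0, 1, steps=100):
            boxes, _ = filter_detections(raw_boxes, raw_scores,
                                         float(conf_thr), float(nms_thr))
            # Stand-in metric for illustration only; the real metric scores
            # detections against ground-truth annotations (TP, TN, FP rates).
            metric = -abs(boxes.shape[0] - 2)
            if best is None or metric > best[0]:
                best = (metric, float(conf_thr), float(nms_thr))

    print("best (metric, confidence threshold, NMS threshold):", best)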
Computational environment
HCAT is operating system agnostic, requires at least 8 GB of system memory, and optionally uses an NVIDIA GPU with at least 8 GB of video memory for GPU acceleration. All scripts were run on an analysis computer running Ubuntu 20.04.1 LTS, an open-source Linux distribution from Canonical based on Debian. The workstation was equipped with 2 NVIDIA A6000 graphics cards for a total of 98 GB of video memory. Many scripts were custom written in Python 3.9 using open-source scientific computation libraries including numpy [50], matplotlib, and scikit-learn [51]. All deep learning architectures, training logic, and much of the data transformation pipeline were written in PyTorch [52], making heavy use of the torchvision [52] library.
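A quick way to check whether the optional GPU acceleration is available in a given environment is shown below; this is a generic PyTorch check, not an HCAT-specific command.

    import torch

    if torch.cuda.is_available():
        gpu = torch.cuda.get_device_properties(0)
        print(f"GPU: {gpu.name}, {gpu.total_memory / 1e9:.1f} GB of video memory")
    else:
        print("No CUDA device found; inference would run on the CPU (GPU use is optional).")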

Supporting information

S1 Fig. Validation of hair cell detection analysis and location estimation. Whole cochlear turns (A) were manually annotated and evaluated with the HCAT detection analysis pipeline. Each analysis generated cochleograms (B), reporting the "ground truth" result obtained from manual segmentation (dark lines) superimposed onto the cochleogram generated from hair cells detected by the HCAT analysis (light lines). The best frequency estimation error was calculated as the octave difference between the predicted best frequency of every hair cell and its manually assigned frequency from the ImageJ plugin (C). Optimal cell detection and non-maximum suppression thresholds were discerned via a grid search by maximizing the true positive rate penalized by the false positive and false negative rates (D). Black lines on the curves (E) denote the optimal hyperparameter values. https://doi.org/10.1371/journal.pbio.3002041.s001 (EPS)

S2 Fig. Training data augmentation pipeline. Training images underwent data augmentation steps, increasing the variability of our dataset and improving resulting model performance. Examples of each transformation are shown on exemplar grids (bottom). Each of these augmentation steps was probabilistically applied sequentially (left to right, as shown by arrows) during every epoch. https://doi.org/10.1371/journal.pbio.3002041.s002 (EPS)

S3 Fig. Multi-piece cochleogram generation workflow for HCAT. Adult murine cochlear dissection, depending on the technique used, typically produces up to 6 individual pieces of tissue, numbered from apex to base in (A). These pieces form the entirety of the organ of Corti and can be analyzed by HCAT (B). First, each piece must have its curvature annotated manually in ImageJ from base to apex (C) using the EPL cochlea frequency ImageJ plugin. Then, these annotations and images are passed one at a time to the HCAT command line interface. This generates a CSV file for each piece, and these files are then manually compiled (E). This allows for the generation of a complete cochleogram from a multi-piece dissection (F). https://doi.org/10.1371/journal.pbio.3002041.s003 (EPS)

S1 Data. A compressed folder with spreadsheets containing, in separate files, the underlying numerical data and statistical analysis for Figs 1G, 3E, 3F, 3G, 4F, 4H, 5A, 5B, 5C, 6A, 6B, S1B, S1C, S1D, S1E, and S1F. https://doi.org/10.1371/journal.pbio.3002041.s004 (ZIP)
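As a small illustration of the compilation step described in S3 Fig, the per-piece CSV outputs could be concatenated as sketched below; the file-name pattern and output name are hypothetical, not the actual HCAT output schema.

    import glob
    import pandas as pd

    # Hypothetical naming: one CSV per dissected piece, e.g., cochlea_piece_1.csv ... _6.csv.
    piece_files = sorted(glob.glob("cochlea_piece_*.csv"))
    frames = [pd.read_csv(path).assign(source_piece=path) for path in piece_files]
    combined = pd.concat(frames, ignore_index=True)
    combined.to_csv("whole_cochlea_compiled.csv", index=False)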

Acknowledgments

We would like to thank Dr. Marcelo Cicconet (Image and Data Analysis Core at Harvard Medical School) and Haobing Wang, MS (Mass Eye and Ear Light Microscopy Imaging Core Facility) for their assistance in this project. We thank Dr. Lisa Cunningham, Dr. Michael Deans, Dr. Albert Edge, Dr. Katharine Fernandez, Dr. Ksenia Gnedeva, Dr. Yushi Hayashi, Dr. Tejbeer Kaur, Dr. Jinkyung Kim, Prof. Corne Kros, Dr. M. Charles Liberman, Dr. Vijayprakash Manickam, Dr. Anthony Ricci, Prof. Guy Richardson, Dr. Mark Rutherford, Dr. Basile Tarchini, Dr. Amandine Jarysta, Dr. Bradley Walters, Dr. Adele Moatti, Dr. Alon Greenbaum, the members of their teams and all other research groups, for providing their datasets to evaluate the HCAT. We thank Hidetomi Nitta, Emily Nguyen, and Ella Wesson for their assistance in generating a portion of training data annotations. We also thank Evan Hale and Corena Loeb for critical reading of the manuscript.

---
[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002041
