(C) PLOS One
This story was originally published by PLOS One and is unaltered.



CellTracksColab is a platform that enables compilation, analysis, and exploration of cell tracking data [1]

['Estibaliz Gómez-De-Mariscal', 'Instituto Gulbenkian De Ciência', 'Oeiras', 'Hanna Grobe', 'Faculty Of Science', 'Engineering', 'Cell Biology', 'Åbo Akademi University', 'Turku', 'Joanna W. Pylvänäinen']

Date: 2024-08

In life sciences, tracking objects from movies enables researchers to quantify the behavior of single particles, organelles, bacteria, cells, and even whole animals. While numerous tools now allow automated tracking from video, a significant challenge persists in compiling, analyzing, and exploring the large datasets generated by these approaches. Here, we introduce CellTracksColab, a platform tailored to simplify the exploration and analysis of cell tracking data. CellTracksColab facilitates the compiling and analysis of results across multiple fields of view, conditions, and repeats, ensuring a holistic dataset overview. CellTracksColab also harnesses the power of high-dimensional data reduction and clustering, enabling researchers to identify distinct behavioral patterns and trends without bias. Finally, CellTracksColab also includes specialized analysis modules enabling spatial analyses (clustering, proximity to specific regions of interest). We demonstrate CellTracksColab capabilities with 3 use cases, including T cells and cancer cell migration, as well as filopodia dynamics. CellTracksColab is available for the broader scientific community at https://github.com/CellMigrationLab/CellTracksColab .

Funding: This study was supported by the Research Council of Finland (338537 to GJ, https://www.aka.fi/en/ ), the Sigrid Jusélius Foundation (to GJ, https://www.sigridjuselius.fi/en/ ), the Cancer Society of Finland (Syöpäjärjestöt; to GJ, https://www.cancersociety.fi/ ), and the Solutions for Health strategic funding to Åbo Akademi University (to GJ, https://www.abo.fi/en/solutions-for-health/ ). This research was supported by the InFLAMES Flagship Programme of the Academy of Finland (decision numbers: 337530, 337531, 357910, and 357911, https://www.aka.fi/en/ ). EGM and RH received funding from the European Union through the Horizon Europe program (AI4LIFE project with grant agreement 101057970-AI4LIFE, and RT-SuperES project with grant agreement 101099654-RTSuperES to RH, https://research-and-innovation.ec.europa.eu/funding/funding-opportunities/funding-programmes-and-open-calls/horizon-europe_en ). EGM and RH also acknowledge the support of the Gulbenkian Foundation (Fundação Calouste Gulbenkian, https://gulbenkian.pt/en/ ) and the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement No. 101001332 to RH, https://erc.europa.eu/homepage ). LX received funding from the INCEPTION project (PIA/ANR-16-CONV-0005, https://anr.fr/ ) and is a student from the FIRE PhD program funded by the Bettencourt Schueller Foundation ( https://www.fondationbs.org/en ) and the EURIP graduate program (ANR-17-EURE-0012, https://www.learningplanetinstitute.org/en/eurip-graduate-school/ ). This study was also supported by France BioImaging (Investissement d’Avenir; ANR-10-INBS-04, J.-Y. T., LX, https://france-bioimaging.org/ ). 
This work was also supported by the European Molecular Biology Organization (EMBO, https://www.embo.org/ ) Installation Grant (EMBO-2020-IG-4734 to RH), the EMBO Postdoctoral Fellowship (EMBO ALTF 174-2022 to EGM), the Chan Zuckerberg Initiative ( https://chanzuckerberg.com/ ) Visual Proteomics Grant (vpi-0000000044 with DOI: 10.37921/743590vtudfp to RH). RH also acknowledges the support of LS4FUTURE Associated Laboratory (LA/P/0087/2020, https://www.ls4future.pt/ ). The open access publication fees were funded by the Gösta Branders research fund, Åbo Akademi Research Foundation (Gösta Branders forskningsfond, Stiftelsen för Åbo Akademi, https://stiftelsenabo.fi/en/home/ ). While the European Union funded this study, the views and opinions expressed are those of the authors only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible. None of the funders listed above were involved in the design and execution of this study.

Data Availability: Multiple test datasets are available on Zenodo. They include the two test datasets that can be directly downloaded from within the CellTracksColab notebooks (8413510, 8420011). In addition, the three datasets showcased in this study, their tracking files, and the CellTracksColab results are also available on Zenodo (11282716, 11286110, 11285514). The code for CellTracksColab is publicly available under the MIT license, encouraging broad utilization and adaptation. CellTracksColab's GitHub repository serves as a dynamic platform for tracking the evolution of the code across various versions. Users are encouraged to report issues and suggest features directly through the GitHub interface. A stable version of the code and associated documentation is also archived on Zenodo (11384844).

In life science, tracking has emerged as an indispensable tool for unparalleled insights into dynamic molecular and cellular behaviors. Parallel to this, segmentation methods relying on machine learning and deep learning are now greatly facilitating the implementation of complex tracking pipelines [1–4], enabling the quantitative analysis of these dynamic behaviors. Yet, as the capabilities of tracking tools have expanded, so too have the challenges associated with analyzing the resulting data.

Multiple tools have been developed to help researchers compile tracking data; for instance, these include the Ibidi Chemotaxis tool (a Fiji plugin), the MotilityLab website (an online platform for CelltrackR [5]), or TrackMateR [6]. These traditional analytical approaches, implemented by us and many others, typically reduce tracking datasets to population-level analyses where track metrics are averaged across different conditions. Yet, while practical, such analyses overlook the heterogeneity within biological data. Over the past 2 years, multiple tools, including CellPhe (an R toolbox [7]), Traject3D (a collection of MATLAB scripts [8]), and CellPlato (a Python toolbox [9]), have been designed to harness the high dimensionality of tracking datasets to assist in the unbiased discovery of rare phenotypes. Still, these tools often remain difficult to implement for users with no or little coding expertise.

Here, we present CellTracksColab, a Python-based platform to streamline the analysis of tracking datasets. This platform is specifically designed for researchers, particularly those with limited programming expertise, facilitating the exploration and analysis of tracking data. CellTracksColab leverages the power of Jupyter notebooks, which blend live code execution with comprehensive documentation access. CellTracksColab can run locally and in the cloud, accommodating diverse user preferences and resource availability. Drawing on successful models like ColabFold [10] and ZeroCostDL4Mic [11], CellTracksColab is fully integrated within the Google Colaboratory framework (Colab). Through a simplified workflow, researchers can install essential software dependencies with a few mouse clicks, upload their tracking data, and run their analyses. CellTracksColab extends beyond visualization and population analyses, empowering researchers to delve into the nuanced dynamics and behaviors encapsulated within their tracking experiments. We first describe CellTracksColab’s architecture. Then, we demonstrate CellTracksColab features and capabilities in studying T cells and cancer cell migration and filopodia dynamics.

Results

The CellTracksColab framework

The CellTracksColab platform comprises a collection of Jupyter notebooks designed to streamline tracking data analysis (Fig 1A). CellTracksColab can be run locally or in cloud services such as Google Colab, which provides users free access to computing resources that simplify the user experience by eliminating the need for local installations.

Fig 1. The CellTracksColab platform. (A) Schematic representation of the CellTracksColab workflow. (B) Visualization of tracks in a CellTracksColab notebook. (C) Statistical analysis of track metrics using CellTracksColab. This figure shows the analysis of breast cancer cell migration (expressing CTRL shRNA or MYO10-targeting shRNA) in different environments beneath a collagen gel and standard media. The directionality metric is presented in a Tukey boxplot format. Vertical whiskers extend to data points within 1.5× the interquartile range. Each biological replicate is uniquely color-coded for clarity. Accompanying the plot are mirrored heatmaps that illustrate the effect size (Cohen’s d value) and statistical significance (p-values from randomization tests) across various conditions. Underlying numerical data can be found in S1 Data. (D) Dimensionality reduction and clustering visualization in CellTracksColab. This panel displays a 2D t-SNE projection of the entire dataset, utilizing comprehensive track metrics for the analysis. Data points are color-coded to reflect cluster groups identified through HDBSCAN analysis on the t-SNE projection, providing insights into track characteristics and similarities. (E) Spatial clustering analysis using Ripley’s L function and Monte Carlo simulations in CellTracksColab. This graph illustrates the spatial distribution of tracks, where a blue curve above the zero line indicates clustering at a specific radius in the field of view. The Monte Carlo simulation results are included to assess the statistical significance of the observed patterns. (F) Measurement and analysis of object-to-region proximity using CellTracksColab. This example demonstrates the platform’s utility in quantifying the distance of objects (marked as yellow dots) relative to a defined region of interest (denoted by the white edge).
The tool allows tracking these distances over time and computing related metrics, facilitating in-depth spatial analysis. https://doi.org/10.1371/journal.pbio.3002740.g001

CellTracksColab is designed to process tracking data from various open-source tracking software, including TrackMate [1], CellProfiler [12], Icy [13], ilastik [14], and the Fiji Manual Tracker [15]. CellTracksColab supports tracking data stored in XML (TrackMate) and CSV formats (TrackMate, CellProfiler, Icy, ilastik, and Fiji Manual Tracker). CellTracksColab can also be made compatible with other tracking tools that export results that follow our minimal requirements (see documentation for details).

To facilitate a structured analysis, users are advised to organize their files into directories representing different experimental conditions and biological repeats. This organizational strategy is crucial for accurately categorizing and analyzing the dataset, considering various aspects such as experimental conditions, biological replicates, and fields of view. By promoting structured data management, CellTracksColab streamlines the analytical process and enhances the exploration and understanding of data variability and heterogeneity across the dataset.

The performance of CellTracksColab is limited by the resources available to the user, particularly the amount of RAM, which can limit the volume of data that can be processed. However, the underlying code has been optimized to use the available resources as efficiently as possible. For instance, we analyzed more than 50,000 tracks (>3 million objects from 117 videos) using CellTracksColab and the free version of Google Colab (all results presented in the manuscript can be replicated with the free version of Google Colab). CellTracksColab could accommodate one of our larger datasets encompassing over 536,000 tracks (>56 million objects from 300 videos), but this required the additional RAM that Google Colab Plus provides.
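The directory organization recommended above can be illustrated with a short Python sketch. It is not taken from the CellTracksColab code base: the layout `root/<condition>/<repeat>/<file>`, the function name `index_tracking_files`, and the folder names `ICAM`, `VCAM`, `R1`, and `R2` are assumptions chosen for illustration, and the exact layout expected by the notebooks is described in the CellTracksColab documentation.

```python
import tempfile
from pathlib import Path


def index_tracking_files(root, suffixes=(".csv", ".xml")):
    """Return one record per tracking file found under root/<condition>/<repeat>/.

    Hypothetical helper: the two-level folder layout is an assumption made
    for this sketch, not the documented CellTracksColab requirement.
    """
    records = []
    for path in sorted(Path(root).glob("*/*/*")):
        # Skip sub-folders and files that are not tracking exports.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        records.append({
            "condition": path.parent.parent.name,  # first folder level
            "repeat": path.parent.name,            # second folder level
            "field_of_view": path.stem,            # file name without extension
            "path": path,
        })
    return records


# Build a small throwaway folder tree to demonstrate the indexing.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ("ICAM/R1/fov1.csv", "ICAM/R2/fov1.xml",
                "VCAM/R1/fov1.csv", "VCAM/R1/notes.txt"):
        target = Path(tmp, rel)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    records = index_tracking_files(tmp)

for rec in records:
    print(rec["condition"], rec["repeat"], rec["field_of_view"])
```

Keeping the condition and repeat in the folder names means every track can later be labeled with both, which is what makes per-repeat and per-condition comparisons possible.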

Analyzing data using CellTracksColab

When the tracking data are loaded into CellTracksColab, they are automatically exported into the CellTracksColab format. This standardized format ensures consistent data access and manipulation, facilitating thorough analysis of the tracking data across the platform. Once exported, users can, for example, visualize (Fig 1B), filter, and smooth tracks (S1 Fig). Track smoothing using moving averages can be particularly beneficial before computing directionality metrics, especially when the tracked object is jittery (for instance, nuclei) and the user’s interest lies in discerning the overall movement of the cell.

Upon loading data, CellTracksColab can compute various track metrics or import them directly from prior analyses conducted in the tracking software (Fig 1A). This is made possible by the flexible design of the CellTracksColab format, which allows additional metrics to be aggregated without affecting the content of the original dataset. Users can then generate boxplots illustrating the distribution of different track metrics of interest (Fig 1C). Additionally, several relevant statistical metrics are calculated, such as Cohen’s d value (which quantifies the standardized effect size between groups and is less sensitive to sample size variations) and the p-values of statistical hypothesis tests (which compare the distribution of track metrics across conditions). The statistical tests available include a randomization test that assesses the distribution of Cohen’s d values obtained with bootstrapping and t tests that compare the mean value distributions obtained from bootstrapping, following the SuperPlots methodology [16]. Both tests are available with and without Bonferroni correction, which adjusts the p-value to account for multiple comparisons (Fig 1C).
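To make the statistics named above more concrete, the following Python sketch computes Cohen’s d, a p-value from a randomization test, and a Bonferroni-adjusted p-value. It is a generic illustration under stated assumptions, not the CellTracksColab implementation: the test shown is a plain label-shuffling test on |d| (the platform's bootstrapping scheme is not reproduced here), and the function names and speed values are hypothetical.

```python
import random
import statistics


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


def randomization_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided p-value: how often shuffled labels give an |d| at least as large."""
    rng = random.Random(seed)
    observed = abs(cohens_d(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any link between value and condition
        if abs(cohens_d(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of 0.
    return (hits + 1) / (n_permutations + 1)


def bonferroni(p_value, n_comparisons):
    """Multiply the p-value by the number of comparisons, capped at 1."""
    return min(1.0, p_value * n_comparisons)


# Hypothetical track mean speeds for two conditions (illustrative values only).
speeds_a = [4.1, 3.8, 4.5, 4.0, 4.3, 3.9]
speeds_b = [2.9, 3.1, 2.7, 3.3, 3.0, 2.8]

d = cohens_d(speeds_a, speeds_b)
p = randomization_p_value(speeds_a, speeds_b)
print(f"Cohen's d = {d:.2f}, p = {p:.4f}, Bonferroni (3 tests) = {bonferroni(p, 3):.4f}")
```

Reporting the effect size next to the p-value is useful for tracking data because thousands of tracks can make even negligible differences statistically significant.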
CellTracksColab also enables users to perform quality control on their dataset, such as checking that their data are balanced between repeats and conditions. For instance, the user can resample unbalanced data before plotting the track metrics of interest. In addition, CellTracksColab can compute similarities across different experimental conditions and replicates using various track metrics to ensure data reliability and meaningful analysis. The results are visualized using hierarchical clustering in the form of dendrograms, which aids in comparing similarities within and across different conditions and identifying outliers.

Inspired by CellPlato [9], CellTracksColab integrates Uniform Manifold Approximation and Projection (UMAP) or t-distributed Stochastic Neighbor Embedding (t-SNE) combined with Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) to explore, in an unbiased manner, the inherent heterogeneity within tracking datasets (Fig 1D). This combination allows for dimensionality reduction and effective clustering of tracks. The platform provides capabilities to plot track metrics for each cluster, create heatmaps for an overview of data variability, and identify exemplar tracks, representative of each cluster, for detailed analysis.

CellTracksColab also includes specialized modules for the spatial analysis of track data. These modules enable, for instance, the assessment of track clustering (Fig 1E) or the calculation of the proximity of tracks to specific regions of interest (Fig 1F). These tools facilitate the discovery of distinct subpopulations or behaviors within the data and also serve dual purposes: identifying actual clusters and categorizing data for comparative fingerprinting. Importantly, PDF files of all plots and CSV files encapsulating all plot data are exported, enabling users to visualize and revisit the results using their preferred software platforms.
Due to the platform’s flexible design, we anticipate the addition of new analysis modules, both by our team and the user community.
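The spatial clustering assessment mentioned in this section relies on Ripley’s L function with Monte Carlo simulations (Fig 1E). The Python sketch below shows the idea in its simplest form and should be read as an illustration, not as the platform's code: it applies no edge correction, it assumes a rectangular field of view, and the function names and point coordinates are hypothetical.

```python
import math
import random


def ripley_l(points, radius, area):
    """Return L(r) - r without edge correction; values above 0 suggest clustering."""
    n = len(points)
    # Count ordered pairs of distinct points lying within `radius` of each other.
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= radius
    )
    k = area * pairs / (n * (n - 1))  # Ripley's K estimate
    return math.sqrt(k / math.pi) - radius


def monte_carlo_envelope(n_points, radius, width, height, n_simulations=99, seed=0):
    """Min and max of L(r) - r over simulated, uniformly random point patterns."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_simulations):
        simulated = [(rng.uniform(0, width), rng.uniform(0, height))
                     for _ in range(n_points)]
        values.append(ripley_l(simulated, radius, width * height))
    return min(values), max(values)


# Four hypothetical objects packed into one corner of a 100 x 100 field of view.
clustered = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
observed = ripley_l(clustered, radius=2.0, area=100.0 * 100.0)
low, high = monte_carlo_envelope(len(clustered), 2.0, 100.0, 100.0)
print(f"observed L(r) - r = {observed:.2f}; random envelope = [{low:.2f}, {high:.2f}]")
```

An observed value above the simulated envelope at a given radius is the kind of evidence for clustering that the curve above the zero line in Fig 1E conveys. In practice the function is evaluated over a range of radii, and edge effects need to be handled.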

Exploring T cell migration using CellTracksColab

To showcase the capabilities of CellTracksColab, we first chose to reanalyze a small dataset of T cells migrating on either vascular cell adhesion protein 1 (VCAM) or intercellular adhesion molecule 1 (ICAM), captured through brightfield microscopy (Fig 2A) [1,17,18]. Automated cell tracking was achieved using StarDist and TrackMate algorithms [1]. The dataset encompasses 10 videos spread across 2 conditions and 3 biological repeats. CellTracksColab compiled this dataset using Colab in a few seconds, incorporating 2,297 tracks and 38,852 tracked objects.

After the computation of additional metrics, we first evaluated the dataset’s balance and variability across different fields of view. Although the dataset exhibited some condition imbalance, we opted against resampling due to its relatively small size (S2A Fig). Intriguingly, the field of view (FOV)-based clustering analysis unveiled an unexpected alignment between 2 FOVs from the ICAM condition with those from the VCAM condition, hinting at potential similarities in cell tracking patterns (S2B Fig). The condition and repeat-based clustering analysis further corroborated this observation. Specifically, the analysis revealed that the ICAM second biological repeat displayed a clustering pattern remarkably similar to those observed in VCAM repeats (S2C Fig). This analysis indicates that this particular ICAM biological repeat does not behave as the other two, providing valuable information.

We further utilized CellTracksColab to plot key track metrics, including mean speed and directionality of T cell migration. Our analysis confirmed that T cells exhibit slower and less directional movement on VCAM than on ICAM surfaces (Fig 2B). To delve deeper, we employed UMAP for dimensionality reduction, followed by HDBSCAN clustering.
This approach revealed the presence of at least 5 distinct behavioral clusters within the cell population, suggesting varied migration patterns among T cells (Fig 2C and 2D). A fingerprinting plot then provided insights into the distribution of ICAM and VCAM tracks across these clusters, highlighting differing proportions (Fig 2E). A notable observation was the much higher percentage of ICAM tracks in cluster 3 compared to VCAM and a higher percentage of VCAM tracks in cluster 1 compared to ICAM. CellTracksColab generates a heatmap representing the Z-score of available track metrics for each cluster to facilitate rapid metric comparison across clusters (Fig 2F). Cluster 3 comprises fast and more directional tracks. In contrast, cluster 1 primarily comprises very slow-migrating cells (Fig 2G). Finally, we compared track metrics between the ICAM and VCAM conditions within specific clusters. Focusing solely on tracks in cluster 4 (a cluster composed of migrating cells), we observed that among motile cells, cells plated on ICAM migrated faster and tended to be more circular than those on VCAM (Fig 2H).

While we provide only brief examples here, we can delve deeper into the analysis, identify tracks belonging to each cluster, and match them back to the original video. Further analyses will depend on the user’s interest in the biological phenomenon studied. This multifaceted analysis underscores CellTracksColab’s utility in offering nuanced insights into cell migration dynamics under different conditions.
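The fingerprint plot and the Z-score heatmap described above both reduce to simple per-cluster summaries. The Python sketch below illustrates one plausible way to compute them. It assumes that the fingerprint is the percentage of each condition's tracks assigned to each cluster and that the heatmap Z-score compares each cluster's mean metric with the mean and standard deviation across cluster means; the source does not spell out either normalization. The function names and the track records are hypothetical.

```python
import statistics
from collections import Counter


def fingerprint(tracks):
    """Percentage of each condition's tracks that fall into each cluster."""
    totals = Counter(t["condition"] for t in tracks)
    counts = Counter((t["condition"], t["cluster"]) for t in tracks)
    return {key: 100.0 * n / totals[key[0]] for key, n in counts.items()}


def cluster_z_scores(tracks, metric):
    """Z-score of each cluster's mean value for one metric (assumed normalization).

    Raises ZeroDivisionError if all cluster means are identical.
    """
    by_cluster = {}
    for t in tracks:
        by_cluster.setdefault(t["cluster"], []).append(t[metric])
    means = {c: statistics.mean(v) for c, v in by_cluster.items()}
    mu = statistics.mean(list(means.values()))
    sigma = statistics.pstdev(list(means.values()))
    return {c: (m - mu) / sigma for c, m in means.items()}


# Hypothetical tracks: the conditions, cluster labels, and speeds are made up.
tracks = [
    {"condition": "ICAM", "cluster": 3, "speed": 9.0},
    {"condition": "ICAM", "cluster": 3, "speed": 11.0},
    {"condition": "ICAM", "cluster": 1, "speed": 1.0},
    {"condition": "ICAM", "cluster": 3, "speed": 10.0},
    {"condition": "VCAM", "cluster": 1, "speed": 1.5},
    {"condition": "VCAM", "cluster": 1, "speed": 0.5},
    {"condition": "VCAM", "cluster": 3, "speed": 10.0},
    {"condition": "VCAM", "cluster": 1, "speed": 1.0},
]

percentages = fingerprint(tracks)
z_scores = cluster_z_scores(tracks, "speed")
print(percentages)
print(z_scores)
```

Normalizing within each condition keeps the fingerprint comparable even when the conditions contain different numbers of tracks, which matters for a dataset left unbalanced as in this use case.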


Fig 2. Exploring T cell migration using CellTracksColab. (A) T cells plated on ICAM were recorded using a brightfield microscope and automatically tracked using StarDist and TrackMate. Detected cells (in magenta) and their tracks (colors indicate track ID) are displayed. Scale bar: 100 μm. (B) The “track mean speed” and track “directionality” metrics for each condition are summarized in Tukey boxplots. The effect size (d, Cohen’s d value) and the statistical significance (p, p-values from randomization tests) between the conditions are displayed. (C) 2D UMAP projection of the entire dataset. Data points are color-coded based on VCAM and ICAM conditions. (D) Resultant clusters from the HDBSCAN analysis on the 2D UMAP projection. Euclidean distance served as the metric for clustering. Each identified cluster is color-coded. (E) Fingerprint plot showcasing the percentage of tracks in each cluster across different conditions. (F) Heatmap representation, normalized using Z-scores, displaying variations in selected track metrics among the clusters. Full heatmaps are available in the Zenodo archive of this dataset. (G) The “track mean speed,” track “directionality,” and “mean (cell) circularity” metrics for each cluster are summarized in a Tukey boxplot format as in (B). (H) The “track mean speed,” track “directionality,” and “mean (cell) circularity” metrics for each condition for cluster 4 are summarized in a Tukey boxplot format as in (B). For all box plots, the vertical whiskers extend to data points within 1.5× the interquartile range, and the values for each track are shown as dots. Biological replicates are displayed next to each other from R1 to R3 (left to right). Plot axes are limited to 10× the interquartile range. (B, G, and H) Underlying numerical data can be found in S1 Data.
The dataset, including the raw images, the tracking files, and all the CellTracksColab results (including numerical data), is also available on Zenodo (11286110). https://doi.org/10.1371/journal.pbio.3002740.g002

---
[1] Url: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3002740

Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
