(C) PLOS One

This story was originally published by PLOS One and is unaltered.

Network feature-based phenotyping of leaf venation robustly reconstructs the latent space [1]

Authors: Kohei Iwamasa, Koji Noshita. Affiliations as listed: Department of Biology, Kyushu University, Fukuoka; Plant Frontier Research Center.

Date: 2023-08

Leaf vein networks were extracted as undirected graphs from untreated leaf images (Ficus erecta, Zelkova serrata, Quercus acutissima, Prunus × yedoensis, and Morella rubra; first five rows) and cleared leaf images (Ficus pumila, Zelkova serrata, Quercus chrysolepis, Prunus pauciflora, and Morella cerifera; last five rows). The cleared leaf images in this figure are partially modified versions of Specimen Numbers U0604, U4304, T1574, T1426, and U0644 in the NMNS Cleared Leaf Database [36]. The columns show the original leaf images, segmented vein images, undirected graphs, and magnified portions. Scale bars represent 10 mm.

Leaf images of sufficient quality for analyses were successfully captured using the proposed digitizing workflow (Fig 4). The extraction steps (segmentation, skeletonization, graph representation, and network feature calculation) took 12.5 and 51.3 seconds per leaf for untreated and cleared leaf images, respectively, using an Intel(R) Core i9-10900K CPU @ 3.70 GHz and an NVIDIA RTX 3090 GPU. Higher-order veins (i.e., most third- and some fourth-order veins) were recognized in our qualitative observations. The average leaf areas were 33.91 cm² for Q. acutissima, 11.74 cm² for Z. serrata, 30.66 cm² for P. × yedoensis, 13.08 cm² for M. rubra, and 31.56 cm² for F. erecta.

The slopes of the regression lines were 2.37 for untreated and 3.13 for cleared leaves (number of nodes vs. number of edges); 362.2 for untreated and 2268.5 for cleared leaves (leaf area vs. number of nodes); and 806.3 for untreated and 6950.1 for cleared leaves (leaf area vs. number of edges).

The numbers of nodes and edges were highly correlated, with correlation coefficients of 0.968 for the untreated leaf dataset and 0.992 for the cleared leaf dataset (Fig 5). Both the number of nodes and the number of edges were also correlated with leaf area, but their values per unit area in the untreated images were more than fivefold lower than those in the cleared images (Fig 5B and 5C). This underestimation could be explained by the inability to clearly distinguish higher-order veins, especially those above the third order, from the surrounding leaf tissue in the untreated images.
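As a rough check on the "more than five times" statement, the ratios of the regression slopes reported above can be computed directly. The sketch below also shows how a Pearson correlation coefficient and a least-squares slope could be obtained from paired counts. The function names are illustrative assumptions, not the authors' code, and no data from the study are reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov / var_x


# Regression slopes quoted in the text (count per unit leaf area).
node_slope_ratio = 2268.5 / 362.2  # cleared vs. untreated, nodes
edge_slope_ratio = 6950.1 / 806.3  # cleared vs. untreated, edges

print(f"nodes per area, cleared/untreated: {node_slope_ratio:.2f}")
print(f"edges per area, cleared/untreated: {edge_slope_ratio:.2f}")
```

Both ratios exceed five (roughly 6.3 for nodes and 8.6 for edges), which is consistent with the statement in the text.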

Although sample preparation differed between cleared and untreated leaf images, the proposed method extracted veins with greater clarity than simple image processing did, while reducing false positives (Fig 4). Skeletonization of the vein images successfully yielded a one-pixel-wide representation of the vein structure while preserving connectivity and loops. After skeletonization, the images were converted to undirected graphs, all of which were indexed and connected (Fig 4).
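The conversion from a one-pixel-wide skeleton to an undirected graph can be illustrated with a toy example. The sketch below is not the authors' node-extraction and edge-detection algorithm; it assumes 4-connectivity, treats every skeleton pixel whose number of skeleton neighbors is not two as a node, and traces chains of two-neighbor pixels as edges. The function name `skeleton_to_graph` and the grid are hypothetical.

```python
def skeleton_to_graph(grid):
    """Convert a toy skeleton ('#' = vein pixel) into (nodes, edges).

    Assumptions: 4-connectivity; nodes are pixels with a neighbor count
    other than 2 (endpoints have 1, intersections have 3 or more).
    """
    pixels = {
        (r, c)
        for r, row in enumerate(grid)
        for c, ch in enumerate(row)
        if ch == "#"
    }

    def neighbors(p):
        r, c = p
        candidates = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        return [q for q in candidates if q in pixels]

    nodes = {p for p in pixels if len(neighbors(p)) != 2}
    edges = {}
    for start in nodes:
        for step in neighbors(start):
            path, prev, cur = [start, step], start, step
            # Follow the chain of two-neighbor pixels until the next node.
            while cur not in nodes:
                nxt = next(q for q in neighbors(cur) if q != prev)
                path.append(nxt)
                prev, cur = cur, nxt
            # Each edge is traced from both ends; key by its pixel set
            # so that it is stored only once.
            edges[frozenset(path)] = (start, cur)
    return nodes, list(edges.values())


GRID = [
    "..#..",
    "#####",
    "#...#",
    "#####",
    "..#..",
]

nodes, edges = skeleton_to_graph(GRID)
# For a connected graph, the number of independent loops is E - N + 1.
loop_count = len(edges) - len(nodes) + 1
print(len(nodes), len(edges), loop_count)
```

In this toy grid the two endpoint pixels and the two intersection pixels become nodes, and the single closed ring of pixels is preserved as one loop in the graph.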

The random forest classification using network features for the untreated leaf dataset resulted in an accuracy of 90.6% (Fig 6). The recall rates were 84.7% for Q. acutissima, 97.6% for Z. serrata, 91.7% for P. × yedoensis, 79.6% for M. rubra, and 89.4% for F. erecta, and the precision estimates were 83.8% for Q. acutissima, 97.6% for Z. serrata, 85.7% for P. × yedoensis, 79.6% for M. rubra, and 95.4% for F. erecta.
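For readers less familiar with these metrics, the sketch below shows how accuracy, per-class recall, and per-class precision follow from a confusion matrix. The matrix values are a made-up two-class toy example, not the study's results, and `classification_metrics` is a hypothetical helper name. It assumes every class has at least one true sample and at least one prediction.

```python
def classification_metrics(confusion):
    """Return (accuracy, recall per class, precision per class).

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[i][i] for i in range(n)) / total
    # Recall: correct predictions divided by all true members (row sum).
    recall = [confusion[i][i] / sum(confusion[i]) for i in range(n)]
    # Precision: correct predictions divided by all predictions (column sum).
    precision = [
        confusion[j][j] / sum(confusion[i][j] for i in range(n))
        for j in range(n)
    ]
    return accuracy, recall, precision


# Hypothetical toy confusion matrix for two classes.
toy = [
    [8, 2],
    [1, 9],
]
accuracy, recall, precision = classification_metrics(toy)
print(accuracy, recall, precision)
```

Recall and precision can differ for the same class, as they do for several species above, because the first is normalized by the number of true specimens and the second by the number of predictions.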

Empirical morphospace analysis on leaf veins

The empirical morphospace of leaf venation was reconstructed by a PCA based on the network features of the cleared and untreated leaves (Fig 7A). The first three principal components (PC1, PC2, and PC3) of the untreated and cleared leaves explained 88.7% and 91.8% of the total variance, respectively. In the untreated leaf dataset, Z. serrata was concentrated in the region with low PC1 values. F. erecta, Q. acutissima, and P. × yedoensis were distributed in regions with high PC1 values but were separated along PC2. In the cleared leaf dataset, Quercus was broadly distributed with PC1 values greater than -5. The other genera were generally distributed in the region with negative scores for PC1, and Ficus and Prunus showed relatively low and high scores for PC2, respectively.
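The "explained variance" figures quoted for PC1 to PC3 are the leading eigenvalues of the feature covariance matrix divided by the total variance. The sketch below illustrates that calculation in plain Python. It is an assumption-laden illustration: power iteration with deflation stands in for a library eigen-solver, the features are centered but not standardized (the section does not say which preprocessing was used), and all function names are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix."""
    d = len(matrix)
    vec = [1.0 / (k + 1) for k in range(d)]  # arbitrary non-zero start
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # no variance left in this matrix
            return 0.0, vec
        vec = [x / norm for x in nxt]
    mv = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
    return sum(vec[i] * mv[i] for i in range(d)), vec


def explained_variance_ratios(rows):
    """Fraction of total variance carried by each principal component."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    ratios = []
    for _ in range(d):
        value, vec = top_eigenpair(cov)
        ratios.append(value / total)
        # Deflation: remove the component that was just extracted.
        cov = [
            [cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
            for i in range(d)
        ]
    return ratios


# Toy data lying exactly on a line: one component carries all variance.
line_data = [(t, 2 * t, 3 * t) for t in range(5)]
print(explained_variance_ratios(line_data))
```

Summing the first three ratios for a real feature table would give the kind of cumulative figure reported above; in practice a tested PCA implementation would normally be preferred over this hand-rolled solver.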

Fig 7. Empirical morphospace of leaf venation. (A) Principal component analysis (PCA) of vein network features and contribution rates for the untreated and cleared leaf datasets. Each clade shows a biased distribution, and whole specimens were distributed along one-dimensional U-shaped curves in PC1–PC3 spaces of both datasets. (B) Scatter plot of PC1 and PC3 scores of the cleared leaves with the cropped image of the vein as markers. The degree of loopiness of vein networks changes along a U-shaped curve. (C) Histograms of the mean of the neighborhood of egonet (1-hop) of 10 samples for the lowest PC1, lowest PC3, and highest PC1 in the cleared leaf dataset and untreated dataset. Error bars represent the standard deviation. https://doi.org/10.1371/journal.pcbi.1010581.g007

PC1 represented the loopiness of higher-order veins, that is, the relative frequency of looping structures (known as cycles in graph theory) in a leaf vein network (Fig 7B). However, the direction of this correspondence was reversed between the two datasets: a higher PC1 score corresponded to lower loopiness in the untreated leaf dataset and, conversely, to higher loopiness in the cleared leaf dataset. Specimens in the untreated leaf dataset had few endpoint nodes and many loops in the higher-order veins when the PC1 scores became large, and specimens in the cleared leaf dataset had more loops when the PC1 scores became low (see the distributions of genera in Fig 7A). The network features related to the egonet showed high factor loadings for PC1, with the number of neighbors of the one-hop egonet contributing substantially to capturing the local looping structures in the vein networks. Vein networks were represented as undirected graphs without crossing edges using algorithms for node extraction and edge detection (Fig 3), so the number of edges per node was concentrated at one (endpoint nodes; Fig 3A) or three (intersection nodes; Fig 3B). For an endpoint node, which has a single neighbor, the one-hop egonet is a subgraph of two nodes: the endpoint node and its neighbor. Accordingly, the number of egonet neighbors was concentrated at two for the low-looping veins with many endpoint nodes (Fig 7C Low PC1 of cleared leaf veins, S4A Fig). For nodes with three edges, the one-hop egonet is expected to have two, four, or six neighbors. In our observations, six neighbors appeared frequently (Fig 7C High PC1 of cleared leaf veins, S4B Fig). Thus, the distribution of the number of egonet neighbors, especially its mean and skewness, could effectively characterize vein loopiness (Fig 7C).
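The counts of two, four, and six egonet neighbors can be reproduced on a small hand-built graph. The sketch below assumes that "neighbors of the egonet" means the nodes outside the one-hop egonet that are adjacent to at least one node inside it; this definition is an assumption, since the section does not state it explicitly. The helper names and the toy edge list are hypothetical.

```python
def build_adjacency(edge_list):
    """Adjacency sets for an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def egonet_neighbor_count(adj, node):
    """Number of nodes adjacent to, but outside, the one-hop egonet."""
    ego = {node} | adj[node]  # the node itself plus its direct neighbors
    outside = set()
    for member in ego:
        outside |= adj[member] - ego
    return len(outside)


# Toy vein fragment: intersection node "v" joined to three further
# intersection nodes "a", "b", "c", each carrying two more branches.
looping_like = build_adjacency([
    ("v", "a"), ("v", "b"), ("v", "c"),
    ("a", "a1"), ("a", "a2"),
    ("b", "b1"), ("b", "b2"),
    ("c", "c1"), ("c", "c2"),
])

# Same fragment, but "b" is now an endpoint (a free-ending veinlet).
with_endpoint = build_adjacency([
    ("v", "a"), ("v", "b"), ("v", "c"),
    ("a", "a1"), ("a", "a2"),
    ("c", "c1"), ("c", "c2"),
])

print(egonet_neighbor_count(looping_like, "v"))   # all neighbors branch on
print(egonet_neighbor_count(with_endpoint, "v"))  # one neighbor is an endpoint
print(egonet_neighbor_count(with_endpoint, "b"))  # egonet of an endpoint node
```

Under this definition, an intersection node surrounded by three other intersection nodes has six egonet neighbors, each adjacent endpoint lowers the count by two, and an endpoint attached to an intersection node has two.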

The PC2 scores of the cleared samples represent the degree of noise in the images. Some false-positive vein regions (i.e., artifacts recognized as veins in the segmentation step) generated dense networks after skeletonization (S2 Fig). The dense network was characterized by features associated with clustering coefficients.

In the PC1–PC3 space of both the cleared and untreated leaf samples, the network-based feature data showed U-shaped distributions (Fig 7A and 7B). For the untreated specimens, the kurtosis of all nodes positively contributed to PC3; for the cleared specimens, the standard deviation of the number of neighbors of the egonet contributed to PC3. Our results showed that the tree-like and looping structures in the vein networks corresponded to the concentrated distributions (i.e., low standard deviation and high kurtosis) around two and six neighbors of the one-hop egonets, respectively, while intermediate structures exhibited broad distributions across both (i.e., high standard deviation and low kurtosis). For the cleared specimens with more loops (i.e., the number of neighbors of the egonet was concentrated at six) or more endpoint nodes (i.e., concentrated at two), the PC3 value tended to be higher. In contrast, the PC3 value was low for vein networks that contained both intersection and endpoint nodes. Therefore, the one-dimensional U-shaped latent space in PC1–PC3 could be explained by the relationship between the degree of vein loops and the distribution of node features.
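The contrast between concentrated and broad distributions can be made concrete with two hypothetical samples of egonet neighbor counts. The sketch below uses population moments and Pearson's (non-excess) kurtosis; which estimators the authors used is not stated in this section, so these choices, the function name, and the sample values are all assumptions for illustration.

```python
import math


def std_and_kurtosis(values):
    """Population standard deviation and Pearson kurtosis (normal = 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return math.sqrt(m2), m4 / (m2 ** 2)


# Hypothetical looping network: counts concentrated at six.
looping = [6] * 18 + [4] * 2
# Hypothetical intermediate network: endpoint-like and loop-like nodes mixed.
intermediate = [2] * 10 + [6] * 10

std_loop, kurt_loop = std_and_kurtosis(looping)
std_mid, kurt_mid = std_and_kurtosis(intermediate)
print(std_loop, kurt_loop)
print(std_mid, kurt_mid)
```

The concentrated sample has the smaller standard deviation and the larger kurtosis, while the mixed sample shows the opposite pattern, which matches the low and high ends of the U-shaped curve described in the text.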

[1] Url: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1010581

Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.
