TITLE: Gathering data on plant growth form for a regional species checklist
DATE: 2025-04-26
AUTHOR: John L. Godlee
====================================================================
A colleague had a list of plant species names from a regional
checklist they had compiled. They wanted to add a description of
the growth form to each species, but with over 2000 species it was
becoming laborious to look up each species individually online.
This was my email response:
There are obviously many different data sources you could use to
get information on growth form, but many of them will be
incomplete, and they will vary in how easy it is to process the
data. Two of the best in terms of coverage and systematic recording
of growth form are probably World Flora Online (WFO,
https://www.worldfloraonline.org/) and the TRY traits database
(https://www.try-db.org/TryWeb/Home.php).
WFO has growth form information on their website for some species,
but I have been unable to find a way to retrieve it automatically.
They have an API, but it only returns taxonomic information, and
when I tried to scrape the data table from each species page I kept
getting HTTP 403 (Forbidden) errors. If you were to use WFO you
would have to search for each species individually and copy the
data from the table. Realistically it might only take a couple of
days. Maybe you could enlist the help of some eager Masters
students?!
TRY has growth form information for many species. You can download
the "Plant Growth Form" data (trait ID 42) from their website. You
have to submit a data request, but it's fairly quick to do, and
requests are approved automatically after a waiting period of a few
hours so long as you only use their public dataset. Alternatively,
or in addition, you could look at their categorical traits lookup
table, which is a snapshot released in 2012.
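If you go down the data request route, you will probably end up
with several records per species, sometimes with inconsistent
spelling or capitalisation of the growth form. Below is a minimal
sketch of how you might reduce that to one growth form per species.
It uses a small made-up table rather than a real download, and it
assumes the release has columns named AccSpeciesName, TraitID and
OrigValueStr, so check those names against the file you receive.

```r
# Toy stand-in for a TRY data release (values are hypothetical).
# Column names are an assumption; check them against your download.
try_raw <- data.frame(
  AccSpeciesName = c("Acacia karroo", "Acacia karroo", "Acacia karroo",
                     "Aloe ferox", "Aloe ferox"),
  TraitID = c(42, 42, 42, 42, 3106),
  OrigValueStr = c("Tree", "tree ", "shrub", "succulent", "1.5"),
  stringsAsFactors = FALSE
)

# Keep only the "Plant Growth Form" records (trait ID 42)
gf <- try_raw[!is.na(try_raw$TraitID) & try_raw$TraitID == 42, ]

# Normalise case and whitespace so "Tree" and "tree " count as the same
gf$growth_form <- tolower(trimws(gf$OrigValueStr))

# Most frequently recorded value; ties are not resolved in any
# meaningful way, so inspect those cases by hand
modal_form <- function(v) names(sort(table(v), decreasing = TRUE))[1]

# One row per species with its most common growth form
gf_summary <- aggregate(growth_form ~ AccSpeciesName, data = gf,
                        FUN = modal_form)
print(gf_summary)
```

Taking the most common value is only one option. For a checklist
you may prefer to keep all recorded forms and decide manually where
they disagree.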
One key consideration is aligning the taxonomic names in your
species list with those in whatever growth form data source you end
up using. I would recommend using the WorldFlora R package to do
this. I have attached an R script (below) which shows how to do
this.
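Before matching, the script reduces each name to its first two
words so that authorities are dropped. Here is that step on its own
with a few illustrative names, as a rough sketch. The helper name
first_two() is mine and is not part of the WorldFlora package.

```r
# Illustrative input names, with authorities and a genus-only entry
nms <- c(
  "Brachystegia spiciformis Benth.",
  "Julbernardia paniculata (Benth.) Troupin",
  "Combretum",
  "  Vachellia karroo  (Hayne) Banfi & Galasso"
)

# Hypothetical helper: keep the first two whitespace-separated words.
# na.omit() drops the missing second word for genus-only entries.
first_two <- function(x) {
  vapply(strsplit(trimws(x), "\\s+"), function(y) {
    paste(na.omit(y[1:2]), collapse = " ")
  }, character(1))
}

print(first_two(nms))
```

Note that this also truncates infraspecific names (subspecies and
varieties) to the species level, and it will mangle hybrid names
that have a separate hybrid marker as the second word, so check
your list for those first.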
# Packages
library(dplyr)
library(readxl)
library(WorldFlora)
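# Import data
# The original cleaning step is not shown. Here the spreadsheet is
# assumed to already contain a `species` column (full name) and a
# `species_ws` column (name with whitespace tidied).
x_clean <- read_excel("./species.xlsx")
# Get first two words (genus and specific epithet)
x_clean$species_sanit <-
  unlist(lapply(strsplit(x_clean$species_ws, " "), function(y) {
    paste(na.omit(y[1:2]), collapse = " ")
  }))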
# Check that the full species names (with authority) are unique
stopifnot(!any(duplicated(x_clean$species)))
# Find duplicated species names
# These species have different authorities but the same name.
# For WorldFlora I will only use the species name, without the
# authority
x_clean$species_sanit[duplicated(x_clean$species_sanit)]
# Extract unique species names
x_un <- unique(x_clean$species_sanit)
# Download World Flora Online (WFO) backbone data
WFO.download(save.dir = "./dat", WFO.remember = FALSE)
# Load WFO backbone data from downloaded file
WFO.remember(WFO.file = "./dat/wfo/classification.csv")
# Run species names through WFO matching function
x_wfo <- WFO.match(x_un, WFO.data = WFO.data, Fuzzy = 0)
# Keep only unique species names
x_wfo_clean <- x_wfo %>%
  dplyr::select(
    species_orig = spec.name.ORIG,
    species_wfo = scientificName) %>%
  distinct() %>%
  filter(!is.na(species_wfo))
# Import TRY categorical database
# You can get this file from try-db.org in their categorical
# datasets page.
# You could supplement this with the data you request from the
# current database.
try_db <- read_excel(file.path(
  "./dat",
  "Try2025426112154TRY_Categorical_Traits_Lookup_Table_2012_03_17_TestRelease",
  "TRY_Categorical_Traits_Lookup_Table_2012_03_17_TestRelease.xlsx"))
# Run TRY species names through WFO matching function
# Warning, this can take a while
try_wfo <- WFO.match(try_db$AccSpeciesName,
  WFO.data = WFO.data, Fuzzy = 0)
# Keep only unique species names
try_wfo_clean <- try_wfo %>%
  dplyr::select(
    species_orig = spec.name.ORIG,
    species_wfo = scientificName) %>%
  distinct()
# Add WFO matched species names to TRY data
try_db_wfo <- left_join(try_db, try_wfo_clean,
  by = c("AccSpeciesName" = "species_orig")) %>%
  rename(species_orig = AccSpeciesName)
# Filter TRY data to species in Tchamba's species list
# Keep the WFO name so the join below also works across synonyms
try_db_wfo_fil <- try_db_wfo %>%
  filter(species_wfo %in% x_wfo_clean$species_wfo) %>%
  dplyr::select(species_wfo, PlantGrowthForm) %>%
  distinct()
# Join growth form and taxonomic names back to original data table,
# matching on the WFO name rather than the original spelling
out <- x_clean %>%
  left_join(x_wfo_clean, by = c("species_sanit" = "species_orig")) %>%
  left_join(try_db_wfo_fil, by = "species_wfo") %>%
  dplyr::select(-species_ws, -species_sanit)
# Write table to CSV
write.csv(out, "./out.csv", row.names = FALSE)
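Once the table is written, it is worth checking how many species
actually received a growth form, since the TRY categorical table
will not cover everything. A minimal sketch on a made-up table with
the same PlantGrowthForm column as the script's output (the species
and values here are illustrative only):

```r
# Toy version of the output table (hypothetical values)
out_toy <- data.frame(
  species = c("Brachystegia spiciformis", "Julbernardia paniculata",
              "Aloe ferox", "Combretum molle"),
  PlantGrowthForm = c("tree", NA, "herb", NA),
  stringsAsFactors = FALSE
)

# Proportion of species that received a growth form
coverage <- mean(!is.na(out_toy$PlantGrowthForm))

# Species still needing a manual lookup
missing_gf <- out_toy$species[is.na(out_toy$PlantGrowthForm)]

print(coverage)
print(missing_gf)
```

The species left in missing_gf are the ones you would then need to
look up by hand, for example on the WFO website.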