A big data approach for detecting macroscale vegetation patterns

Post provided by Buntarou Kusumoto & Yasuhiro Kubota

Vegetation survey on a computer. Credit: Buntarou Kusumoto.

This Behind the paper post refers to the article “Community dissimilarity of angiosperm trees reveals deep-time diversification across tropical and temperate forests” by Kusumoto et al., published in the Journal of Vegetation Science (https://doi.org/10.1111/jvs.13017).

Field surveys are the fundamental way to get to know vegetation, and vegetation scientists generally love doing them. So do we. But vegetation scientists can also collect information in another way: by digging up, rebuilding, and reusing past data accumulated around the world. This may not be as much fun as fieldwork, but it is effective, particularly for macroscale studies.

We are a team of field ecologists, Biodiversity and Conservation Biogeography Japan (https://bcb-japan.weebly.com/), at the University of the Ryukyus, Japan. All members started out as “pure” field workers, and we share an interest in hierarchical processes, from macro to local scales and from past to present, as a way to better understand local community assembly mechanisms.

In April 2012, we started collecting published papers containing information on community structure for plants and other organisms, using online search engines (e.g. Web of Science). It was easier said than done. We retrieved 92,678 papers, manually checked their content, and sorted them according to whether or not each paper contained usable information. We worked on it for half a year and lost count of how many times we clicked the mouse, but we got it done. After another half a year, in March 2013, we had nearly finished the data entry, cleaning, georeferencing, and taxonomic standardization, and we started the analyses. The sense of accomplishment was different from the one fieldwork gives us.

Using this dataset, we wrote several papers, two of which were published in the Journal of Vegetation Science (Ulrich et al. 2016; Kubota et al. 2018). In Ulrich et al. (2016), we focused on the shape of the species abundance distribution (SAD). We found that more even and log-series type SADs are more dominant at lower latitudes (tropical regions) than at higher latitudes. This suggests that tree communities in tropical regions are more open and input-driven than those in temperate regions. In Kubota et al. (2018), we focused on the phylogenetic structure of tree communities and found region-specific patterns in phylogenetic community structure (phylogenetic clustering/over-dispersion) and in phylogenetic beta diversity among communities (see our video abstract at https://youtu.be/vSum-t8AxIo). We concluded that climatic filtering played an important role in sorting species from the global species pool, while geographical filtering (or dispersal limitation) shaped region-specific diversity patterns.

In our new JVS paper (Kusumoto et al. 2021), we analyzed the relationships between the compositional dissimilarity of angiosperm trees and spatial/environmental distances in seven biogeographical regions: South American, African, Indo-Pacific, Australian, North American, West Eurasian, and East Eurasian. We also reorganized the dataset by plant taxonomic rank: species, genus, family, and order. In all regions, species-level turnover was the dominant component of the dissimilarity patterns, and it saturated rapidly with increasing spatial distance. The shape of the dissimilarity-to-distance curves depended on the biogeographical region and the taxonomic rank: turnover saturated more rapidly in tropical than in temperate regions, nestedness components were relatively more important in temperate regions, and the curves were flatter at the higher taxonomic ranks (family and order). Our results suggest that historical diversification over deep evolutionary time has left region-specific imprints on extant diversity patterns.
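For readers unfamiliar with the turnover and nestedness components mentioned above, here is a minimal sketch of one widely used pairwise partition: Sørensen dissimilarity split into a Simpson turnover term and a nestedness-resultant term. This is an illustration under stated assumptions and not necessarily the exact procedure of the paper. The function name `partition_dissimilarity`, the helper `to_genus`, and the two site lists are hypothetical and made up for this example.

```python
def partition_dissimilarity(site_1, site_2):
    """Split pairwise Sorensen dissimilarity into turnover and nestedness.

    Assumes presence/absence data; both sites must hold at least one taxon.
    Returns (sorensen, turnover, nestedness), with sorensen = turnover + nestedness.
    """
    set_1, set_2 = set(site_1), set(site_2)
    if not set_1 or not set_2:
        raise ValueError("each site needs at least one taxon")

    shared = len(set_1 & set_2)   # taxa present at both sites
    only_1 = len(set_1 - set_2)   # taxa unique to site 1
    only_2 = len(set_2 - set_1)   # taxa unique to site 2

    sorensen = (only_1 + only_2) / (2 * shared + only_1 + only_2)
    smaller_unique = min(only_1, only_2)
    turnover = smaller_unique / (shared + smaller_unique)  # Simpson dissimilarity
    nestedness = sorensen - turnover
    return sorensen, turnover, nestedness


def to_genus(species_list):
    """Collapse binomial names to genus (first word); a simplifying assumption."""
    return {name.split()[0] for name in species_list}


# Hypothetical site lists, invented for illustration only.
site_a = ["Quercus alba", "Acer rubrum", "Fagus grandifolia"]
site_b = ["Quercus rubra", "Acer rubrum", "Fagus grandifolia", "Acer saccharum"]

species_level = partition_dissimilarity(site_a, site_b)
genus_level = partition_dissimilarity(to_genus(site_a), to_genus(site_b))
print("species:", species_level)
print("genus:  ", genus_level)
```

In this toy case the two sites differ at the species level but share all their genera, so the dissimilarity disappears once the lists are aggregated to genus. This is the same kind of aggregation that reorganizing a dataset by taxonomic rank performs.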

We believe a big data approach based on past data is an effective way to detect ecological patterns at an unprecedented scale, and to infer ecological mechanisms along with historical imprints. We will continue to digitize the past on our PCs by digging up historical vegetation sampling records, while also enjoying vegetation research in the real field.


  • Kubota, Y., Kusumoto, B., Shiono, T., & Ulrich, W. (2018) Environmental filters shaping angiosperm tree assembly along climatic and geographic gradients. Journal of Vegetation Science, 29, 607-618. https://doi.org/10.1111/jvs.12648
  • Kusumoto, B., Kubota, Y., Baselga, A., Gómez‐Rodríguez, C., Matthews, T. J., Murphy, D. J., & Shiono, T. (2021) Community dissimilarity of angiosperm trees reveals deep‐time diversification across tropical and temperate forests. Journal of Vegetation Science, 32, e13017. https://doi.org/10.1111/jvs.13017
  • Ulrich, W., Kusumoto, B., Shiono, T., & Kubota, Y. (2016) Climatic and geographic correlates of global forest tree species–abundance distributions and community evenness. Journal of Vegetation Science, 27, 295-305. https://doi.org/10.1111/jvs.12346

Brief personal summary: Buntarou Kusumoto and Yasuhiro Kubota are Japanese ecologists who are interested in the hierarchical mechanisms behind biodiversity patterns from local to global scales, and in effective conservation planning based on scientific evidence.