Global divergence in plant and mycorrhizal fungal diversity hotspots


Alpha diversity data layers

We used previously published global predictions of AM and ECM fungal richness from Van Nuland et al.26 and Mikryukov et al.29, and vascular plant richness from Sabatini et al.28 and Cai et al.27. Both of the fungal richness studies generated predictions at 30 arc-s resolution (~1 km2 at the equator). Van Nuland et al.26 used random forest models trained on ITS sequences for ECM fungi and SSU sequences for AM fungi from the GlobalFungi and GlobalAMFungi database52,54. Mikryukov et al.29 used boosted regression trees with sequence data from the Global Soil Mycobiome consortium (https://gsmc-fungi.github.io/). For vascular plants, Sabatini et al.28 modelled forest and non-forest vascular plant richness separately at 2.5 arc-min resolution (~4.63 km2 at the equator) using boosted regression trees with data from the vegetation plot database ‘sPlot’ from multiple plot sizes ranging between 10 m2 and 10,000 m2 (www.idiv.de/splot). We selected the joined forest and non-forest prediction that was based on data from plot sizes of 1000 m2. Cai et al.27 built vascular plant richness models using the regional checklists of native vascular plants in the GIFT database55 as training data and five modelling methods to generate consensus predictions for 7774 km2 hexagons (generalised linear models, generalised additive models (GAMs), random forest models, extreme gradient boosting, and neural networks). We selected the interpolated ensemble prediction raster (average of the five modelling methods weighted by model accuracy). Both vascular plant layers were provided by the authors resampled at 30 arc-s resolution. We reprojected all layers to Equal Earth projection at 1 km2 resolution using the ‘terra’ package version 1.7-7856 in R version 4.4.157.

Modelled geospatial layer predictions contain uncertainty from factors such as training data sampling bias, model misspecification, and error propagation through predictive covariates26,31. Therefore, we created uncertainty masks to exclude grid cells with either high model uncertainty in the original richness layers (as calculated by the original authors) or strong disagreement between the predictions produced by different studies. Uncertainty was measured differently across studies. Van Nuland et al.26 quantified fungal richness uncertainty as the coefficient of variation (CV; the standard deviation divided by the mean) between 100 random bootstrap replicates (stratified across biomes), while Mikryukov et al.29 calculated uncertainty as the standard deviation (SD) from 10-fold cross-validation. For plant diversity by Sabatini et al.28, uncertainty was the percentage ratio between the interquartile distance (IQR) and the median of species richness estimates across 99 stratified resampled subsets of the data (using stratum based on unique combinations of realms, biomes, and plot sizes). For Cai et al.27, uncertainty was the CV of predictions produced by the five different modelling methods. For all layers, we created masks to exclude grid cells with uncertainty values in the top 95th percentile (using the ‘quantile’ function from the ‘terra’ R package; see Supporting Information Table S3 for the 95th percentile thresholds for each layer).

We generated consensus richness predictions for the two layers in each taxonomic group by (1) log-transforming the richness values of rasters that had not previously been transformed after adding 1 as a constant (which is consistent with the approach used by Mikryukov et al.29), (2) standardising (i.e. centreing and scaling) each raster layer to have zero mean and unit standard deviation, and (3) calculating the mean value for each grid cell. We also calculated the CV across the two raster layers within each taxa for each grid cell (i.e. in the case of two raster layers, the CV is half the absolute difference divided by the mean). We then excluded regions with CV values in the top 95th percentile (i.e. those with predictions that strongly disagreed between the studies). Standardising the richness layers meant that the mean contained negative values which prevents calculation of the CV, and so we added a constant to both layers (the minimum value across both) to make all values positive prior to calculating the CV. Whilst the size of the constant can influence the magnitude of the CV, this is not the case when only two values are used in the CV calculation. Additionally, the rank order of CV values remains the same regardless of the constant, and this information was what was required to exclude the top 95th percentile. We chose to use the CV rather than the SD or range difference to ensure that we did not bias exclusion towards grid cells with higher mean richness values that may contain a greater SD. Grid cells excluded were those with CV values > 0.15 for vascular plants, >0.51 for AM fungi, and >0.24 for ECM fungi (Supporting Information Table S3). This means that no grid cells were included where the standard deviation of the two layers was more than 51% of the mean.

The relevant uncertainty masks were applied to all richness layers prior to extracting data and conducting the analyses described below. For plant–AM fungal comparisons, the masks included the two plant richness uncertainty masks (from Sabatini et al.28 and Cai et al.27), the two AM fungal uncertainty masks (from Van Nuland et al.26 and Mikryukov et al.29), regions where the two plant layers disagreed, and regions where the two AM fungal layers disagreed. Richness layers used in plant–ECM comparisons were masked in the same way but with the ECM-relevant alternatives. Therefore, plant richness layers were masked differently depending on if they were involved in AM or ECM fungal comparisons, and so statistics calculated from extracted data differ slightly. A total of 67.8% and 68.6% of available global terrestrial grid cells remained included after applying uncertainty masks for AM and ECM fungal analyses, respectively (Supporting Information Table S4). At least 50% of grid cells were retained within most biomes (Table S4), except for Montane Grasslands (43.7% for AM analyses and 41.1% for ECM analyses) and Deserts (21.1% for AM and 32.7% for ECM) (Supporting Information Table S4). After masking, Spearman correlations between data from the two geospatial layers of each taxon ranged between 0.45 and 0.68 (Supporting Information Fig. S13).

Richness correlations

We used Spearman rank correlation coefficients (to allow for non-linearity in monotonic relationships) to assess alpha diversity correlations between (i) plant and AM fungal richness and (ii) plant and ECM fungal richness. For the main analysis, we calculated correlations using the consensus richness predictions created from averaging the two predictions within each taxonomic group (described above). Analysis of how correlations differed between the individual richness layers is provided in Supporting Information Fig. S14. Correlations were calculated at the global, biome and ecoregion levels15 and based on richness values from 10,000 random grid cells selected across the globe, 10,000 cells selected within each of the 14 biomes, and up to 1000 cells selected within each of the 846 ecoregions using the ‘spatSample’ function from the ‘terra’ R package. Fewer grid cells were used for ecoregions because of their smaller size. For ecoregions with fewer than 1000 grid cells, all grid cells were used, and only ecoregions with at least 100 grid cells were included (766 ecoregions for AM analyses and 770 for ECM analyses).

We also examined plant–fungal richness relationships using GAMs, to allow for more complex non-linear relationships (e.g. quadratic relationships). We used the ‘gam’ function from the ‘mgcv’ R package version 1.9-158, and modelled fungal richness as a function of plant richness (separately for AM and ECM fungal richness) within each ecoregion, using the randomly selected ecoregion grid cells described above. We used a Gaussian distribution, REML as the smoothing parameter estimation method, thin plate regression splines, and three basis functions (k). Other parameters were set at function defaults. We compared the percentage deviance explained by the model for each ecoregion to the absolute value of the Spearman correlation coefficient described in the previous paragraph, by calculating Spearman rank correlation coefficients of the two metrics. The metrics were correlated r = 0.86 for the AM analysis and r = 0.87 for the ECM analysis (Supporting Information Fig. S3).

We examined how richness correlations might be impacted by underlying uncertainty in the original richness geospatial layers to evaluate whether error could lead to spurious correlations being identified. To do this, we used the model uncertainty layers provided by the original authors (those described above that were used to make the uncertainty masks) to evaluate how correlations change if richness values were to fall anywhere within of SD of the original model bootstrap replicates (or SD of the five model types in the case of Cai et al.27). When the CV was provided as the uncertainty layer, we back-transformed values to obtain the SD by dividing by 100 and multiplying by the mean. Sabatini et al.28 provided uncertainty as the percentage ratio between the IQR and the median. Therefore, we divided by 100, multiplied by the median, and multiplied by the constant 1.35 to approximate the SD. This approximation relies on the bootstrap replicates being normally distributed which we were unable to confirm. However, this approximation was required to make this layer comparable to the others. We also treated the median richness raster from Sabatini et al.28 as equivalent to the mean richness rasters of the other layers. We generated a distribution of possible correlations that could arise given the SD around the mean of each grid cell by (1) taking the 10,000 random grid cells of each of the four richness geospatial layers used to calculate the original correlations and adding a random value (selected from a uniform distribution) between plus and minus the SD of the grid cell, (2) log-transforming the new error-adjusted richness values (except for the Mikryukov, et al.29 layers that were log-transformed prior to calculating the provided mean and SD), (3) standardising the log-transformed error-adjusted values around a mean of 0 and SD of 1 (based on the range of values present in the full raster layer, not just the 10,000 random cells), (4) calculating the average of the adjusted values of the two geospatial layers for each taxonomic group, and (5) calculating Spearman rank correlation coefficients between the adjusted plant and fungal richness values. We repeated these steps for both plant–AM and plant–ECM fungal richness correlations 1000 times and examined the distribution of correlation values returned. We did this both at the global level and within each of the biomes. As expected, results showed that adding error generally weakened correlations, but no iterations produced correlations with the opposite sign to that calculated in the original analysis (Supporting Information Fig. S11).

We also evaluated how the correlations calculated within ecoregions compare to those calculated using ground-sourced data. Paired plant and fungal richness data are rare at the global level, but continental-scale data on vascular plant and ECM fungal richness is available from the US NEON. Unfortunately, we were unable to obtain reliable AM fungal richness data from the same plots. We obtained the processed ECM fungal richness data from Qin et al.59, and matched that with vascular plant data for the same plots downloaded from the ‘plant presence and percent cover’ data product (DP1.10058.001) using the ‘neonUtilities’ R package version 2.4.260. We counted the number of unique taxon IDs present within each plot as the plant species richness. Data were available for 426 plots from 28 ecoregions across the US. However, most ecoregions contained limited replication, and so we calculated Spearman plant–fungal correlations for the 12 ecoregions that contained at least 10 plots. NEON plot plant–fungal richness correlations were positively related to those calculated in this study (Pearson’s r = 0.37, Supporting Information Fig. S12), providing support that our geospatial layers are approximating on-the-ground patterns. However, this relationship was non-significant (linear model: t(1,10) = 1.3, p = 0.23, estimate = 0.20, 95% confidence interval = −0.15–0.55) due to the small number of ecoregions tested.

Mechanisms driving richness correlations

We performed multiple analyses to look for evidence supporting the potential mechanisms driving correlation patterns outlined in Fig. 1. As a measure of host plant prevalence (Fig. 1B), we used geospatial layers of the percentage of aboveground plant biomass belonging to AM or ECM host plants33. For analyses relating to past climate stability, we used the climate stability index24 which maps the variability of bioclimatic variables between the Pliocene (3.3 Ma) and the present. To assess human disturbance effects, we followed Van Nuland et al.26 and summed the geospatial layers from EarthEnv61 of the percentage of each grid cell covered by urban/built-up areas and cultivated/managed vegetation to create a measure of ‘human development percentage’. This metric provides an indication how disturbed the landscape is by both urbanisation and human agricultural/land management activities. Other environmental geospatial layers included mean annual temperature, mean annual precipitation (both from CHELSA62,63), soil pH, soil organic carbon content (SOC), soil nitrogen (all from the upper 5 cm of soil64), Olsen plant-available phosphorous65, and elevation (EarthEnv66). All layers were reprojected to Equal Earth projection at 1 km2 resolution to match the alpha diversity layers.

Symbiosis effects and legacy effects

To test whether plant–fungal symbiosis could be driving correlations, we calculated the percentage of variation in fungal richness that could be directly attributed to changes in plant richness after accounting for variation explained by environmental covariates (as a test of Fig. 1A). We conducted GAMs to account for non-linearity between fungal richness and some environmental variables (using the ‘mgcv’ R package version 1.9-158; Fig. S6). We used values from the 10,000 randomly selected global grid cells, and modelled fungal richness (separately for AM and ECM) as a function of plant richness including temperature, precipitation, pH, SOC, soil nitrogen, soil phosphorus and elevation as covariates. These covariates were chosen due to their expected influence on plant and fungal richness patterns and their availability as global geospatial layers. We used a Gaussian distribution, REML as the smoothing parameter estimation method, thin plate regression splines, and three basis functions (k). Other parameters were set at function defaults. We used the ‘gam.check’ function to check model diagnostic plots and assist with selecting k. Percentage deviance in fungal richness explained by plant richness within each model was calculated as follows based on Brunbjerg et al.67:

$$ {Deviance\; explained} \\ =\frac{{deviance}\left({reduced\; model\; without\; plant\; richness}\right)-{deviance}({full\; model})}{{deviance}({null\; model\; with\; intercept\; only})}$$

(1)

Additionally, we used GAMs to model how correlations within each ecoregion vary according to the prevalence of potential host plants (as a test of Fig. 1B), or legacy effects relating to past climate stability and human disturbance (tests of Fig. 1D, E). Plant–AM fungal and plant–ECM fungal richness correlations at the ecoregion level were used as response variables, and were modelled as a function of aboveground host plant vegetation biomass (AM plant biomass for the AM model and ECM plant biomass for the ECM model), the climate stability index, and the human development percentage. Covariates of temperature, precipitation, pH, SOC, soil nitrogen, soil phosphorus, elevation, raw fungal richness and raw plant richness were also included to account for variation explained by non-symbiosis or non-legacy effects. Covariate variable data were ecoregion means of the same randomly sampled grid cells within each ecoregion (up to 1000) that were used to calculate richness correlations. Correlations between covariates were <|0.64|, except for precipitation and soil pH (r = −0.73), precipitation and plant diversity (r = 0.75), SOC and soil nitrogen (r = 0.76), and ECM fungal richness and ECM host vegetation percentage (r = 0.79). GAM parameters were as described in the previous paragraph, except that four basis functions were used.

Environmental effects

We tested whether the direction of plant–fungal richness correlations (positive or negative) could be influenced by underlying relationships between richness and environmental covariates by examining whether positive plant–fungal correlations are more likely to occur when plant and fungal richness respond similarly to environmental gradients, and negative correlations are more likely when richness responses differ. Within each ecoregion we fitted two linear models — one modelling plant richness and one modelling fungal richness—to extract model estimates of the relationship between richness and environmental gradients. We chose linear models instead of GAMs so that we could extract directional (positive or negative) relationship estimates. We tested mean annual temperature, mean annual precipitation, and soil pH. We did not test SOC, soil nitrogen, soil phosphorus, or elevation due to high collinearity between these and other variables within many ecoregions. Richness of the opposite group (fungal richness for plant models and plant richness for fungal models) and the percentage of plant biomass belonging to potential host vegetation33 were included as covariates to remove variation due to potential host–symbiont coupling effects. Environmental data came from the same randomly selected grid cells (up to 1000) within each ecoregion that were used to calculate plant-fungal richness correlations. In addition to excluding ecoregions with fewer than 100 grid cells, we excluded ecoregions where environmental predictors contained fewer than 10 unique values (700 out of 867 ecoregions remained). We confirmed that response variables met normality assumptions in all ecoregions by flagging potentially skewed richness data using the ‘shapiro.test’ R function from the ‘stats’ package version 4.4.157 and checking histograms. All variables were standardised to range between 0 and 1 within each ecoregion prior to conducting the model so that model estimates were comparable between variables.

We extracted the model coefficient estimates for each ecoregion, and then performed Chi-squared tests (‘stats’ R package version 4.4.1) for each environmental variable to test the hypothesis that ecoregions where plant and fungal richness were related to that variable in the same direction (i.e. coefficient estimates were either both positive or both negative) contained positive plant–fungal correlations, and ecoregions with opposite model estimates contained negative plant–fungal correlations. Only coefficient estimates with p values < 0.05 were included in Chi-squared tests because non-significant estimates were considered unreliable. Some ecoregions had high covariate collinearity; 9.0% (AM models) and 9.1% (ECM models) of ecoregions contained covariates with Pearson’s r correlations > |0.85|, and 20.3% (both AM and ECM) contained correlated covariates of r > |0.8|. We recalculated Chi-squared values after removing these ecoregions, and patterns remained relatively stable (Supporting Information Figs. S15 and S16), indicating covariate collinearity did not strongly influence the overall pattern.

Richness hotspots

Alpha diversity hotspots were defined as the grid cells with richness values greater than the upper 95th percentile, and were calculated separately for the two geospatial layers within each taxonomic group after applying the relevant uncertainty masks (described above). 95th percentile cut-off values were calculated using the ‘quantile’ function from the ‘terra’ R package. We also calculated hotspots using 80th, 85th, 90th, and 98th percentile cut-off values to see how hotspots varied when a different threshold was chosen (Supporting Information Fig. S17). We chose to use the 95th percentile threshold in the main analysis because this follows methods used by past studies26,28,30. There were some differences in the exact grid cells identified as hotspots between the two layers (Fig. 1 and Supporting Information Fig. S7 and Table S2), however they generally reflected the same geographical regions. All grid cells identified by either one or both studies were considered hotspots and were used to assess hotspot overlap between (i) plants and AM fungi and (ii) plants and ECM fungi. We mapped hotspots and calculated hotspot overlap both at the global level and within each of the 14 biomes.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.



Source link

More From Forest Beat

Harnessing eDNA technology to identify fish diversity and distribution in the...

Sequence statistics and fish compositionIn this study, a total of 4,594,782 high-quality sequences (2,457,586 in April and 2,137,196 in September) were obtained after...
Biodiversity
7
minutes

Here’s how you can make your garden a safe and biodiverse...

Biodiversity is essential to mitigating and adapting to climate change, enhancing the resilience of ecosystems and safeguarding the ecological...
Biodiversity
4
minutes

Reconsidering space-for-time substitution in climate change ecology

Laboratory of Tree-Ring Research, University of Arizona, Tucson, AZ, USAMargaret E. K. EvansDepartment of Wildland Resources, Utah State University, Logan, UT, USAPeter B....
Biodiversity
0
minutes

Rethinking composite quantification by capturing biological and ecological diversity across multiple...

We propose new measures to expand the conceptual and methodological framework of life–environment diversity. Our approach combines the variability in community structure with...
Biodiversity
30
minutes
spot_imgspot_img