Late Quaternary fluctuation in upper range limit of trees shapes endemic flora diversity on the Tibetan Plateau


Application of multi-source satellite data in mapping upper range limit of trees

To map the spatial distribution of the upper range limit of trees over the Tibetan Plateau, we first developed a fusion method to map forested extent for the year 2020 (30 m spatial resolution) by integrating multiple-data sources including optical, microwave, and LiDAR data. Forested regions were defined as regions with tree canopy cover larger than 10% because the extracted upper range limit of trees using this value had the highest correlation with in situ observations12.

Generating the spatial distribution of forest extent over the Tibetan Plateau

Existing publicly-accessible datasets on forest extent systematically underestimate the presence of sparse tree cover76, leading to a direct underestimation of the upper range limit of trees. As such, we have constructed a tripartite upscaling method that integrates in situ identification of landcover types, visual interpretations from Google Earth’s high-resolution (<5 m) images, and data from Landsat multi-spectral images, Sentinel-1A microwave data, and GEDI-derived canopy height. These data sources were integrated using a Convolutional Neural Networks (CNN)77 classification algorithm to produce a high-precision map of forest extent for the year 2020 at a 30 m resolution.

We used space-borne LiDAR measurements of canopy height over 39,675,123 valid laser shots covering a resolution of 25 m on the Tibetan Plateau, collected by the GEDI instrument aboard the International Space Station78. We first randomly selected 47,658 sites and visually interpreted their corresponding vegetation classes using high-resolution (<5 m) Google Earth images produced between May and September 2020 and finally identified a total of 26,541 forested sites. In addition, we also surveyed 346 forested sites in the field during the autumn of 2020 and measured their tree heights.

We then built a CNN classification model linking the landcover types (presence of forest) to structural (GEDI-derived tree height), texture79 (Sentinel-1A microwave-derived co-polarization and cross-polarization data), and spectral attributes80 (Normalized Difference Vegetation Index, and seven spectral reflectance bands from Landsat images) during the period from June to September across GEDI sites. We retained 10% of the data (4765 sites) for independent validation of our constructed CNN model (R2 = 0.89, P < 0.01). We then use the remote sensing data (except GEDI data) as predictors in the CNN classification model to generate a high-resolution (30 m) map of forests for the year 2020.

Generating spatial distribution of upper range limit of trees over the Tibetan Plateau

We then used the developed forest extent map as the input for forest edge detection based on the Canny algorithm81, which extracted the forest boundary by finding abrupt changes in the forest image. We removed the lower forest boundary and only maintained the upper forest edges by defining a local adaptive elevation threshold, below which pixels were removed following Wang et al.12. Specifically, the process begins with constructing a histogram of forest edge elevation within a specified window size centered on a geographic point. The choice of window size is based on empirical analysis (ranges from 0.1 to 1°), balancing the inclusivity of upper and lower edges. We then assessed the unimodality of the distribution within the window size by smoothing the histogram using a Savitzky-Golay filter and determining the number of Gaussian functions in the data. If the number of Gaussian functions exceeds one, we found a threshold to separate the upper and lower ones, based on the position that contains a 99% signal in a Gaussian distribution. Subsequently, we eliminated the lower forest edge and iterated through these steps until a unimodal distribution was achieved (i.e., the number of Gaussian functions equals one).

Validating satellite-derived upper range limit of trees

We validate the satellite-derived upper range limit elevation of trees using manually interpreted samples from high-resolution (HR) (<5 m) satellite imagery provided by the Google Earth platform. To create an equal probability sample, the number of sample points is directly related to the number of pixels that have their upper range limit tree observed by the satellite. We generated a geographical 1 km grid over the Tibetan Plateau, and then randomly selected ten validation samples at a 30 m resolution for each 1 km grid to ensure even coverage across the region. Six well-trained workers visually interpreted and cross-validated all samples using Google Earth and the ArcGIS platform. Furthermore, a single quality controller checked all the results. We therefore obtained 602,879 validation samples evenly distributed across the Tibetan Plateau, with 30,746 (5.1%), 510,035 (84.6%), and 62,098 (10.3%) distributed in the northern, southern, and inner regions, respectively. Validation shows that the satellite-derived range limit has a high degree of consistency with manual interpretations from Google Earth images (R2 = 0.98, Slope = 0.98, ME = 17 m; Fig. 1b).

Identification of drivers of current spatial patterns in the upper range limit of trees

Not all upper range limits of trees reached the thermal treeline position determined by temperature12,21,22. In reality, the upper range limit of trees is often lower than the thermal treeline position due to environmental stresses12,41 (e.g., critical water shortage) and disturbances12 (e.g., human activities). We first compared the satellite-derived upper range limit of trees to the thermal treeline position, and then evaluated which factors were responsible for the deviation of satellite-derived range limit from thermal treeline positions.

To do so, we first calculated the thermal treeline elevation using growing-season temperature at a spatial resolution of 1 km. The thermal treeline is the cold edge of the fundamental niche of trees21, and was calculated as the isotherm of the mean growing-season air temperature at 2 m above the ground (Tair) of 6.4 ± 0.7 °C37 using the WorldClim dataset and the mean growing-season land surface temperature (Tsurface; the radiative temperature of the land derived from infrared radiation emitted from the surface) of 7.6 ± 1.0 °C12 using the Terra Moderate Resolution Imaging Spectroradiometer (MODIS). A global meta-analysis of more than 30 thermal treeline sites from different climate zones showed that thermal treeline positions generally occur at a mean growing-season ground temperature of 6.7 °C at 10 cm depth, with a narrow amplitude of 2.2 °C23,37. Since there was a strong correlation12 between growing-season mean Tsurface and ground temperature at a 10 cm depth across these global treeline sites21, the treeline was then estimated to occur at the growing-season mean Tsurface of 7.6 ± 1.0 °C. In addition, a meta-analysis of thermal treeline sites at the global scale also showed that the start and end of growing season was defined as date when the ground temperature at 10 cm depth exceeds 3.2 °C23. Such ground-temperature threshold corresponds to MODIS-derived land surface temperature threshold of 0.7 °C. Therefore, the start and end of the growing season were determined as the date when daily Tair rises above 0.9 °C, the date at which it falls below 0.9 °C, and the daily Tsurface passes 0.7 °C, respectively. Furthermore, the growing season length must be longer than 94 days37.

We then calculated the deviation of satellite-derived upper range limit of tree elevation from thermal treeline elevation as DTreeline (the difference between the edges of the realized and the fundamental niche)12,22. Before calculating DTreeline, we aggregated satellite-derived upper range limit of trees elevation from the original 30 m to 1 km resolution by calculating the mean value in each 1 km grid cell. To understand the drivers of DTreeline, we compiled several variables (N = 19) belonging to the four different categories (climate limitation, disturbance, soil, and topography factors) (Supplementary Table 1), and then used a random forest model to rank the importance of these variables and calculate the partial contribution of each variable on DTreeline. The entire dataset, including cases where DTreeline ≥ 0 (N = 116,337), was used in our analysis. We generated the partial-contribution plot using the “forestFloor” package in R statistical software (http://cran.r-project.org/), to visualize how each independent variable impacts the prediction while controlling changes in all other variables (Supplementary Fig. 1b to t).

Constructing a climate-based predictive model of upper range limit of trees

In constructing the model, we first used a growing-season temperature threshold to represent the thermal treeline position, and then included several variables including environmental stresses and disturbances, to account for the observed deviation of the satellite-derived upper range limit from the treeline position (DTreeline). Thus, the modeled upper range limit of trees is the sum of the thermal treeline position and DTreeline due to environmental stresses and stochastic effects. We separated temperature from other climatic variables when modeling DTreeline, as there was significant covariation between temperature and other climatic variables, such as indicators of critical water shortages (e.g., vapor pressure deficit and cumulative water deficit), both of which are influenced by temperature.

In modeling DTreeline, we employed a recursive feature elimination method to identify the most relevant variables12. The error decreased when using the six most important variables in determining the spatial variation of DTreeline (Supplementary Fig. 11) but decreased by less than 1% when additional variables were included. We then constructed a parsimonious climate-DTreeline model by selecting the following six variables: spring cumulative climatic water deficit (CWD), spring vapor pressure deficit (VPD), cloud cover, minimum temperature of the coldest month, summer dehydration, and surface curvature (Supplementary Fig. 1 and Supplementary Table 1). The mechanisms for the potential impact of these variables on tree growth and establishment at thermal treeline positions were formulated as follows. For example, the water availability (represented by CWD and VPD) could constrain tree growth and become decisive for seedling establishment, therefore affecting the tree establishment at the thermal treeline position. The CWD, which is calculated as potential evapotranspiration minus precipitation, indicates the quantity of water needed to meet evaporative demand and measures the drought stress on plants, and VPD, as a function of relative humidity and temperature, represents an independent moisture stress on plant photosynthesis. Furthermore, summer dehydration, which is calculated as a product of wind speed, temperature, and relative humidity82, generally occurs during daylight hours under conditions characterized by high temperatures, low relative humidity and strong winds, thereby leading to trees transpiring moisture at a faster rate than their roots take it up from the soil. Such summer dehydration could further aggravate the moisture stress (CWD and VPD) on tree growth and tree establishment at thermal treeline positions. The relationship between cloud cover and tree growth at thermal treeline position is complex. The high cloud cover could reduce the solar radiation, and potentially undermine the plant photosynthesis and tree growth at thermal treeline positions. While the cloudy days are generally associated with the relatively high-water availability and reduced transpiration during the growing season; and this could reduce the radiative cooling and trap heat during the night, potentially allowing trees to survive extreme cold temperatures at the relatively high elevation24. The minimum temperature of the coldest month is used as the freezing stressor that could directly control the absence of trees from the thermal treeline position21. In addition to these macroclimatic drivers, we also include local topographical factors (e.g., surface curvature) as a surrogate of microclimate conditions, because of the potential deviation of microclimate from macroclimate at the high elevation. After including these variables, the parsimonious model achieved a high prediction power (R2 = 0.84, P < 0.01; Supplementary Fig. 2), and is less susceptible to errors in model extrapolation than full models considering all potential drivers. We then applied our parsimonious climate-DTreeline model to predict changes in DTreeline at a spatial resolution of ~1 km at 100-year intervals from LGM to the end of this century (mean over the period 2080–2099). All of these predicators from LGM to the end of this century were derived from CHELSA-TraCE21k, WorldClim, and CMIP6 model projections (see “Assembling climate variables”).

Reconstruction and prediction of changes in the upper range limit of trees

We reconstructed spatiotemporal dynamics of the upper range limit of trees spanning from 22 kyr BP to the end of this century at a spatial resolution of 1 km for every 100 years. Specifically, we determined the thermal treeline elevation using the Tair of 6.4  ±  0.7 °C37 and the Tsurface of 7.6  ±  1.0 °C12, with the growing season length longer than 94 days37, and then predicted changes in DTreeline due to regional environmental stresses using the parsimonious DTreeline model.

Assembling climate variables

Climate data from 22 kyr BP to 2100 CE was compiled from three different data sources: (1) monthly climate data from CHELSA-TraCE21k (Climatologies at High resolution for the Earth’s Land Surface Areas) at a temporal and spatial resolution of 100 years and 1 km, respectively, for the last 22,000 years53; (2) current climatology (2020 CE) from the WorldClim dataset83; and (3) future projections from 9 Earth system models at the end of this century (mean over 2080–2099) participating in CMIP6 under low (SSP1–2.6), intermediate (SSP2–4.5) and high emission scenarios (SSP5–8.5) (Supplementary Table 3). Here, SSPxy represents an integrated scenario of future climate and societal change, with a peak in radiative forcing at y W m−2 before 2100.

We used monthly CHELSA-TraCE21k at a spatial resolution of ~1 km and a temporal resolution of 100 years. This CHELSA-TraCE21k dataset was downscaled from the transient climate simulations (CCSM3 TraCE-21k) from the LGM to present day using the CHELSA V1.2 algorithm53. CHELSA is a high-resolution climate data set for the time period 1979–201384, which uses statistical downscaling of the ERA-interim reanalysis with a bias correction of monthly precipitation from Global Precipitation Climatology Centre. The downscaled paleoclimate data have included the impact of paleo-orography on temperature and precipitation fields given that topography could have changed drastically due to the ice retreat along poles and high mountain areas through time85. The CHELSA TraCE21k was shown to produce a reasonable representation of high-spatial-resolution temperature and precipitation through time, and simulations using this dataset could be capable of detecting effective LGM species refugia53. Additionally, we analyzed changes in orography since LGM and found that a decrease in elevation is mainly located in the river valleys of the southeast Tibetan Plateau and over Hengduan mountain ranges (Supplementary Fig. 12). By contrast, orography changes within the 1 km buffer around the current range limit of trees are almost negligible.

Note that we used WorldClim data to map the present-day upper range limit of trees, and estimated the paleo-range limit by combining the current climatology from WorldClim with paleo-climatic changes derived from CHELSA. Furthermore, we generated future climate data at a spatial resolution of ~1 km by downscaling and bias-correcting CMIP6 model simulations using WorldClim data. Therefore, discrepancies among different data sources to estimate dynamics in the upper range limit of trees could be well resolved through bias correction.

Validating changes in the upper range limit of trees using fossil pollen data analysis at the millennium scale

We then used fossil pollen assemblages preserved in lake sediments over the Tibetan Plateau to determine if the trees were present in the vicinity of the sampling site throughout the Holocene. By incorporating data from an elevational transect, this approach can be used to delineate the upper range limit of trees and their elevational changes over time. We compiled a list of taxonomically harmonized and temporally standardized fossil pollen data from lake sediments over the Tibetan Plateau86 (Supplementary Fig. 7). We only included fossil pollen abundance data that falls within a 500 km radius of the present forest communities, to minimize the risk of misinterpreting the occurrence of trees due to long-distance transport of exogenous arboreal pollen into lake sediments87. This filtering resulted in a total of six lakes distributed across a range of elevation from 3700 to 4500 m26,35,54,55,56,57 that had data for at least 500 years (Supplementary Table 4).

We concentrated on the four most common high-elevational arboreal tree taxa—Betula, Abies, Picea, and Pinus. These tree species have different production and dispersal strategies, which could greatly affect the representation and abundance of tree taxa in fossil pollen records88,89,90. For example, Betula and Pinus have high pollen production rates, good wind transportability and good preservation characteristics88,91,92,93. Therefore, even a relatively high percentage at a given elevation might report a false presence. Since there were no available estimates of pollen production for the main tree taxa on the eastern Tibetan Plateau, we then used the reliable threshold of pollen abundance to determine if one of the tree taxa was present in the vicinity of the sampling site. Specifically, the pollen abundance thresholds with different probabilities were determined by performing a logistic regression using pollen abundance (0–100%) as the predictor and taxon classification (0 and 1 represent negative and positive classes, respectively) as the response variable using 2434 modern pollen samples of the 14 key arboreal taxa from China and satellite-based observation and the Vegetation Atlas of China of modern trees taxa distribution89. We used the pollen abundance threshold at a probability of 0.9 to determine the presence and absence of trees. The thresholds for Betula, Abies, Picea, and Pinus are 16%, 5%, 9%, and 37%, respectively89. Upon identifying any one of the four tree taxa, we assumed that the trees were present in the vicinity of the sampling site. Not all pollen grains will be equally well preserved throughout the time due to their differing sensitivities to environmental changes94,95, and thus, some tree taxa with very low pollen production rates and/or poor preservation could be missed. However, the probability of such underrepresentation is unlikely in this study. For example, common tree species including Corylus, Nitraria, Tamarix, Cyperaceae, Poaceae, and Fabaceae, which were widely reported to be greatly under-represented in China based on 898 modern pollen sampling sites and 2115 pollen data96, were not found at the current realized range limit of trees over the Tibetan Plateau.

Quantifying beta-diversity of endemic alpine species and its drivers

We aimed to estimate the role of paleo fluctuations in the upper range limit of trees on the spatial pattern of beta-diversity. We initially obtained geographical coordinates and elevational ranges for endemic alpine species at the county level, using the data from the Flora of China97 and the Flora of Tibet (Supplementary Data 1). The selection of endemic species is based on the Chinese Endemic Species list, and our focus was primarily on alpine species whose uppermost distribution surpassed the current upper range limit of trees. In this way, we have identified 2111 endemic alpine species in over 100 counties of the Tibetan Plateau.

First, we calculated the β-diversity of endemic alpine species using indices from the Sørensen-based multiple-county dissimilarity measures59 (Supplementary Fig. 8a). The Sørensen-based multiple-county dissimilarity indices were calculated using the following Eq. (1),

$$\beta -diversity=\frac{a+b}{2\times [{\sum }_{i}{S}_{i}-{S}_{T}]+a+b}$$

(1)

$$a={\sum }_{i < j}\min ({b}_{ij},{b}_{ji})\, and\,b={\sum }_{i < j}\max ({b}_{ij},{b}_{ji})$$

For a given focal county \(i\) with N neighbors, \({S}_{i}\) is defined as the total number of endemic alpine species in \(i\) (alpha diversity from endemic species) and \({S}_{T}\) is the total number of endemic alpine species in all neighbors (gamma diversity). \({b}_{ij},{b}_{ji}\) are the number of endemic alpine species only present in the county \(i\) and \(j\), respectively. We calculated β-diversity for all 100 counties selected.

We further partitioned β-diversity into two separate and antithetic components: dissimilarity due to species replacement and dissimilarity due to nestedness using Eqs. 2 and 359. The species replacement component refers to the replacement of some species with others among counties98, and there is no difference in species richness among counties. On the other hand, the nestedness component reflects differences in species richness, when species-poor assemblages are nested in species-rich assemblages. Different counties would then have different species compositions in the absence of species replacement.

$$species\,replacement=\frac{a}{[{\sum }_{i}{S}_{i}-{S}_{T}]+a}$$

(2)

$$nestedness=\frac{b-a}{2\times [{\sum }_{i}{S}_{i}-{S}_{T}]+a+b}\times \frac{{\sum }_{i}{S}_{i}-{S}_{T}}{[{\sum }_{i}{S}_{i}-{S}_{T}]+a}$$

(3)

The accuracy of the Sørensen index to estimate beta-diversity was recently suggested to be heavily dependent on species prevalence, with high species prevalence generally yielding misleading inferences about species associations99. We then quantified the extent to which species prevalence affected the Sørensen index in estimating β-diversity on the Tibetan Plateau, using a statistical mathematical metric generally used for evaluating pairwise similarities instead of assessing similarities among multiple (>2) species/sites. The prevalence of pair-wise species is estimated by calculating the log odds ratio of the conditional occurrence probability (the occurrence rate of pair-wise species in total counties)99. The endemic alpine species on the Tibetan Plateau have a generally low prevalence of 0.04 ± 0.05 (Supplementary Fig. 13a), and thus, it is unlikely that pairwise species would invalidate the interpretations of the traditional Sørensen-based metrics of species co-occurrence. This is confirmed by our analysis showing that there is a good correspondence (positive correlation) between the affinity metric of co-occurrence (Alpha MLE) that is not affected by species prevalence99 and the most popular co-occurrence metrics such as those due to Jaccard (R = 0.75, P < 0.01), Sørensen-Dice (R = 0.75, P < 0.01), and Simpson (R = 0.83, P < 0.01) (Supplementary Fig. 13b–d). We thus confirmed the robustness of using Sørensen-based multiple-county dissimilarity measures to estimate β-diversity.

Second, in terms of paleo-temporal fluctuations in the upper range limit of trees and climate drivers, we only retained the high-frequency component of the paleo time series. Specifically, we decomposed the original time series into a finite set of oscillatory components called Intrinsic Mode Functions (IMFs) and a residual trend component, using Empirical Mode Decomposition method100. IMF1 stands for the first IMF obtained during the decomposition process, which is the dominant oscillatory component extracted from the original time series of each variable. The standard deviation of the IMF1 for each county was used to represent paleo temporal fluctuations in the upper range limit of trees and climatic drivers (growing-season temperature and precipitation). In addition, to assess whether warming-induced upslope tree expansion would potentially lead to habitat loss, we also calculated the alpine land area by summing the surface area above the mean elevation of the reconstructed upper range limit of trees in each county, using a 30-m spatial resolution digital elevation model101. For each mountain defined in the GMBA mountain inventory v2.0 dataset102, we created a 100 km radius buffer around a target upper range limit pixel, and then summed up the surface area of all 30 m resolution pixels with elevations higher than that of the target pixel.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.



Source link

More From Forest Beat

Airborne imaging spectroscopy surveys of Arctic and boreal Alaska and northwestern...

Miller, C. E. et al. The ABoVE L-band and P-band airborne synthetic aperture radar surveys, Earth Syst. Sci. Data 16, 2605–2624, https://doi.org/10.5194/essd-16-2605-2024 (2024).Article  ...
Biodiversity
8
minutes

Snow Leopard habitat vulnerability assessment under climate change and connectivity corridor...

Thomas, C. D. et al. Extinction risk from climate change. Nature 427, 145–148 (2004).Article  ADS  CAS  ...
Biodiversity
11
minutes

Species responses to weather anomalies depend on local adaptation and range...

Degree of local adaptationWe used count data from 34 butterfly species whose populations have been previously seen to show a clear response to...
Biodiversity
11
minutes

Ambitious changes to Canadian conservation law are needed to reverse the...

Canada’s biodiversity is in decline. Globally, climate change, urbanization, overexploitation of resources and habitat loss are combining to drive...
Biodiversity
4
minutes
spot_imgspot_img