Terrestrial land cover shapes fish diversity in a major subtropical river catchment


The study was conducted in the Chao Phraya River catchment located in Northern and Central Thailand, covering rivers in both mountainous and plain landscapes. We combined fish diversity data from eDNA sampling in the rivers (elevation ranging from 2 m to 509 m a.s.l.) and a land use and land cover map across the 160,000 km2 catchment (Fig. 1).

Environmental DNA sampling and fish data

Environmental DNA (eDNA) metabarcoding has become a standard approach for aquatic biodiversity sampling and monitoring due to its high efficiency and significant reduction in fieldwork intensity50. A meta-analysis has revealed that there is a very good congruence in gamma diversity between eDNA sampling for fish species and sampling using traditional sampling methods (on average an almost 1:1 match on richness, and a >50% congruence on the specific fish species found by both methods, and an additional ~25% of species found by either eDNA and traditional monitoring only), showing the potential of replacing traditional especially invasive sampling methods25. In this research, the fish data were derived from an eDNA metabarcoding study with methodological details published therein26. Briefly, eDNA sampling was carried out in 2016 at 39 sites in major river channels, during the dry season under base-flow conditions. At each site, six samples were collected from the left bank, channel center, and right bank, respectively (two replicates each; 234 samples in total). Following on-site filtration (600 mL water in total per sampling site), DNA was extracted using standardized methods, and metabarcoding analyses were carried out using two separate molecular assays based on the mitochondrial 12S region, subsequently referred to as the Kelly primers for vertebrates and the MiFish primers for fish27,28. To improve the accuracy of sequence assignment, we created a customized reference library database from GenBank targeting fish species known to occur in the Chao Phraya River catchment, according to OEPP Biodiversity Series Vol. 4 Fishes in Thailand51 and the Checklist of Freshwater Fishes of Thailand (http://www.siamensis.org/). During bioinformatic processing, sequences were assigned to a total of 108 fish taxa (mostly at species level, so referred to as fish species), with 82 and 93 taxa recovered using the Kelly primers and MiFish primers, respectively. Differences in species communities between the two assays can largely be accounted for by unequal representation of the respective DNA regions in the reference database and differences in species-level resolution26. We merged these two datasets by choosing the higher read counts at each site for each fish species and calculated species richness. The list considered included native and naturalized species, part of which were also used in aquaculture. We further matched the detected fish species with the Red List of the International Union for Conservation of Nature (IUCN), having removed any alien or invasive species. In total, seven species were identified as critically endangered (CR), endangered (EN), vulnerable (VU), or near threatened (NT), and were treated as species of conservation concern (SPCC) in our fish data (Table S1).

Land use and land cover (LULC) data

We used the European Space Agency Climate Change Initiative (ESA CCI) land cover map, with a yearly interval (website: https://www.esa-landcover-cci.org/)23. Specifically, we used the 300 m-resolution map from 2016 to temporally match our fish data. Among the 36 classes in the classification system, 22 classes—including croplands, forests, shrublands, grasslands, and urban areas—were observed in the Chao Phraya catchment.

To alleviate uncertainties from rarely observed LULC types, we excluded or merged those LULC types occupying <0.2% of the area. Further, we removed all areas covered by water (lakes, reservoirs, and rivers), so that only terrestrial LULC types were used. We then recoded this LULC map into a five-class-system map, comprised of rainfed cropland, irrigated cropland, forest, shrubland- and grassland, and urban areas (Fig. 1). A detailed recoding table is shown in Table S2.

River channel and catchment data

To improve spatial modeling performance, we adopted the three-arc-second resolution (~92 m at the equator) HydroSHEDS (version 1) flow direction map to calculate potential catchments for sampling sites29. The value of a given pixel in the water flow direction map represents the direction of water flow for eight neighboring pixels (i.e., the eight possible values represent eight directions of water flow). Therefore, for each sampling site, one can track all the pixels in the map and determine if one pixel ultimately flows through the sampling site. The cumulative flow distance from the pixel to the sampling site can also be calculated at the same time. Then, all the pixels that flow through the sampling site are connected as a catchment map for that sampling site.

Following this approach, we produced catchment maps of all sampling sites with flow distances calculated by the haversine formula (see Supplementary Text) to improve the modeling accuracy. Next, the catchment maps with flow distance were resampled to match the LULC data and then were used in the FishDiv-LULC model introduced below as dij (the flow distance between a catchment pixel j and sampling site i). The major river channels (blue lines in Fig. 1) were extracted using a threshold drainage area of ~810 km2 (100,000 pixels). We also extracted the long-time (1970–2000) average river discharge (unit: m3/s) from the HydroSHEDS database, hereafter referred to as river discharge (Q), for all sampling sites and major river channel pixels.

In this study, we chose the ESA CCI LULC data with a resolution of 300 m, but not a map with higher spatial resolution, mainly because of the uncertainties in depicting river channels in this 90 m HydroSHEDS map, even though it was one of the most accurate databases available. We noticed that the meandering river channels in the mid- to lower reaches are not absolutely precise, with spatial uncertainties of up to several pixels, which introduced non-negligible uncertainty into the structure of river networks. For these reasons, we manually checked the location of eDNA sampling sites and tried to minimize the uncertainty in catchment calculations.

Modeling terrestrial LULC effects on fish species richness (FishDiv-LULC model)

We developed a spatially explicit modeling framework to assess terrestrial LULC effects on riverine fish species richness, considering both the spatial range and magnitude of LULC effects52. Briefly, we hypothesize that the fish species richness at a sampling site equals the integral of LULC effects within the site’s catchment (with a distance decay framework) plus a baseline estimation. The following modeling and computation are carried out at the pixel level because of the raster format of data used in this study. We have also specified units for all the parameters in the model so that one can directly perceive the actual meaning behind.

To begin with, let k = 1, 2, …, K represent the LULC type, and j = 1, 2, …, Nik represent the pixel index of LULC type k in the catchment map of sampling site i (i = 1, 2, …, M; M = 39 in this case). Overall, then the observed fish species richness at sampling site i (Bi) on the left-hand side is assumed to be equal to the sum of effects of different LULC types in site i’s catchment (\(\sum {S}_{{{{\rm{ik}}}}}\), unit: num. species) plus a baseline fish species richness prediction Bi0 and an error εi (Eq. 1).

$${B}_{{{{\rm{i}}}}}={\sum }_{{{{\rm{k}}}}=1}^{K}{S}_{{{{\rm{ik}}}}}+{B}_{{{{\rm{i}}}}}^{0}+{\varepsilon }_{{{{\rm{i}}}}}.$$

(1)

Because sites closer to the river have higher effects than sites farther away from the river, we introduce a distance decay framework to characterize this inverse relationship for terrestrial LULC effects over the water flow distance. For each site of interest (e.g., site i), the effect of a pixel j with LULC type k on the fish species richness can be written as Vk ∙ Aj ∙ f(dij). Whereby Vk is the magnitude of the effect of LULC type k (unit: num. species/km2), f(d) is a distance decay function, and Aj is the area of pixel j (unit: km2) depending on the coordinates and is estimated by the haversine formula (see Supplementary Text).

We compared and evaluated five commonly used distance decay functions and chose the widely used exponential decay function in this case because of its simplicity (see Supplementary Text and Table S6). In the exponential distance decay function, distance is directly the flow distance in the catchment map (dij) from pixel j to site i (Eq. 2).

$$f\left({d}_{{{{\rm{ij}}}}}\right)=\frac{3}{r}{e}^{-\frac{{3d}_{{{{\rm{ij}}}}}}{r}}.$$

(2)

Here, the parameter r (unit: km) indicates the effective distance at which the magnitude of terrestrial LULC effect has dropped to ~5% of its original value (a common threshold in spatial statistics). Consequently, the terrestrial LULC effect is explicitly expressed in Eq. (3).

$${S}_{{{{\rm{ik}}}}}=\left\{\begin{array}{cc}\frac{3}{r}\cdot {V}_{{{{\rm{k}}}}}\cdot {\sum }_{{{{\rm{j}}}}=1}^{{N}_{{{{\rm{ik}}}}}}{A}_{{{{\rm{j}}}}}\cdot {e}^{-\frac{{3d}_{{{{\rm{ij}}}}}}{r}},&{{{\rm{if}}}}\,{{LULC}}_{{{{\rm{j}}}}}=k,\\ 0, \hfill &\hfill{{{\rm{if}}}}\,{{LULC}}_{{{{\rm{j}}}}}\, \ne\, k.\hfill\end{array}\right.$$

(3)

Note that \({\sum }_{k=1}^{K}{N}_{{{{\rm{ik}}}}}\) is equal to the total number of pixels in the catchment map of site i (i.e., every pixel belongs to one specific LULC type k). The baseline prediction B0 (unit: num. species) is expressed using river discharge (Q), which best explains fish species richness pattern (Fig. S3, Eq. 4).

$${B}^{0}=a+b\cdot {{{\mathrm{ln}}}}\,Q.$$

(4)

Optimization of model parameters

Model parameters were estimated by solving a maximum likelihood problem, given the observed fish species richness Bi. Subsequently, we write the above optimization problem explicitly as Eq. (5) with vector-matrix notation.

$${{{\boldsymbol{B}}}}{{{\mathscr{ \sim }}}}{{{\mathscr{N}}}}\left( {{{\boldsymbol{\mu}}}} \left({{{\boldsymbol{\theta}}}} \right),{\sigma }^{2}{{{\boldsymbol{I}}}}\right),\,\,{{{\boldsymbol{\mu}}}} \left({{{\boldsymbol{\theta}}}} \right)=a+b\cdot {{{\mathrm{ln}}}}\,{{{\boldsymbol{Q}}}} + {{{\boldsymbol{C}}}}\left(r\right)\cdot {{{\boldsymbol{V}}}},\,{{{\boldsymbol{\theta}}}} =\left(r,\,a,\,b,{{{\boldsymbol{V}}}}\right)$$

(5)

Whereby C(r) is an M-by-K matrix with elements \({c}_{{{{\rm{ik}}}}}=3/r\cdot {\sum }_{{{{\rm{j}}}}=1}^{{N}_{{{{\rm{ik}}}}}}{{A}_{{{{\rm{j}}}}}e}^{-{3d}_{{{{\rm{ij}}}}}/r}\) depending on the distance parameter r; V is a K-vector of magnitude parameters Vk; B is an M-vector of observed fish species richness Bi; a and b are constants, with a being the intercept in the estimation; Q is an M-vector of river discharge values; and θ is a K + 3 dimensional parameter. We assume each element of the error is independent and identically normally distributed (however, for catchments with a small species pool, Poisson or negative binomial distributions may be considered, and it needs a case-by-case analysis).

Then, we estimate θ by the maximum likelihood approach (see Supplementary Text for detailed method). We also computed adjusted R2.

Correlation and significance of model parameters

Variance inflation factor (VIF) and paired Pearson’s correlation of parameters were calculated under the estimated effective spatial range (r = 19 km). The correlation among the estimated terrestrial LULC effects was relatively weak (Table S7 and Fig. S11). The significance of these effects was determined by a likelihood-ratio test (see Supplementary Text).

Validation, robustness, and estimation of uncertainties

We used leave-one-out cross-validation to assess the uncertainty of our parameter estimates. Specifically, we reserved one sampling site from the fish data for testing, followed by estimating the parameters based on the remaining 38 sites (i.e., the training set). Then, we predicted the fish species richness value on the reserved site and compared the predicted value with the real observation. We repeated the whole process for each site and calculated the root-mean-square error (RMSE) of our model to be 8.02 (mean of prediction: 34.9).

The robustness of our model was tested by splitting the sampling sites into mountainous sites (elevation >100 m, 19 sites) and the remaining sites in the plains (20 sites). Then, we fitted the same model to the two data subsets (Table S3).

We also plotted the residuals of the model against terrestrial LULC effects and the model prediction (Fig. S12), and we did not find any obvious trend in these scatter plots, proving the normality assumption in model optimization. In addition, a normal Q-Q plot for the residuals was given to further support this assumption (Fig. S13). To estimate the uncertainty of parameters, we calculated profile likelihood-ratio confidence intervals (CI) of levels of 50% and 90% for each model parameter (see Supplementary Text; Table S8).

Modeling terrestrial LULC effects on fish species distributions (species-level modeling)

To predict the habitat distribution of fish species, we generalized our model to a species-level model by modifying the fish species richness observation of Eq. 5 with a logit function. Specifically, we substitute B with ln(P/1-P), where P is the probability of presence of a fish species in Eq. (6).

$${{{\mathrm{ln}}}}\left(\frac{P}{1-P}\right)=a+b\cdot {{{\mathrm{ln}}}}\,Q+{{{\boldsymbol{C}}}}\, \left(r\right)\cdot {{{\boldsymbol{V}}}}+{{{\boldsymbol{\varepsilon}}}} .$$

(6)

Whereby, ε is an M-vector of errors, in which each element is assumed independent and identically normally distributed in this case. Then, we applied a maximum likelihood estimation to find the effective spatial range r and magnitude V of terrestrial LULC effects. The associated LULC for each species was assigned by the LULC type with the highest terrestrial LULC effect (Vk) value.

Terrestrial LULC effect map

We mapped the LULC effect on fish species richness for each terrestrial pixel (ELULCj) by tracing the flow path of pixel j according to the flow direction map and summing up its LULC effect along the major river channels downstream (see Fig. S7 for a schematic diagram). The flow path of pixel j comprises of two sections: the terrestrial path (Lter,j, unit: km) and the river path (Lj, unit: km); and the integral of VLULCj (derived from V depending on the LULC type of pixel j, unit: num. species / km2) along river path (Lj) starts from where water flow tracing of pixel j entering the major river channels. To simplify computation, the effective distance r (unit: km) was set as the upper boundary of the integral (i.e., r = Lter,j + Lj).

Hence, the value (i.e., ELULCj) in the map can be directly perceived as the change of fish species richness per km in the river (unit: num. species/km) due to the terrestrial LULC effect from pixel j (Eq. 7; see Fig. 3a).

$${E}_{{{{{\rm{LULC}}}}}_{{{{\rm{j}}}}}}={V}_{{{{{\rm{LULC}}}}}_{{{{\rm{j}}}}}}\cdot {\int }_{s={L}_{{{{\rm{ter}}}},{{{\rm{j}}}}}}^{{L}_{{{{\rm{ter}}}},{{{\rm{j}}}}}+{L}_{{{{\rm{j}}}}}}\frac{3}{r}{e}^{-\frac{3s}{r}}{ds}={V}_{{{{{\rm{LULC}}}}}_{{{{\rm{j}}}}}}\cdot {e}^{-\frac{3{L}_{{{{\rm{ter}}}},{{{\rm{j}}}}}}{r}}\cdot \left(1-{e}^{-\frac{3{L}_{{{{\rm{j}}}}}}{r}}\right).$$

(7)

Based on the previous bootstrapped samples, we predicted the terrestrial LULC effect map 2000 times and then calculated the interquartile range (IQR) as a metric of uncertainty (Fig. S14a, c).

Neutral meta-community (NMC) model as a null model

We compared our results with simulations based on a null model of a quasi-neutral river meta-community model, which considers climate, fish habitat capacity, speciation, extinction, migration, and river network structure31. This model uses meta-community theories and fish dispersal in the riverine network to predict fish species richness patterns. In the NMC simulation, the product of average annual runoff production (AARP) and watershed area (WA) was replaced by river discharge (Q) acquired from the HydroSHEDS data, as they represent similar meaning and have high correlations. We ran 30,000 random simulations with different sets of parameters over a meaningful range and derived the set of parameters that mostly fitted the fish species richness pattern and had the highest adjusted R2 (Table S4). Then, we compared its prediction error pattern with the error pattern of the FishDiv-LULC model (Fig. S6).

Projecting a riverine fish species richness map

We applied our model to major river channel pixels in the catchment to generate a riverine fish species richness map (Fig. 3b). To do so, we produced a local catchment within a 19 km spatial range for each major river channel pixel, followed by applying our FishDiv-LULC model to predict fish species richness in the river. We also assessed the uncertainty by predicting fish species richness based on the estimated parameters from 2000 bootstrapped samples and then computed the IQR of prediction results for major river channel pixels (Fig. S14b, d).

Fish traits in relation to terrestrial LULC

We collected ten major fish morphological traits from the FISHMORPH database34. They are maximum body length (MBl), body elongation (BEl), vertical eye position (VEp), relative eye size (REs), oral gape position (OGp), relative maxillary length (RMl), body lateral shape (BLs), pectoral fin vertical position (PFv), pectoral fin size (PFs), and caudal peduncle throttling (CPt), relating to fish metabolism, hydrodynamics, body size and shape, trophic levels and impacts, etc. Then, we linked fish traits and associated LULC type for each species according to the largest positive LULC effect value in the species-level modeling result. Species that could not establish species-level models were removed, so 93 out of 108 species were finally analyzed. Lastly, we mapped trait envelopes of LULC-associated fish species using a principal component analysis of ten morphological trait space.

Fish species richness changes in the past and future

To predict fish species richness patterns in the past and future, we used the ESA CCI land cover map in 1992 (the first year of the product) and the GLOBIO4 land use maps in 2050 as past and modeled future LULC maps, respectively23,37. The GLOBIO4 2050 land use maps are predicted based on the present ESA CCI land cover map, showing good consistency with the global LULC product used in our modeling. For 2050, we used three LULC maps under shared socio-economic pathway 1 representative concentration pathway 2.6 (SSP1 RCP2.6), SSP3 RCP6.0, and SSP5 RCP8.5 scenarios.

Due to the lack of differentiation between rainfed cropland and irrigated cropland in the future maps, we merged these two LULC types (four LULC types in total) and refitted the model to predict fish diversity changes in the past and future. The new validation is shown in Table S9. We predicted riverine fish species richness maps of 1992 and 2050 (under three scenarios), and then calculated the percentage of changes in the periods of 1992–2016 (past) and 2016–2050 (future) (Figs. 4 & S9).

Distribution changes of fish species of conservation concern (SPCC) in the past and future

We predicted the distribution map for each SPCC by firstly calculating a probability map in major river channels, and afterwards, determining presence/absence at each river channel pixel by a threshold with the highest true skill statistic value.

To assess the effects of LULC changes on SPCC, we also predicted the distribution maps for SPCCs in the past (1992) and forecast future fish distribution changes under three LULC change scenarios (2050). As a result, the distribution changes of SPCCs from 1992 to 2016 (past) and under three scenarios from 2016 to 2050 (future) are depicted in Figs. 4 & S9.

River water properties from remote sensing data

We estimated river water properties of chlorophyll-a content (Chl-a), total suspended solids (TSS), and dissolved organic carbon (DOC) to show nutrient/resource availability in rivers. To improve accuracy, image collections of Sentinel-2 level 2A surface reflectance data were used to obtain a 20-m cloud-free image on the Google Earth Engine. Then, we extracted major river channel pixels using a water occurrence map from the 30-m global surface water data with a threshold of 0.75, which effectively filtered out most river shoreline pixels53. After that, non-river-channel and narrow-channel (<~60 m) pixels were carefully removed manually, and the surface reflectance image was resampled to match the resolution of water occurrence data. Next, the dominant LULC type for each river pixel was determined within a 4-km circle. We computed Chl-a, TSS, and DOC for river pixels following well-established methods54,55,56, and plotted the water property values for dominant LULC types (Fig. 2). Detailed formulas of Chl-a, TSS, and DOC calculation can be found in the Supplementary Text.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.



Source link

More From Forest Beat

Cryptobenthic crab assemblages are more distinct across a 90 m depth gradient...

Graham, N. A. et al. Managing resilience to reverse phase shifts in coral reefs. Front. Ecol. Environ. 11, 541–548 (2013). ...
Biodiversity
13
minutes

Trade-off strategies between drought resistance and growth rate of dominant tree...

Study siteThe Guangxi Nonggang National Nature Reserve (The Nonggang Reserve), situated in the eastern Longzhou County and northern Ningming County, Guangxi Zhuang Autonomous...
Biodiversity
9
minutes

Passive acoustic monitoring reveals seasonal patterns in European green toad calling...

Study siteThe study sites were located within the urban area of Poznań, a city in western Poland with a human population of over...
Biodiversity
8
minutes

Changes with time post-restoration in the relationships between soil seed bank...

Guo, P. Y., Sun, F. Q. & Han, X. Y. Study on comprehensive evaluation of environmental pollution treatment effect in coal mine subsidence...
Biodiversity
8
minutes
spot_imgspot_img