2020 journal article

Estimating the drivers of species distributions with opportunistic data using mediation analysis

ECOSPHERE, 11(6).

By: D. Huberman, B. Reich, K. Pacifici & J. Collazo

author keywords: mediation analysis; occupancy modeling; opportunistic data; spatial statistics
UN Sustainable Development Goal Categories
13. Climate Action (Web of Science)
14. Life Below Water (Web of Science)
15. Life on Land (Web of Science; OpenAlex)
Source: Web Of Science
Added: August 10, 2020

Abstract

Ecological occupancy modeling has historically relied on high‐quality, low‐quantity designed‐survey data for estimation and prediction. In recent years, there has been a large increase in the amount of high‐quantity, unknown‐quality opportunistic data. This has motivated research on how best to combine these two data sources in order to optimize inference. Existing methods can be infeasible for large datasets or require opportunistic data to be located where designed‐survey data exist. These methods map species occupancies, motivating a need to properly evaluate covariate effects (e.g., land cover proportion) on their distributions. We describe a spatial estimation method that incorporates supplementary opportunistic data using mediation analysis concepts. The opportunistic data mediate the effect of the covariate on the designed‐survey data response, decomposing it into a direct and an indirect effect. One component of the indirect effect can then be quickly estimated by regressing the mediator on the covariate, while the other components are estimated through a spatial occupancy model. The regression step allows for the use of large quantities of opportunistic data, which can be collected in locations with no designed‐survey data available. Simulation results suggest that the mediated method produces an improvement in relative MSE when the opportunistic data are of reasonable quality. However, when the simulated opportunistic data are poorly correlated with the true spatial process, the standard, unmediated method is still preferable. We also develop a spatiotemporal extension of the method to analyze the effect of deciduous forest land cover on red‐eyed vireo distribution in the southeastern United States, and find that including the opportunistic data does not lead to a substantial improvement. Opportunistic data quality remains an important consideration when employing this method, as with other data integration methods.
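The direct/indirect decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' spatial occupancy model: it assumes a plain linear, non-spatial setting with a continuous response, and uses the standard product-of-coefficients form of mediation analysis. The function name `mediation_decomposition`, the variable names, and the simulated coefficients are all hypothetical and chosen only for illustration.

```python
import numpy as np


def mediation_decomposition(x, m, y):
    """Linear product-of-coefficients mediation (illustrative, non-spatial).

    x: covariate (e.g., land cover proportion)
    m: mediator (stand-in for an opportunistic-data summary)
    y: response (stand-in for the designed-survey response)
    """
    x, m, y = (np.asarray(v, dtype=float) for v in (x, m, y))

    # Step 1: regress the mediator on the covariate, m = a0 + a * x.
    # This step needs only x and m, so it could use sites without survey data.
    design_m = np.column_stack([np.ones_like(x), x])
    a = np.linalg.lstsq(design_m, m, rcond=None)[0][1]

    # Step 2: regress the response on covariate and mediator,
    # y = c0 + direct * x + b * m.
    design_y = np.column_stack([np.ones_like(x), x, m])
    _, direct, b = np.linalg.lstsq(design_y, y, rcond=None)[0]

    indirect = a * b  # effect of x on y that passes through m
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}


# Simulated data with assumed (hypothetical) coefficients.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0.0, 1.0, n)
m = 0.5 + 2.0 * x + rng.normal(0.0, 0.5, n)
y = 1.0 + 0.7 * x + 0.4 * m + rng.normal(0.0, 0.5, n)

effects = mediation_decomposition(x, m, y)
print(effects)
```

In this linear setting the total effect (direct plus indirect) matches the slope from regressing the response on the covariate alone, which is what makes the decomposition useful: the mediator regression in step 1 can be fit separately and on far more data than step 2.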