2022 report
Methods for evaluating Gap Analysis Project habitat distribution maps with species occurrence data
First posted August 29, 2022
For additional information, contact: Director, Core Science Analytics and Synthesis, U.S. Geological Survey, Box 25046, Mail Stop 302, Denver, CO 80225
The National Gap Analysis Project created species habitat distribution models for all terrestrial vertebrates in the United States to support conservation assessments and explore patterns of species richness. Those models link species to specific habitats throughout the range of each species. For most vertebrates, there are not enough occurrence data to drive inductive, range-wide species habitat distribution models at high spatial and thematic resolution. However, it is possible to use occurrence data for model evaluation. The combination of citizen science, formal species survey work, and digitized specimen archives is making millions of observations available to the scientific community. Our challenge is to combine the mostly unstructured data into metrics that help us characterize and understand patterns of biodiversity. In this work, we propose two model-evaluation metrics. The first, a buffer proportion assessment, is based on the proportion of habitat in the range relative to the mean proportion of habitat around each of the species’ occurrence records. The second measures how model sensitivity (the proportion of true presences) changes with the buffer distance used around occurrence records. The buffer proportion is a modification of the model prevalence versus point prevalence metric, whereby comparison to a null model allows us to determine if the model performs better or worse than random.
In this report, we describe the workflow used to compile and filter the species occurrence records from online resources (for example, the Global Biodiversity Information Facility) and show results for a single species, Desmognathus quadramaculatus (black-bellied salamander). For the salamander, 222 occurrence points met our criteria for inclusion in the evaluation.
We found the model performed better than random with a buffer proportion index of 1.745, indicating about 5 times as much habitat was found adjacent to known occurrence records as would be expected from randomly located sites throughout the range. Sensitivity increased with larger buffer distances and leveled off at around 0.7 between 1,000- and 2,000-meter buffer distances, indicating the model is likely best suited for scales exceeding 1,000 meters. We plan to report the buffer proportion assessment and sensitivity metrics along with the full species model reports to increase understanding of the model’s performance and to use the metrics to help prioritize revisions to the models.
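The two metrics described above can be illustrated with a minimal sketch on a toy raster. This is not the report's implementation: the summary does not give the exact formula for the buffer proportion index, so the example simply divides the mean habitat proportion within buffers by the habitat proportion in the range, and it assumes an occurrence record counts as a true presence when at least one habitat cell falls inside its buffer. The function names, the grid, the points, and the use of cell units rather than meters are all hypothetical.

```python
import numpy as np


def buffer_mask(shape, row, col, radius_cells):
    """Boolean mask of grid cells within radius_cells of (row, col)."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - row) ** 2 + (cols - col) ** 2 <= radius_cells ** 2


def buffer_proportion_ratio(habitat, in_range, points, radius_cells):
    """Mean habitat proportion around points divided by habitat proportion in the range.

    Assumed form of the comparison; values above 1 mean more habitat near
    occurrence records than in the range as a whole.
    """
    range_prop = habitat[in_range].mean()
    buffer_props = []
    for r, c in points:
        mask = buffer_mask(habitat.shape, r, c, radius_cells) & in_range
        buffer_props.append(habitat[mask].mean())
    return float(np.mean(buffer_props) / range_prop)


def sensitivity(habitat, in_range, points, radius_cells):
    """Fraction of points with at least one habitat cell inside their buffer (assumed rule)."""
    hits = [
        bool(habitat[buffer_mask(habitat.shape, r, c, radius_cells) & in_range].any())
        for r, c in points
    ]
    return float(np.mean(hits))


# Hypothetical 10 x 10 range in which the two left-hand columns are mapped as habitat.
habitat = np.zeros((10, 10), dtype=bool)
habitat[:, :2] = True
in_range = np.ones((10, 10), dtype=bool)

# Hypothetical occurrence records as (row, col); the last one falls outside mapped habitat.
points = [(5, 1), (2, 0), (7, 4)]

print("ratio at radius 0:", buffer_proportion_ratio(habitat, in_range, points, 0))
for radius in (0, 1, 2, 3):
    print("radius", radius, "sensitivity", sensitivity(habitat, in_range, points, radius))
```

In this toy case sensitivity can only stay level or rise as the buffer grows, because a larger buffer contains every cell of a smaller one. That matches the general pattern reported for the salamander, although the real analysis works with buffer distances in meters on the actual habitat map.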