Biodiversity data can be analysed to predict species distributions at various scales of time and space. However, incomplete surveys and temporal decay in data quality introduce uncertainty into biodiversity models. Researchers Joaquín Hortal, Juliana Stropp (National Museum of Natural Sciences, Spain), Richard Ladle (University of Porto, Portugal), and Geiziane Tessarolo (State University of Goiás, Brazil), among others, are constructing the first Maps of Biogeographical Ignorance (MoBIs), which account for uncertainty in biodiversity analysis. Presented alongside species distribution models (SDMs), MoBIs can help ensure that conservation resources are appropriately distributed.
In a technologically advanced society where data can be collected remotely, almost continuously, and at a low cost, biodiversity research has become increasingly data-driven. Biodiversity data – that is, records documenting species occurrence – can be readily collected from field surveys to assess the relationships between biodiversity and the environment. Scientists, conservationists, and policy-makers can use these data to understand where species are, why they are there, how their distributions might change under future climate or land-use scenarios, and how these changes may affect ecosystem services – the various benefits ecosystems provide to people, including food, water, and greater resilience to extreme events like droughts or floods. With this knowledge, impacts of global change on biodiversity can be forecast, and species and habitats can be better protected.
Species distribution modelling
However, conservation science is a crisis discipline and data collection is not able to keep up with rapid decision-making requirements. Old, limited, and incomplete data are frequently used as a ‘best guess’ in times of need. Because the resources available to biodiversity research are limited, the challenge is even greater when data are required over particularly wide areas or long time periods. In the absence of consistently reliable and widespread observational data, scientists rely on methods that estimate distributions from smaller amounts of data. The most widely used are the aptly named species distribution models (SDMs).
SDMs are mathematical models that relate the observed distribution of a species to a set of environmental predictor variables (eg, climate, habitat type, distance to water). Despite their widespread use in species and habitat conservation, the predictions of these models carry a degree of uncertainty.
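To make the idea concrete, here is a minimal sketch of one simple kind of SDM: a logistic regression that turns two environmental predictors into a probability of presence. The data, the predictor names, and the function names are hypothetical and chosen only for illustration; real SDMs use many more records and a variety of modelling methods.

```python
import math

def predict(weights, bias, x):
    """Predicted probability of presence for one site (logistic link)."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit a logistic regression by plain gradient descent (illustrative only)."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(weights, bias, xi) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias

# Hypothetical standardised predictors per site: [temperature, distance to water]
sites = [[1.0, -1.0], [0.8, -0.5], [0.5, -0.8],
         [-1.0, 1.0], [-0.7, 0.6], [-0.5, 0.9]]
observed = [1, 1, 1, 0, 0, 0]  # 1 = species recorded, 0 = not recorded

weights, bias = fit_logistic(sites, observed)
warm_near_water = predict(weights, bias, [0.9, -0.9])
cold_far_from_water = predict(weights, bias, [-0.9, 0.9])
```

In this toy example the model assigns higher suitability to warm sites near water, simply because that is where the invented records are. It says nothing about how well those six sites represent the wider region, which is the gap the rest of the article addresses.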
Sources of inaccuracy
The accuracy of any analysis or prediction depends on the quality of the original data. In a paper published in 2013, Joaquín Hortal (National Museum of Natural Sciences, MNCN-CSIC, Spain) and Richard Ladle (CIBIO-InBIO, University of Porto, Portugal) identified three major factors that compromise the quality of species distribution knowledge at any given spatial scale: survey completeness, the decay of information with time, and the decay of information with space. These decays account for the progressive loss of value of data to predict biodiversity trends at increasing distance from the observations.
Survey completeness can be impacted by the difficulty of detecting species due to their characteristics – such as cryptic colouration, elusive behaviour, or high mobility – and the characteristics of their habitat. As a result, even biological surveys employing a range of sampling methods can be incomplete. Hortal, this time working together with Juliana Stropp (MNCN-CSIC) and collaborators, found that sampling sites for butterflies in the Brazilian Atlantic Forest were biased towards large and connected forest fragments (Sobral-Souza et al, 2020). In an SDM constructed with incomplete survey data like this, the inference of the functional relationship between butterflies and deforestation would be limited to these well-sampled areas. Increasing sampling effort in small and disconnected forest fragments would lead to more accurate evaluations of landscape-scale effects, which conservation decision-makers could then use. A similar study on Iberian mosses (Ronquillo et al, 2020) also found spatial gaps and biases in the dataset. In addition, many records lacked dates in their metadata, further limiting the analysis of temporal biodiversity shifts associated with global change stressors.
The decay of biological information with time and space comes about because species are continuously responding to ongoing environmental changes. Changes over time include climate change, land-use change, habitat degradation, and biological invasions. For example, by 2017 in the Brazilian Amazon, 30% of all locations where tree specimens had historically been recorded were lost to deforestation (Stropp et al, 2020). A further 300,000 km² of rainforest was deforested by 2017 without having a single tree specimen recorded. Lack of historical data about which species occurred in deforested areas presents a particularly grave problem because this type of information is irretrievable. Changes in space include climate, habitat type, soil chemistry, level of disturbance, and community composition. As such, the certainty of SDM predictions decreases as the spatial and temporal distance between data collection location and a modelled area increases.
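One way to picture this decay is with a simple function in which the value of a record halves after a fixed number of years and again after a fixed distance. The sketch below is an illustration only: the exponential form, the half-life of 20 years, and the half-distance of 50 km are assumptions made for the example, not values taken from the studies cited here.

```python
def information_retained(years_since_record, km_from_record,
                         half_life_years=20.0, half_distance_km=50.0):
    """Fraction of a record's informative value that remains (1.0 = all of it).

    Assumes, purely for illustration, that value halves every
    `half_life_years` and every `half_distance_km`.
    """
    temporal = 0.5 ** (years_since_record / half_life_years)
    spatial = 0.5 ** (km_from_record / half_distance_km)
    return temporal * spatial

fresh_on_site = information_retained(0, 0)    # 1.0
old_on_site = information_retained(20, 0)     # 0.5
old_and_far = information_retained(20, 50)    # 0.25
```

Under these assumed rates, a 20-year-old record used to describe a site 50 km away retains a quarter of its original value. In practice the appropriate rates would differ between species groups and regions.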
To maximise policy and conservation impact, maps of uncertainty or inaccuracy within SDMs should be displayed alongside the outputs. Research by Hortal, Ladle, Geiziane Tessarolo (State University of Goiás, Brazil), and colleagues produced the first Maps of Biogeographical Ignorance (MoBIs) that do just this – they account for survey completeness and quality of data in each area, as well as the decay in information (Tessarolo et al, 2021). Highlighting these uncertainties identifies the areas where predictions are more reliable and supports decision-makers using SDMs, for example in the allocation of future conservation priorities. Regularly updating the combined distribution and ignorance map would provide users with the information required to deal with biogeographical uncertainty through the design and implementation of new surveys or by incorporating spatially explicit estimates of uncertainty into future SDMs. A combined approach would also allow conservation decision-makers to prioritise the existing conservation needs of species and allocate resources effectively, for example by focusing resources to survey and conserve the biodiversity of areas that are vulnerable to deforestation or climate change but remain poorly sampled.
Proof of concept
To produce MoBIs, the research group combined analytical and visualisation tools to represent four main sources of biological ignorance – data completeness, taxonomic quality of data, temporal decay in information, and spatial and environmental distance to the surveyed site – in a Scarabaeidae dung beetle dataset from the Iberian Peninsula. A biogeographical ignorance value was calculated for each record, generating a MoBI for each species. This ignorance value represents the degree of reliability of the biogeographical information for a species in a location: the higher the value, the less reliable the information for that area.
A biogeographical ignorance value for the focal species is attributed to each cell of the studied area depending on whether the cell contains original data on the focal species, data on other species from the studied taxonomic group, or neither of these. If the cell contains a record for the focal species, the biogeographical ignorance value is taken from the most reliable record. If the cell contains a complete record of species of the same taxonomic group (species that can be captured with similar sampling protocols), the absence of the focal species is likely ‘real’, so the biogeographical ignorance value is taken as the mean of the minimum values of each species in the cell. For cells without data for the studied species group, environmental and spatial proximity to cells with data is the only way to calculate ignorance. The researchers combined SDM maps with MoBIs to visually represent the reliability of distribution knowledge.
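The three-way rule described above can be sketched as a short function. This is an interpretation of the description in this article, not the authors' code: the function and parameter names are hypothetical, the proximity-based estimate is assumed to have been computed elsewhere, and the handling of cells with incomplete data for the group is an assumption, since the text does not cover that case.

```python
from statistics import mean

def cell_ignorance(focal_values, group_values, survey_complete, proximity_estimate):
    """Ignorance value for the focal species in one map cell.

    focal_values: ignorance values of the focal species' records in the cell.
    group_values: dict mapping each other species of the same taxonomic
        group to the ignorance values of its records in the cell.
    survey_complete: True if the cell is considered completely surveyed
        for the group.
    proximity_estimate: hypothetical precomputed value based on environmental
        and spatial distance to cells that do have data.
    """
    if focal_values:
        # Focal species recorded: use its most reliable (lowest-ignorance) record.
        return min(focal_values)
    if survey_complete and group_values:
        # Well-surveyed cell without the focal species: absence is likely real.
        # Take the mean of each recorded species' minimum ignorance value.
        return mean(min(values) for values in group_values.values())
    # No usable data in the cell (assumption: this also covers incomplete
    # surveys): fall back on proximity to cells with data.
    return proximity_estimate

with_focal = cell_ignorance([0.4, 0.2], {}, False, 0.9)
likely_absent = cell_ignorance([], {"sp_a": [0.1, 0.3], "sp_b": [0.5]}, True, 0.9)
no_data = cell_ignorance([], {}, False, 0.9)
```

Here the first cell takes the value 0.2 from its best focal record, the second takes the mean of 0.1 and 0.5, and the third falls back on the proximity estimate of 0.9.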
The study described four extreme (high or low) combinations of suitability and ignorance: 1) areas with high predicted suitability for the focal species and high ignorance; 2) areas of high suitability and low ignorance; 3) areas with low predicted suitability and low ignorance; and 4) areas of low suitability and high ignorance. In terms of conservation priority based on these outputs, areas of high ignorance would benefit from further surveys, and areas of low ignorance could be targeted with conservation management to increase suitability or preserve habitat, if they are predicted to be home to the focal species.
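As a rough illustration, the four combinations can be expressed as a small lookup that pairs each cell's SDM suitability with its ignorance value. The single 0.5 cut-off and the wording of the labels are assumptions made for this sketch; the study describes the extremes and does not prescribe a threshold here.

```python
def classify_cell(suitability, ignorance, threshold=0.5):
    """Place a cell in one of the four suitability/ignorance combinations.

    Both inputs are assumed to be scaled from 0 to 1. The threshold is a
    hypothetical cut-off chosen only for illustration.
    """
    high_suitability = suitability >= threshold
    high_ignorance = ignorance >= threshold
    if high_suitability and high_ignorance:
        return "1: high suitability, high ignorance (survey further)"
    if high_suitability:
        return "2: high suitability, low ignorance (reliable, manage or protect)"
    if not high_ignorance:
        return "3: low suitability, low ignorance (reliable prediction)"
    return "4: low suitability, high ignorance (survey further)"

# Hypothetical cells as (suitability, ignorance) pairs
cells = [(0.9, 0.8), (0.9, 0.1), (0.2, 0.1), (0.2, 0.8)]
labels = [classify_cell(s, i) for s, i in cells]
```

Each of the four example cells falls into a different category, in the same order as the numbered list above.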
Ignorance maps that are combined with SDM predictions generate spatially explicit maps of uncertainty and reveal insights into the reliability of such commonly used maps. Neglecting this uncertainty can distort SDM interpretation and thus lead to incorrect theoretical or practical applications, including ill-advised conservation actions. By routinely using maps of ignorance or similar techniques, conservation action can become more effective. In a resource-deficient discipline such as conservation science, it pays to understand the data and their inaccuracies before making conservation decisions that could waste valuable resources or lose invaluable habitats and species.
What is next for your research group?
First, we have to adapt the parameters used to generate MoBIs to different case studies, extending their application to the particularities of as many types of species, communities and habitats as possible. In parallel, we will develop easy-to-use software tools to calculate MoBIs from biodiversity databases, and incorporate them into SDM software suites tailored to both model data and display its uncertainty. This will help integrate MoBIs as spatially explicit error terms into particular SDM applications, so they account for data-driven uncertainty. These novel tools will be used to assess biogeographical ignorance in the tropical biomes of Africa and South America through the TROPIBIO and TAXON-TIME projects, led by Richard Ladle and Juliana Stropp, respectively.