95 result(s) for "preferential sampling"
Effective strategies for correcting spatial sampling bias in species distribution models without independent test data
Aim: Spatial sampling bias (SSB) is a feature of opportunistically sampled species records. Species distribution models (SDMs) built using these data (i.e. presence‐background models) can produce biased predictions of suitability across geographic space, confounding species occurrence with the distribution of sampling effort. A wide range of SSB correction methods have been developed but simulations suggest effects on predictive performance are highly variable. Here, we aim to identify the SSB correction methods that have the highest likelihood of improving true predictive performance and evaluation strategies that provide a reliable indicator of model performance when independent test data are unavailable. Location: Global, simulation. Time Period: Current, simulation. Methods: A meta‐analysis was used to evaluate the performance of SSB correction methods in studies where there were direct comparisons between corrected and uncorrected SDMs. A simulation model was then developed to test evaluation strategies against a known truth using four common SSB correction methods. Results: Effect sizes from published studies suggest some support for small positive effects of SSB correction on predictive performance when assessed using independent test data, but this was not evident using internal cross‐validation and no single method stood out as consistently effective. Simulations support these findings and show that evaluation using internal test data was generally a poor indicator of the true effect of SSB correction. Methods that adjust models relative to a known driver of SSB produced the largest performance gains, but were also the most inconsistent. Main Conclusions: Correcting SSB in presence‐background SDMs without independent test data to evaluate the effect on model performance requires careful implementation. We recommend clearer documentation of SSB correction effects on SDMs, presenting results from models with and without correction, evaluating effects of different assumptions of SSB implementation on predictions, as well as greater efforts to collect independent test datasets to validate model predictions.
Volunteers Sample Where Endangered Bumble Bees Occur
Aim: Many broad‐scale ecological inventory and monitoring efforts collect multi‐species (or otherwise multivariate) data under unstructured study designs. Unstructured designs are vulnerable to preferential sampling, where residual covariance between locations selected for sampling and the response variable of interest may render predictions strongly biased. Innovation: We extend previous work to address preferential sampling in spatial single‐species distribution models to a multivariate context. Using spatially structured latent variables to approximate residual covariance between species occurrence probabilities and sampling inclusion probabilities, we present ways to account for sampling that may be preferential to varying degrees across multiple species, where (analogously) multiple datastreams might be preferential to varying degrees for a single species, or both. We use simulation to explore our proposed model and present an application that delineates the distributions of 13 bumble bee species across Wisconsin, USA and evaluates evidence for preferential sampling within 3 citizen science datastreams. Main Conclusions: Simulation results suggest that our proposed model improves out‐of‐sample predictions of species occurrence or richness when the sampling design is preferential and residual covariance between sampling and species occurrence exhibits spatial structure compatible with model assumptions, reducing bias in predictions of species occurrence or richness. Empirically, volunteers appeared to sample preferentially with respect to bumble bee distributions, being more likely to sample in locations where the federally listed Bombus affinis was more likely to occur. Our approach offers practitioners a means to triage preferential sampling within increasingly popular multi‐species or integrated distribution models and can be modified slightly to deal with a variety of other response variables.
Dynamics of a long chain in turbulent flows: impact of vortices
We show and explain how a long bead–spring chain, immersed in a homogeneous isotropic turbulent flow, preferentially samples vortical flow structures. We begin with an elastic, extensible chain which is stretched out by the flow, up to inertial-range scales. This filamentary object, which is known to preferentially sample the circular coherent vortices of two-dimensional (2D) turbulence, is shown here to also preferentially sample the intense, tubular, vortex filaments of three-dimensional (3D) turbulence. In the 2D case, the chain collapses into a tracer inside vortices. In the 3D case on the contrary, the chain is extended even in vortical regions, which suggests that the chain follows axially stretched tubular vortices by aligning with their axes. This physical picture is confirmed by examining the relative sampling behaviour of the individual beads, and by additional studies on an inextensible chain with adjustable bending-stiffness. A highly flexible, inextensible chain also shows preferential sampling in three dimensions, provided it is longer than the dissipation scale, but not much longer than the vortex tubes. This is true also for 2D turbulence, where a long inextensible chain can occupy vortices by coiling into them. When the chain is made inflexible, however, coiling is prevented and the extent of preferential sampling in two dimensions is considerably reduced. In three dimensions, on the contrary, bending stiffness has no effect, because the chain does not need to coil in order to thread a vortex tube and align with its axis. This article is part of the theme issue ‘Fluid dynamics, soft matter and complex systems: recent results and new methods’.
Species distribution modeling: a statistical review with focus in spatio-temporal issues
The use of complex statistical models has recently increased substantially in the context of species distribution behavior. This complexity has made the inferential and predictive processes challenging to perform. The Bayesian approach has become a good option to deal with these models due to the ease with which prior information can be incorporated along with the fact that it provides a more realistic and accurate estimation of uncertainty. In this paper, we first review the sources of information and different approaches (frequentist and Bayesian) to model the distribution of a species. We also discuss the Integrated Nested Laplace approximation as a tool with which to obtain marginal posterior distributions of the parameters involved in these models. We finally discuss some important statistical issues that arise when researchers use species data: the presence of a temporal effect (presenting different spatial and spatio-temporal structures), preferential sampling, spatial misalignment, non-stationarity, imperfect detection, and the excess of zeros.
Geostatistical inference under preferential sampling
Geostatistics involves the fitting of spatially continuous models to spatially discrete data. Preferential sampling arises when the process that determines the data locations and the process being modelled are stochastically dependent. Conventional geostatistical methods assume, if only implicitly, that sampling is non-preferential. However, these methods are often used in situations where sampling is likely to be preferential. For example, in mineral exploration, samples may be concentrated in areas that are thought likely to yield high grade ore. We give a general expression for the likelihood function of preferentially sampled geostatistical data and describe how this can be evaluated approximately by using Monte Carlo methods. We present a model for preferential sampling and demonstrate through simulated examples that ignoring preferential sampling can lead to misleading inferences. We describe an application of the model to a set of biomonitoring data from Galicia, northern Spain, in which making allowance for preferential sampling materially changes the results of the analysis.
Modeling spatially biased citizen science effort through the eBird database
Citizen science databases are increasing in importance as sources of ecological information, but variability in effort across locations is inherent to such data. Spatially biased data—data not sampled uniformly across the study region—is expected. A further introduction of bias is variability in the level of sampling activity across locations. This motivates our work: with a spatial dataset of visited locations and sampling activity at those locations, we propose a model-based approach for assessing effort at these locations. Adjusting for potential spatial bias both in terms of sites visited and in terms of effort is crucial for developing reliable species distribution models (SDMs). Using data from eBird, a global citizen science database dedicated to avifauna, and illustrative regions in Pennsylvania and Germany, we model spatial dependence in both the observation locations and observed activity. We employ point process models to explain the observed locations in space, fit a geostatistical model to explain observation effort at locations, and explore the potential existence of preferential sampling, i.e., dependence between the two processes. Altogether, we offer a richer notion of sampling effort, combining information about location and activity. As SDMs are often used for their predictive capabilities, an important advantage of our approach is the ability to predict effort at unobserved locations and over regions. In this way, we can accommodate misalignment between point-referenced data and say, desired areal scale density. We briefly illustrate how our proposed methods can be applied to SDMs, with demonstrated improvement in prediction from models incorporating effort.
Model-Based Geostatistics Under Spatially Varying Preferential Sampling
Geostatistics is concerned with the estimation and prediction of spatially continuous phenomena using data obtained at a discrete set of locations. In geostatistics, preferential sampling occurs when these locations are not independent of the latent spatial field, and common modeling approaches that do not account for such a dependence structure might yield wrong inferences. To overcome this issue, some methods have been proposed to model data collected under preferential sampling. However, while these methods assume a constant degree of preferentiality, real data may present a degree of preferentiality that varies over space. For that reason, we propose a new model that accounts for preferential sampling by including a spatially varying coefficient that describes the dependence strength between the process that models the sampling locations and the latent field. To do so, we approximate the preferentiality component by a set of basis functions with the corresponding coefficients being estimated using the integrated nested Laplace approximation (INLA) method. By doing that, we allow the degree of preferentiality to vary over the domain with low computational burden. We assess our model performance by means of a simulation study and use it to analyze the average PM2.5 levels in the USA in 2022. We conclude that, given enough observed events, our model, along with the implemented inference routine, retrieves well the latent field itself and the spatially varying preferentiality surface, even under misspecified scenarios. Also, we offer guidelines for the specification and size of the set of basis functions. Supplementary materials accompanying this paper appear online.
Methods for preferential sampling in geostatistics
Preferential sampling in geostatistics occurs when the locations at which observations are made may depend on the spatial process that underlies the correlation structure of the measurements. We show that previously proposed Monte Carlo estimates for the likelihood function may not be approximating the desired function. Furthermore, we argue that, for preferential sampling of moderate complexity, alternative and widely available numerical methods to approximate the likelihood function produce better results than Monte Carlo methods. We illustrate our findings on the Galicia data set analysed previously in the literature.
Identifying and correcting spatial bias in opportunistic citizen science data for wild ungulates in Norway
Many publications make use of opportunistic data, such as citizen science observation data, to infer large‐scale properties of species’ distributions. However, the few publications that use opportunistic citizen science data to study animal ecology at a habitat level do so without accounting for spatial biases in opportunistic records or using methods that are difficult to generalize. In this study, we explore the biases that exist in opportunistic observations and suggest an approach to correct for them. We first examined the extent of the biases in opportunistic citizen science observations of three wild ungulate species in Norway by comparing them to data from GPS telemetry. We then quantified the extent of the biases by specifying a model of the biases. From the bias model, we sampled available locations within the species’ home range. Along with opportunistic observations, we used the corrected availability locations to estimate a resource selection function (RSF). We tested this method with simulations and empirical datasets for the three species. We compared the results of our correction method to RSFs obtained using opportunistic observations without correction and to RSFs using GPS‐telemetry data. Finally, we compared habitat suitability maps obtained using each of these models. Opportunistic observations are more affected by human access and visibility than locations derived from GPS telemetry. This has consequences for drawing inferences about species’ ecology. Models naïvely using opportunistic observations in habitat‐use studies can result in spurious inferences. However, sampling availability locations based on the spatial biases in opportunistic data improves the estimation of the species’ RSFs and predicted habitat suitability maps in some cases. This study highlights the challenges and opportunities of using opportunistic observations in habitat‐use studies. While our method is not foolproof it is a first step toward unlocking the potential of opportunistic citizen science data for habitat‐use studies. We provide a novel method to use citizen science data for fine‐scale studies.
Correcting for informative sampling in spatial covariance estimation and kriging predictions
Informative sampling designs can impact spatial prediction, or kriging, in two important ways. First, the sampling design can bias spatial covariance parameter estimation, which in turn can bias spatial kriging estimates. Second, even with unbiased estimates of the spatial covariance parameters, since the kriging variance is a function of the observation locations, these estimates will vary based on the sample and overestimate the population-based estimates. In this work, we develop a weighted composite likelihood approach to improve spatial covariance parameter estimation under informative sampling designs. Then, given these parameter estimates, we propose three approaches to quantify the effects of the sampling design on the variance estimates in spatial prediction. These results can be used to make informed decisions for population-based inference. We illustrate our approaches using a comprehensive simulation study. Then, we apply our methods to perform spatial prediction using real estate data across a metropolitan area.