6,677 result(s) for "Sampling bias"
Respondent-driven sampling bias induced by community structure and response rates in social networks
Sampling hidden populations is particularly challenging with standard sampling methods, mainly because of the lack of a sampling frame. Respondent-driven sampling is an alternative methodology that exploits the social contacts between peers to reach and weight individuals in these hard-to-reach populations. It is a snowball sampling procedure where the weight of the respondents is adjusted for the likelihood of being sampled due to differences in the number of contacts. The structure of the social contacts thus regulates the process by constraining the sampling within subregions of the network. We study the bias induced by network communities, which are groups of individuals more connected between themselves than with individuals in other groups, in the respondent-driven sampling estimator. We simulate different structures and response rates to reproduce real settings. We find that the prevalence of the estimated variable is associated with the size of the network community to which the individual belongs and observe that low degree nodes may be undersampled if the sample and the network are of similar size. We also find that respondent-driven sampling estimators perform well if response rates are relatively large and the community structure is weak, whereas low response rates typically generate strong biases irrespective of the community structure.
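The weighting idea in this abstract can be illustrated with a small sketch. The abstract does not name a specific estimator, so the code below assumes plain inverse-degree weighting (often called the Volz-Heckathorn or RDS-II estimator); the function name and the four-respondent sample are hypothetical.

```python
def inverse_degree_estimate(traits, degrees):
    """Estimate the prevalence of a binary trait from a respondent-driven sample.

    Assumption: each respondent is weighted by 1/degree, because people with
    many contacts are more likely to be recruited. This is one common choice,
    not necessarily the estimator studied in the paper.
    """
    if len(traits) != len(degrees):
        raise ValueError("traits and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("every respondent needs a positive degree")
    weights = [1.0 / d for d in degrees]
    return sum(w * t for w, t in zip(weights, traits)) / sum(weights)


# Hypothetical sample: the two respondents with the trait are highly connected.
sample_traits = [1, 1, 0, 0]
sample_degrees = [10, 10, 2, 2]

naive = sum(sample_traits) / len(sample_traits)
weighted = inverse_degree_estimate(sample_traits, sample_degrees)
print(f"naive={naive:.3f} weighted={weighted:.3f}")
```

Because the well-connected respondents are down-weighted, the weighted estimate falls below the naive sample proportion in this example.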
Widespread sampling biases in herbaria revealed from large-scale digitization
Nonrandom collecting practices may bias conclusions drawn from analyses of herbarium records. Recent efforts to fully digitize and mobilize regional floras online offer a timely opportunity to assess commonalities and differences in herbarium sampling biases. We determined spatial, temporal, trait, phylogenetic, and collector biases in c. 5 million herbarium records, representing three of the most complete digitized floras of the world: Australia (AU), South Africa (SA), and New England, USA (NE). We identified numerous shared and unique biases among these regions. Shared biases included specimens collected close to roads and herbaria; specimens collected more frequently during biological spring and summer; specimens of threatened species collected less frequently; and specimens of close relatives collected in similar numbers. Regional differences included overrepresentation of graminoids in SA and AU and of annuals in AU; and peak collection during the 1910s in NE, 1980s in SA, and 1990s in AU. Finally, in all regions, a disproportionately large percentage of specimens were collected by very few individuals. We hypothesize that these mega-collectors, with their associated preferences and idiosyncrasies, shaped patterns of collection bias via ‘founder effects’. Studies using herbarium collections should account for sampling biases, and future collecting efforts should avoid compounding these biases to the extent possible.
Recommendations for examining and interpreting funnel plot asymmetry in meta-analyses of randomised controlled trials
Funnel plots, and tests for funnel plot asymmetry, have been widely used to examine bias in the results of meta-analyses. Funnel plot asymmetry should not be equated with publication bias, because it has a number of other possible causes. This article describes how to interpret funnel plot asymmetry, recommends appropriate tests, and explains the implications for choice of meta-analysis model.
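As an illustration of what a funnel plot asymmetry test computes, here is a minimal sketch of Egger's regression, one widely used test. The abstract does not say which tests the article recommends, so this choice is an assumption, and the function name and study data are hypothetical. The standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error); an intercept far from zero suggests asymmetry.

```python
def egger_regression(effects, standard_errors):
    """Return (intercept, slope) of Egger's regression by ordinary least squares.

    y = effect / SE (standardized effect), x = 1 / SE (precision).
    Only the point estimates are computed; a full test would also need the
    standard error of the intercept.
    """
    xs = [1.0 / se for se in standard_errors]
    ys = [e / se for e, se in zip(effects, standard_errors)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical studies that all report the same effect with different precision:
# a perfectly symmetric funnel, so the intercept should be zero.
effects = [0.5, 0.5, 0.5, 0.5]
standard_errors = [0.1, 0.2, 0.25, 0.5]
intercept, slope = egger_regression(effects, standard_errors)
print(f"intercept={intercept:.3f} slope={slope:.3f}")
```

As the abstract stresses, a non-zero intercept would indicate asymmetry, not necessarily publication bias.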
Endogenous Selection Bias: The Problem of Conditioning on a Collider Variable
Endogenous selection bias is a central problem for causal inference. Recognizing the problem, however, can be difficult in practice. This article introduces a purely graphical way of characterizing endogenous selection bias and of understanding its consequences (Hernán et al. 2004). We use causal graphs (directed acyclic graphs, or DAGs) to highlight that endogenous selection bias stems from conditioning (e.g., controlling, stratifying, or selecting) on a so-called collider variable, i.e., a variable that is itself caused by two other variables, one that is (or is associated with) the treatment and another that is (or is associated with) the outcome. Endogenous selection bias can result from direct conditioning on the outcome variable, a post-outcome variable, a post-treatment variable, and even a pre-treatment variable. We highlight the difference between endogenous selection bias, common-cause confounding, and overcontrol bias and discuss numerous examples from social stratification, cultural sociology, social network analysis, political sociology, social demography, and the sociology of education.
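The collider mechanism described here can be shown with a short simulation. This is a sketch under assumed settings, not taken from the article: treatment and outcome are drawn as independent standard normals, the collider is their sum plus noise, and the selection threshold of 1.0 is arbitrary.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Treatment and outcome are generated independently: no causal link, no confounder.
treatment = [rng.gauss(0, 1) for _ in range(n)]
outcome = [rng.gauss(0, 1) for _ in range(n)]

# The collider is caused by both treatment and outcome.
collider = [t + y + rng.gauss(0, 1) for t, y in zip(treatment, outcome)]

# Conditioning on the collider: keep only units with a high collider value.
selected = [(t, y) for t, y, c in zip(treatment, outcome, collider) if c > 1.0]

r_all = pearson(treatment, outcome)
r_selected = pearson([t for t, _ in selected], [y for _, y in selected])
print(f"full sample r={r_all:.3f}, selected sample r={r_selected:.3f}")
```

In the full sample the correlation is close to zero, while the selected subsample shows a clearly negative association even though neither variable affects the other.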
Inferring trends in pollinator distributions across the Neotropics from publicly available data remains challenging despite mobilization efforts
Aim: Aggregated species occurrence data are increasingly accessible through public databases for the analysis of temporal trends in the geographic distributions of species. However, biases in these data present challenges for statistical inference. We assessed potential biases in data available through GBIF on the occurrences of four flower‐visiting taxa: bees (Anthophila), hoverflies (Syrphidae), leaf‐nosed bats (Phyllostomidae) and hummingbirds (Trochilidae). We also assessed whether and to what extent data mobilization efforts improved our ability to estimate trends in species' distributions. Location: The Neotropics. Methods: We used five data‐driven heuristics to screen the data for potential geographic, temporal and taxonomic biases. We began with a continental‐scale assessment of the data for all four taxa. We then identified two recent data mobilization efforts (2021) that drastically increased the quantity of records of bees collected in Chile available through GBIF. We compared the dataset before and after the addition of these new records in terms of their biases and estimated trends in species' distributions. Results: We found evidence of potential sampling biases for all taxa. The addition of newly‐mobilized records of bees in Chile decreased some biases but introduced others. Despite increasing the quantity of data for bees in Chile sixfold, estimates of trends in species' distributions derived using the postmobilization dataset were broadly similar to what would have been estimated before their introduction, albeit more precise. Main conclusions: Our results highlight the challenges associated with drawing robust inferences about trends in species' distributions using publicly available data. Mobilizing historic records will not always enable trend estimation because more data do not necessarily equal less bias. Analysts should carefully assess their data before conducting analyses: this might enable the estimation of more robust trends and help to identify strategies for effective data mobilization. Our study also reinforces the need for targeted monitoring of pollinators worldwide.
Sample Selection Bias and Presence-Only Distribution Models: Implications for Background and Pseudo-Absence Data
Most methods for modeling species distributions from occurrence records require additional data representing the range of environmental conditions in the modeled region. These data, called background or pseudo-absence data, are usually drawn at random from the entire region, whereas occurrence collection is often spatially biased toward easily accessed areas. Since the spatial bias generally results in environmental bias, the difference between occurrence collection and background sampling may lead to inaccurate models. To correct the estimation, we propose choosing background data with the same bias as occurrence data. We investigate theoretical and practical implications of this approach. Accurate information about spatial bias is usually lacking, so explicit biased sampling of background sites may not be possible. However, it is likely that an entire target group of species observed by similar methods will share similar bias. We therefore explore the use of all occurrences within a target group as biased background data. We compare model performance using target-group background and randomly sampled background on a comprehensive collection of data for 226 species from diverse regions of the world. We find that target-group background improves average performance for all the modeling methods we consider, with the choice of background data having as large an effect on predictive performance as the choice of modeling method. The performance improvement due to target-group background is greatest when there is strong bias in the target-group presence records. Our approach applies to regression-based modeling methods that have been adapted for use with occurrence data, such as generalized linear or additive models and boosted regression trees, and to Maxent, a probability density estimation method. We argue that increased awareness of the implications of spatial bias in surveys, and possible modeling remedies, will substantially improve predictions of species distributions.
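The contrast between randomly sampled background and target-group background can be sketched as follows. This is only an illustration of the idea of pooling all target-group occurrences as background; the grid, the species names, the record list, and both function names are hypothetical, and no distribution model is fitted.

```python
import random


def uniform_background(all_cells, n, rng):
    """Background drawn at random from the whole region (ignores survey effort)."""
    return [rng.choice(all_cells) for _ in range(n)]


def target_group_background(records):
    """Background made of every occurrence cell in the target group.

    Duplicates are kept on purpose, so the background density mirrors
    where observers actually went.
    """
    return [cell for _species, cell in records]


# Hypothetical 10 x 10 grid; column x == 0 stands for an easily accessed road.
all_cells = [(x, y) for x in range(10) for y in range(10)]

# Hypothetical target-group records as (species, cell), mostly near the road.
records = [
    ("sp_a", (0, 1)),
    ("sp_b", (0, 2)),
    ("sp_c", (0, 3)),
    ("sp_a", (5, 5)),
]

rng = random.Random(0)
uniform = uniform_background(all_cells, 1000, rng)
target_group = target_group_background(records)


def share_near_road(cells):
    """Fraction of cells that lie in the road column."""
    return sum(1 for x, _ in cells if x == 0) / len(cells)


print(f"uniform: {share_near_road(uniform):.2f}, "
      f"target group: {share_near_road(target_group):.2f}")
```

The target-group background inherits the roadside concentration of the records, which is the property the abstract relies on to cancel the bias shared by occurrences and background.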
The impact of non-response bias due to sampling in public health studies: A comparison of voluntary versus mandatory recruitment in a Dutch national survey on adolescent health
Background: In public health monitoring of young people it is critical to understand the effects of selective non-response, in particular when a controversial topic is involved like substance abuse or sexual behaviour. Research that is dependent upon voluntary subject participation is particularly vulnerable to sampling bias. As respondents whose participation is hardest to elicit on a voluntary basis are also more likely to report risk behaviour, this potentially leads to underestimation of risk factor prevalence. Inviting adolescents to participate in a home-sent postal survey is a typical voluntary recruitment strategy with high non-response, as opposed to mandatory participation during school time. This study examines the extent to which prevalence estimates of adolescent health-related characteristics are biased due to different sampling methods, and whether this also biases within-subject analyses. Methods: Cross-sectional datasets collected in 2011 in Twente and IJsselland, two similar and adjacent regions in the Netherlands, were used. In total, 9360 youngsters in a mandatory sample (Twente) and 1952 youngsters in a voluntary sample (IJsselland) participated in the study. To test whether the samples differed on health-related variables, we conducted both univariate and multivariable logistic regression analyses controlling for any demographic difference between the samples. Additional multivariable logistic regressions were conducted to examine moderating effects of sampling method on associations between health-related variables. Results: As expected, females, older individuals, as well as individuals with higher education levels, were over-represented in the voluntary sample, compared to the mandatory sample. Respondents in the voluntary sample tended to smoke less, consume less alcohol (ever, lifetime, and past four weeks), have better mental health, have better subjective health status, have more positive school experiences and have less sexual intercourse than respondents in the mandatory sample. No moderating effects were found for sampling method on associations between variables. Conclusions: This is one of the first studies to provide strong evidence that voluntary recruitment may lead to a strong non-response bias in health-related prevalence estimates in adolescents, as compared to mandatory recruitment. The resulting underestimation in prevalence of health behaviours and well-being measures appeared large, up to a four-fold lower proportion for self-reported alcohol consumption. Correlations between variables, though, appeared to be insensitive to sampling bias.
Target-group backgrounds prove effective at correcting sampling bias in Maxent models
Aim: Accounting for sampling bias is the greatest challenge facing presence‐only and presence‐background species distribution models; no matter what type of model is chosen, using biased data will mask the true relationship between occurrences and environmental predictors. To address this issue, we review four established bias correction techniques, using empirical occurrences with known sampling effort, and virtual species with known distributions. Innovation: Occurrence data come from a national recording scheme of hoverflies (Syrphidae) in Great Britain, spanning 1983–2002. Target‐group backgrounds, distance‐restricted backgrounds, travel time to cities and human population density were used to account for sampling bias in 58 species of hoverfly. Distributions generated by bias correction techniques were compared in geographical space to the distribution produced accounting for known sampling effort, using Schoener's distance, centroid shifts and range size changes. To validate our results, we performed the same comparisons using 50 randomly generated virtual species. We used sampling effort from the hoverfly recording scheme to structure our biased sampling regime, emulating complex real‐life sampling bias. Main conclusions: Models made without any correction typically produced distributions that mapped sampling effort rather than the underlying habitat suitability. Target‐group backgrounds performed the best at emulating sampling effort and unbiased virtual occurrences, but also showed signs of overcompensation in places. Other methods performed better than no‐correction, but often differences were difficult to visually detect. In line with previous studies, when sampling effort is unknown, target‐group backgrounds provide a useful tool for reducing the effect of sampling bias. Models should be visually inspected for biological realism to identify any areas of potential overcompensation. Given the disparity between corrected and un‐corrected models, sampling bias constitutes a major source of error in species distribution modelling, and more research is needed to confidently address the issue.
Bunching up the background betters bias in species distribution models
Sets of presence records used to model species’ distributions typically consist of observations collected opportunistically rather than systematically. As a result, sampling probability is geographically uneven, which may confound the model's characterization of the species’ distribution. Modelers frequently address sampling bias by manipulating training data: either subsampling presence data or creating a similar spatial bias in non‐presence background data. We tested a new method, which we call ‘background thickening’, in the latter category. Background thickening entails concentrating background locations around presence locations in proportion to presence location density. We compared background thickening to two established sampling bias correction methods – target group background selection and presence thinning – using simulated data and data from a case study. In the case study, background thickening and presence thinning performed similarly well, both producing better model discrimination than target group background selection, and better model calibration than models without correction. In the simulation, background thickening performed better than presence thinning when the number of simulated presence locations was low, and vice versa. We discuss drawbacks to target group background selection, why background thickening and presence thinning are conservative but robust sampling bias correction methods, and why background thickening is better than presence thinning for small sample sizes. Particularly, background thickening is advantageous for treating sampling bias when data are scarce because it avoids discarding presence records.
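A rough sketch of the background thickening idea follows. The abstract only says that background locations are concentrated around presence locations in proportion to presence density; the way this is done below (a fixed number of points drawn uniformly inside a disc around each presence) is an assumption, and `radius`, `per_presence`, and the coordinates are hypothetical.

```python
import math
import random


def thickened_background(presences, radius, per_presence, seed=0):
    """Draw background points around each presence location.

    Each presence contributes `per_presence` points drawn uniformly from a
    disc of the given radius, so areas with many presences receive
    proportionally more background points. Illustrative only; the published
    method may differ in detail.
    """
    rng = random.Random(seed)
    background = []
    for px, py in presences:
        for _ in range(per_presence):
            # sqrt makes the draw uniform over the disc area rather than
            # clustered at the centre.
            r = radius * math.sqrt(rng.random())
            theta = 2.0 * math.pi * rng.random()
            background.append((px + r * math.cos(theta), py + r * math.sin(theta)))
    return background


# Hypothetical presences: three clustered records and one isolated record.
presences = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (10.0, 10.0)]
background = thickened_background(presences, radius=1.0, per_presence=25)

near_cluster = sum(1 for x, y in background if math.hypot(x, y) < 3.0)
print(f"{near_cluster} of {len(background)} background points lie near the cluster")
```

No presence record is discarded in this scheme, which is the advantage the abstract highlights for small samples.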
The Impact of Nonresponse Rates on Nonresponse Bias: A Meta-Analysis
Fifty-nine methodological studies were designed to estimate the magnitude of nonresponse bias in statistics of interest. These studies use a variety of designs: sampling frames with rich variables, data from administrative records matched to sample cases, use of screening-interview data to describe nonrespondents to main interviews, followup of nonrespondents to initial phases of field effort, and measures of behavior intentions to respond to a survey. This permits exploration of which circumstances produce a relationship between nonresponse rates and nonresponse bias and which do not. The predictors are design features of the surveys, characteristics of the sample, and attributes of the survey statistics computed in the surveys.