Catalogue Search | MBRL
Explore the vast range of titles available.
22,952 result(s) for "Survey sampling"
Speeding Up MCMC by Efficient Data Subsampling
by Villani, Mattias; Quiroz, Matias; Tran, Minh-Ngoc
in Algorithms; Americans; Bayesian inference
2019
We propose subsampling Markov chain Monte Carlo (MCMC), an MCMC framework where the likelihood function for n observations is estimated from a random subset of m observations. We introduce a highly efficient unbiased estimator of the log-likelihood based on control variates, such that the computing cost is much smaller than that of the full log-likelihood in standard MCMC. The likelihood estimate is bias-corrected and used in two dependent pseudo-marginal algorithms to sample from a perturbed posterior, for which we derive the asymptotic error with respect to n and m, respectively. We propose a practical estimator of the error and show that the error is negligible even for a very small m in our applications. We demonstrate that subsampling MCMC is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and that it outperforms other subsampling methods for MCMC proposed in the literature. Supplementary materials for this article are available online.
Journal Article
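The control-variate idea behind this abstract can be illustrated with a toy model. The sketch below is not the authors' code; it assumes a hypothetical N(θ, 1) model, uses a second-order Taylor expansion of each log-density term around a reference point θ̄ as the control variates, and subsamples with replacement for simplicity. For this Gaussian toy model the Taylor terms happen to be exact, so the subsampled estimate recovers the full log-likelihood.

```python
import numpy as np

def loglik_terms(theta, y):
    """Per-observation log-likelihood for a toy N(theta, 1) model."""
    return -0.5 * np.log(2 * np.pi) - 0.5 * (y - theta) ** 2

def control_variates(theta, theta_bar, y):
    """Second-order Taylor expansion of each term around theta_bar
    (exact for this Gaussian toy model, approximate in general)."""
    l0 = loglik_terms(theta_bar, y)
    grad = y - theta_bar             # d/dtheta of each term at theta_bar
    hess = -np.ones_like(y)          # second derivative of each term
    d = theta - theta_bar
    return l0 + grad * d + 0.5 * hess * d ** 2

def subsampled_loglik(theta, theta_bar, y, m, rng):
    """Unbiased difference estimator of the full log-likelihood:
    sum of control variates plus a scaled subsample of the residuals."""
    n = len(y)
    q = control_variates(theta, theta_bar, y)
    idx = rng.integers(0, n, size=m)           # subsample with replacement
    resid = loglik_terms(theta, y[idx]) - q[idx]
    return q.sum() + n * resid.mean()

rng = np.random.default_rng(0)
y = rng.normal(1.0, 1.0, size=10_000)
full = loglik_terms(1.2, y).sum()                    # O(n) cost
est = subsampled_loglik(1.2, 1.0, y, m=200, rng=rng)  # O(m) cost per draw
```

In the paper's setting the residuals are small rather than zero, and the bias-corrected estimate is plugged into a pseudo-marginal MCMC sampler; the point of the sketch is only the difference-estimator structure.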
Diagnostics for respondent-driven sampling
by Johnston, Lisa G.; Salganik, Matthew J.; Gile, Krista J.
in Acquired immune deficiency syndrome; AIDS; Coupons
2015
Respondent-driven sampling (RDS) is a widely used method for sampling from hard-to-reach human populations, especially populations at higher risk for human immunodeficiency virus or acquired immune deficiency syndrome. Data are collected through a peer referral process over social networks. RDS has proven practical for data collection in many difficult settings and has been adopted by leading public health organizations around the world. Unfortunately, inference from RDS data requires many strong assumptions because the sampling design is partially beyond the control of the researcher and not fully observable. We introduce diagnostic tools for most of these assumptions and apply them in 12 high risk populations. These diagnostics empower researchers to understand their RDS data better and encourage future statistical research on RDS sampling and inference.
Journal Article
Improved Inference for Respondent-Driven Sampling Data With Application to HIV Prevalence Estimation
2011
Respondent-driven sampling is a form of link-tracing network sampling, which is widely used to study hard-to-reach populations, often to estimate population proportions. Previous treatments of this process have used a with-replacement approximation, which we show induces bias in estimates for large sample fractions and differential network connectedness by characteristic of interest. We present a treatment of respondent-driven sampling as a successive sampling process. Unlike existing representations, our approach respects the essential without-replacement feature of the process, while converging to an existing with-replacement representation for small sample fractions, and to the sample mean for a full-population sample. We present a successive-sampling based estimator for population means based on respondent-driven sampling data, and demonstrate its superior performance when the size of the hidden population is known. We present sensitivity analyses for unknown population sizes. In addition, we note that like other existing estimators, our new estimator is subject to bias induced by the selection of the initial sample. Using data collected among three populations in two countries, we illustrate the application of this approach to populations with varying characteristics. We conclude that the successive sampling estimator improves on existing estimators, and can also be used as a diagnostic tool when population size is not known. This article has supplementary material online.
Journal Article
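The successive-sampling model this abstract contrasts with the with-replacement approximation can be sketched directly: units are drawn one at a time, without replacement, with probability proportional to degree among the units not yet sampled. The code below is a minimal illustration with hypothetical names, not the paper's estimator.

```python
import numpy as np

def successive_sample(degrees, n_sample, rng):
    """Draw n_sample units one at a time, each with probability
    proportional to its degree among the not-yet-sampled units
    (probability-proportional-to-size without replacement)."""
    remaining = list(range(len(degrees)))
    sample = []
    for _ in range(n_sample):
        w = np.array([degrees[i] for i in remaining], dtype=float)
        pick = rng.choice(len(remaining), p=w / w.sum())
        sample.append(remaining.pop(pick))
    return sample

rng = np.random.default_rng(1)
degrees = np.array([1, 1, 1, 10, 10, 10])  # high-degree units favoured early
s = successive_sample(degrees, 4, rng)
```

As the sample fraction shrinks, this process is well approximated by with-replacement degree-weighted draws, which is the existing representation the paper shows its estimator converges to.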
Estimating the size of populations at high risk for HIV using respondent‐driven sampling data
by Handcock, Mark S.; Gile, Krista J.; Mar, Corinne M.
in Acquired immune deficiency syndrome; AIDS; BIOMETRIC PRACTICE
2015
The study of hard‐to‐reach populations presents significant challenges. Typically, a sampling frame is not available, and population members are difficult to identify or recruit from broader sampling frames. This is especially true of populations at high risk for HIV/AIDS. Respondent‐driven sampling (RDS) is often used in such settings with the primary goal of estimating the prevalence of infection. In such populations, the number of people at risk for infection and the number of people infected are of fundamental importance. This article presents a case‐study of the estimation of the size of the hard‐to‐reach population based on data collected through RDS. We study two populations of female sex workers and men‐who‐have‐sex‐with‐men in El Salvador. The approach is Bayesian and we consider different forms of prior information, including using the UNAIDS population size guidelines for this region. We show that the method is able to quantify the amount of information on population size available in RDS samples. As separate validation, we compare our results to those estimated by extrapolating from a capture–recapture study of El Salvadorian cities. The results of our case‐study are largely comparable to those of the capture–recapture study when they differ from the UNAIDS guidelines. Our method is widely applicable to data from RDS studies and we provide a software package to facilitate this.
Journal Article
Imaging features and safety and efficacy of endovascular stroke treatment: a meta-analysis of individual patient-level data
by Ringleb, P; Reiff, T; Hopyan, J
in 62 Statistics; 62D05 Sampling theory, sample surveys; 65 Numerical analysis
2018
Evidence regarding whether imaging can be used effectively to select patients for endovascular thrombectomy (EVT) is scarce. We aimed to investigate the association between baseline imaging features and safety and efficacy of EVT in acute ischaemic stroke caused by anterior large-vessel occlusion.
In this meta-analysis of individual patient-level data, the HERMES collaboration identified in PubMed seven randomised trials in endovascular stroke that compared EVT with standard medical therapy, published between Jan 1, 2010, and Oct 31, 2017. Only trials that required vessel imaging to identify patients with proximal anterior circulation ischaemic stroke and that used predominantly stent retrievers or second-generation neurothrombectomy devices in the EVT group were included. Risk of bias was assessed with the Cochrane handbook methodology. Central investigators, masked to clinical information other than stroke side, categorised baseline imaging features of ischaemic change with the Alberta Stroke Program Early CT Score (ASPECTS) or according to involvement of more than 33% of middle cerebral artery territory, and by thrombus volume, hyperdensity, and collateral status. The primary endpoint was neurological functional disability scored on the modified Rankin Scale (mRS) score at 90 days after randomisation. Safety outcomes included symptomatic intracranial haemorrhage, parenchymal haematoma type 2 within 5 days of randomisation, and mortality within 90 days. For the primary analysis, we used mixed-effects ordinal logistic regression adjusted for age, sex, National Institutes of Health Stroke Scale score at admission, intravenous alteplase, and time from onset to randomisation, and we used interaction terms to test whether imaging categorisation at baseline modifies the association between treatment and outcome. This meta-analysis was prospectively designed by the HERMES executive committee but has not been registered.
Among 1764 pooled patients, 871 were allocated to the EVT group and 893 to the control group. Risk of bias was low except in the THRACE study, which used unblinded assessment of outcomes 90 days after randomisation and MRI predominantly as the primary baseline imaging tool. The overall treatment effect favoured EVT (adjusted common odds ratio [cOR] for a shift towards better outcome on the mRS 2·00, 95% CI 1·69–2·38; p<0·0001). EVT achieved better outcomes at 90 days than standard medical therapy alone across a broad range of baseline imaging categories. Mortality at 90 days (14·7% vs 17·3%, p=0·15), symptomatic intracranial haemorrhage (3·8% vs 3·5%, p=0·90), and parenchymal haematoma type 2 (5·6% vs 4·8%, p=0·52) did not differ between the EVT and control groups. No treatment effect modification by baseline imaging features was noted for mortality at 90 days and parenchymal haematoma type 2. Among patients with ASPECTS 0–4, symptomatic intracranial haemorrhage was seen in ten (19%) of 52 patients in the EVT group versus three (5%) of 66 patients in the control group (adjusted cOR 3·94, 95% CI 0·94–16·49; p interaction=0·025), and among patients with more than 33% involvement of middle cerebral artery territory, symptomatic intracranial haemorrhage was observed in 15 (14%) of 108 patients in the EVT group versus four (4%) of 113 patients in the control group (4·17, 1·30–13·44; p interaction=0·012).
EVT achieves better outcomes at 90 days than standard medical therapy across a broad range of baseline imaging categories, including infarcts affecting more than 33% of middle cerebral artery territory or ASPECTS less than 6, although in these patients the risk of symptomatic intracranial haemorrhage was higher in the EVT group than the control group. This analysis provides preliminary evidence for potential use of EVT in patients with large infarcts at baseline.
Medtronic.
Journal Article
Using Social Networks to Sample Migrants and Study the Complexity of Contemporary Immigration: An Evaluation Study
by Le Barbenchon, Claire; Merli, M. Giovanna; Stolte, Allison
in Asian cultural groups; Cost analysis; Demography
2022
We test the effectiveness of a link-tracing sampling approach—network sampling with memory (NSM)—to recruit samples of rare immigrant populations with an application among Chinese immigrants in the Raleigh-Durham area of North Carolina. NSM uses the population network revealed by data from the survey to improve the efficiency of link-tracing sampling and has been shown to substantially reduce design effects in simulated sampling. Our goals are to (1) show that it is possible to recruit a probability sample of a locally rare immigrant group using NSM and achieve high response rates; (2) demonstrate the feasibility of the collection and benefits of new forms of network data that transcend kinship networks in existing surveys and can address unresolved questions about the role of social networks in migration decisions, the maintenance of transnationalism, and the process of social incorporation; and (3) test the accuracy of the NSM approach for recruiting immigrant samples by comparison with the American Community Survey. Our results indicate feasibility, high performance, cost-effectiveness, and accuracy of the NSM approach to sample immigrants for studies of local immigrant communities. This approach can also be extended to recruit multisite samples of immigrants at origin and destination.
Journal Article
Responsive design for household surveys: tools for actively controlling survey errors and costs
by Heeringa, Steven G.; Groves, Robert M.
in Applications; Biology, psychology, social sciences; Cost efficiency
2006
Over the past few years surveys have expanded to new populations, have incorporated measurement of new and more complex substantive issues and have adopted new data collection tools. At the same time there has been a growing reluctance among many household populations to participate in surveys. These factors have combined to present survey designers and survey researchers with increased uncertainty about the performance of any given survey design at any particular point in time. This uncertainty has, in turn, challenged the survey practitioner's ability to control the cost of data collection and quality of resulting statistics. The development of computer-assisted methods for data collection has provided survey researchers with tools to capture a variety of process data ('paradata') that can be used to inform cost-quality trade-off decisions in realtime. The ability to monitor continually the streams of process data and survey data creates the opportunity to alter the design during the course of data collection to improve survey cost efficiency and to achieve more precise, less biased estimates. We label such surveys as 'responsive designs'. The paper defines responsive design and uses examples to illustrate the responsive use of paradata to guide mid-survey decisions affecting the non-response, measurement and sampling variance properties of resulting statistics.
Journal Article
Surveying migrant households: a comparison of census-based, snowball and intercept point surveys
2009
Few representative surveys of households of migrants exist, limiting our ability to study the effects of international migration on sending families. We report the results of an experiment that was designed to compare the performance of three alternative survey methods in collecting data from Japanese-Brazilian families, many of whom send migrants to Japan. The three surveys that were conducted were households selected randomly from a door-to-door listing using the Brazilian census to select census blocks, a snowball survey using Nikkei community groups to select the seeds and an intercept point survey that was collected at Nikkei community gatherings, ethnic grocery stores, sports clubs and other locations where family members of migrants are likely to congregate. We analyse how closely well-designed snowball and intercept point surveys can approach the much more expensive census-based method in terms of giving information on the characteristics of migrants, the level of remittances received and the incidence and determinants of return migration.
Journal Article
Pseudo-Empirical Likelihood Inference for Multiple Frame Surveys
2010
This article presents a pseudo-empirical likelihood approach to inference for multiple-frame surveys. We establish a unified framework for point and interval estimation of finite population parameters, and show that inferences on the parameters of interest making effective use of different types of auxiliary population information can be conveniently carried out through the constrained maximization of the pseudo-empirical likelihood function. Confidence intervals are constructed using either the asymptotic χ² distribution of an adjusted pseudo-empirical likelihood ratio statistic or a bootstrap calibration method. Simulation results based on Statistics Canada's Family Expenditure Survey data show that the proposed methods perform well in finite samples for both point and interval estimation. In particular, a multiplicity-based pseudo-empirical likelihood method is proposed. This method is easily used for multiple-frame surveys with more than two frames and does not require complete frame membership information. The proposed pseudo-empirical likelihood ratio confidence intervals have a clear advantage over the conventional normal approximation-based intervals in estimating population proportions of rare items, a scenario that often motivates the use of multiple-frame surveys. All related computational problems can be handled using existing algorithms for pseudo-empirical likelihood methods with single-frame surveys.
Journal Article
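The constrained maximization this abstract refers to can be sketched in the simplest single-frame case: maximize Σᵢ log pᵢ subject to Σ pᵢ = 1 and a scalar calibration constraint Σ pᵢ xᵢ = μ. A minimal sketch, assuming uniform design weights (so this is ordinary empirical likelihood, not the paper's multiple-frame pseudo-EL) and solving for the Lagrange multiplier by Newton's method:

```python
import numpy as np

def el_weights(x, mu, tol=1e-10, max_iter=100):
    """Empirical-likelihood weights p_i = 1 / (n * (1 + lam * (x_i - mu))),
    with the multiplier lam chosen so that sum(p_i * x_i) == mu.
    At the solution, sum(p_i) == 1 holds automatically."""
    n = len(x)
    z = x - mu
    lam = 0.0
    for _ in range(max_iter):
        denom = 1.0 + lam * z
        g = np.sum(z / denom) / n           # constraint equation in lam
        if abs(g) < tol:
            break
        dg = -np.sum(z ** 2 / denom ** 2) / n  # derivative wrt lam
        lam -= g / dg                        # Newton step
    return 1.0 / (n * (1.0 + lam * z))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
p = el_weights(x, mu=2.5)   # weights sum to 1 and calibrate to mu
```

The multiple-frame method in the article replaces the uniform weights with survey design weights and handles several calibration constraints at once, but the Lagrange-multiplier structure is the same.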
Facing the Nonresponse Challenge
2013
This article provides a brief overview of key trends in the survey research to address the nonresponse challenge. Noteworthy are efforts to develop new quality measures and to combine several data sources to enhance either the data collection process or the quality of resulting survey estimates. Mixtures of survey data collection modes and less burdensome survey designs are additional steps taken by survey researchers to address nonresponse.
Journal Article