Catalogue Search | MBRL
93 result(s) for "sandwich estimators"
Models as Approximations I
by Pitkin, Emil; Brown, Lawrence; Zhao, Linda
in Approximation; Diagnostic systems; Estimating techniques
2019
In the early 1980s, Halbert White inaugurated a "model-robust" form of statistical inference based on the "sandwich estimator" of standard error. This estimator is known to be "heteroskedasticity-consistent," but it is less well known to be "nonlinearity-consistent" as well. Nonlinearity, however, raises fundamental issues because in its presence regressors are not ancillary, hence cannot be treated as fixed. The consequences are deep: (1) population slopes need to be reinterpreted as statistical functionals obtained from OLS fits to largely arbitrary joint x-y distributions; (2) the meaning of slope parameters needs to be rethought; (3) the regressor distribution affects the slope parameters; (4) randomness of the regressors becomes a source of sampling variability in slope estimates of order 1/√N; (5) inference needs to be based on model-robust standard errors, including sandwich estimators or the x-y bootstrap. In theory, model-robust and model-trusting standard errors can deviate by arbitrary magnitudes either way. In practice, significant deviations between them can be detected with a diagnostic test.
Journal Article
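The model-trusting versus model-robust contrast in this abstract can be made concrete for simple regression. The sketch below (plain Python with illustrative variable names; not the authors' code) computes both the classical OLS standard error and White's HC0 sandwich standard error for a slope:

```python
import math
import random

def ols_slope_ses(x, y):
    """Fit y = a + b*x by OLS; return (b_hat, classical SE, HC0 sandwich SE).

    Illustrative sketch only: in simple regression with an intercept, the
    bread (X'X)^{-1} and the meat X' diag(e^2) X reduce to scalar sums in
    centered x.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Model-trusting ("classical") variance: s^2 / Sxx with s^2 = RSS/(n-2).
    s2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(s2 / sxx)
    # Model-robust HC0 sandwich: sum((x_i - xbar)^2 * e_i^2) / Sxx^2.
    meat = sum(((xi - xbar) ** 2) * (e * e) for xi, e in zip(x, resid))
    se_sandwich = math.sqrt(meat / sxx ** 2)
    return b, se_classical, se_sandwich

# Heteroskedastic data: the noise scale grows with x, so the two SEs differ.
rng = random.Random(0)
x = [i / 200 for i in range(200)]
y = [1.0 + 2.0 * xi + (0.1 + xi) * rng.gauss(0, 1) for xi in x]
b, se_c, se_s = ols_slope_ses(x, y)
```

Under heteroskedastic noise like this the two standard errors can differ noticeably; the diagnostic test mentioned in the abstract formalizes when such a gap is significant.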
Agnostic Notes on Regression Adjustments to Experimental Data: Reexamining Freedman's Critique
Freedman [Adv. in Appl. Math. 40 (2008) 180–193; Ann. Appl. Stat. 2 (2008) 176–196] critiqued ordinary least squares regression adjustment of estimated treatment effects in randomized experiments, using Neyman's model for randomization inference. Contrary to conventional wisdom, he argued that adjustment can lead to worsened asymptotic precision, invalid measures of precision, and small-sample bias. This paper shows that in sufficiently large samples, those problems are either minor or easily fixed. OLS adjustment cannot hurt asymptotic precision when a full set of treatment–covariate interactions is included. Asymptotically valid confidence intervals can be constructed with the Huber–White sandwich standard error estimator. Checks on the asymptotic approximations are illustrated with data from Angrist, Lang, and Oreopoulos's [Am. Econ. J.: Appl. Econ. 1:1 (2009) 136–163] evaluation of strategies to improve college students' achievement. The strongest reasons to support Freedman's preference for unadjusted estimates are transparency and the dangers of specification search.
Journal Article
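The claim that OLS adjustment cannot hurt asymptotic precision relies on including a full set of treatment–covariate interactions, which makes the adjusted estimate equal the difference between within-arm regression predictions evaluated at the overall covariate mean. A minimal sketch under that interpretation, for a single covariate and with a hypothetical helper name:

```python
def lin_adjusted_ate(y, x, t):
    """ATE estimate equivalent to OLS of y on t, centered x, and their
    interaction: fit a separate simple regression within each arm, predict
    both at the overall covariate mean, and take the difference.
    Illustrative sketch, not the paper's code."""
    xbar = sum(x) / len(x)

    def arm_prediction(arm):
        ys = [yi for yi, ti in zip(y, t) if ti == arm]
        xs = [xi for xi, ti in zip(x, t) if ti == arm]
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        sxx = sum((xi - mx) ** 2 for xi in xs)
        b = sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys)) / sxx
        return my + b * (xbar - mx)  # within-arm fit, evaluated at xbar

    return arm_prediction(1) - arm_prediction(0)
```

With noise-free data y = x + 3·t, the estimate recovers the effect exactly: `lin_adjusted_ate([0, 4, 2, 6, 4, 8], [0, 1, 2, 3, 4, 5], [0, 1, 0, 1, 0, 1])` returns 3.0.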
Models as Approximations II
by Buja, Andreas; Brown, Lawrence; Zhao, Linda
in Approximation; Diagnostic systems; Estimating techniques
2019
We develop a model-free theory of general types of parametric regression for i.i.d. observations. The theory replaces the parameters of parametric models with statistical functionals, to be called "regression functionals," defined on large nonparametric classes of joint x-y distributions, without assuming a correct model. Parametric models are reduced to heuristics to suggest plausible objective functions. An example of a regression functional is the vector of slopes of linear equations fitted by OLS to largely arbitrary x-y distributions, without assuming a linear model (see Part I). More generally, regression functionals can be defined by minimizing objective functions, solving estimating equations, or with ad hoc constructions. In this framework, it is possible to achieve the following: (1) define a notion of "well-specification" for regression functionals that replaces the notion of correct specification of models, (2) propose a well-specification diagnostic for regression functionals based on reweighting distributions and data, (3) decompose sampling variability of regression functionals into two sources, one due to the conditional response distribution and another due to the regressor distribution interacting with misspecification, both of order N^(-1/2), (4) exhibit plug-in/sandwich estimators of standard error as limit cases of x-y bootstrap estimators, and (5) provide theoretical heuristics to indicate that x-y bootstrap standard errors may generally be preferred over sandwich estimators.
Journal Article
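Point (5) of the abstract recommends the x-y (pairs) bootstrap over the plug-in sandwich estimator. A minimal sketch that resamples (x, y) pairs jointly, so regressor randomness enters the standard error (illustrative function names and defaults, not the authors' implementation):

```python
import math
import random

def ols_slope(pairs):
    # OLS slope of y on x for a list of (x, y) tuples.
    n = len(pairs)
    xbar = sum(p[0] for p in pairs) / n
    ybar = sum(p[1] for p in pairs) / n
    sxx = sum((p[0] - xbar) ** 2 for p in pairs)
    return sum((p[0] - xbar) * (p[1] - ybar) for p in pairs) / sxx

def xy_bootstrap_se(pairs, reps=500, seed=0):
    """Model-robust SE of the OLS slope via the x-y bootstrap: resample
    whole (x, y) pairs with replacement and take the standard deviation of
    the refitted slopes. reps and seed are arbitrary illustrative choices."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        sample = [rng.choice(pairs) for _ in pairs]
        slopes.append(ols_slope(sample))
    m = sum(slopes) / reps
    return math.sqrt(sum((s - m) ** 2 for s in slopes) / (reps - 1))
```

Because pairs are resampled jointly rather than conditioning on fixed regressors, the resulting SE captures both sources of sampling variability decomposed in point (3).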
Comparison of Two Bias-Corrected Covariance Estimators for Generalized Estimating Equations
by Bangdiwala, Shrikant I.; Qaqish, Bahjat F.; Suchindran, Chirayath
in alcohol drinking; Algorithms; Analytical estimating
2007
Mancl and DeRouen (2001, Biometrics 57, 126–134) and Kauermann and Carroll (2001, JASA 96, 1387–1398) proposed alternative bias-corrected covariance estimators for parameter estimates from generalized estimating equations in regression models for marginal means. The finite sample properties of these estimators are compared to those of the uncorrected sandwich estimator that underestimates variances in small samples. Although the formula of Mancl and DeRouen generally overestimates variances, it often leads to coverage of 95% confidence intervals near the nominal level even in some situations with as few as 10 clusters. An explanation for these seemingly contradictory results is that the tendency to undercoverage resulting from the substantial variability of sandwich estimators counteracts the impact of overcorrecting the bias. However, these positive results do not generally hold; for small cluster sizes (e.g., <10) the Mancl–DeRouen estimator often results in overcoverage, and the bias-corrected covariance estimator of Kauermann and Carroll may be preferred. The methods are illustrated using data from a nested cross-sectional cluster intervention trial on reducing underage drinking.
Journal Article
Modelling approaches for meta-analyses with dependent effect sizes in ecology and evolution: A simulation study
by Yang, Yefeng; Williams, Coralie; Nakagawa, Shinichi
in cross-classified data; meta-regression; mixed-effects models
2025
In ecology and evolution, meta-analysis is an important tool to synthesise findings across separate studies and identify sources of heterogeneity. However, ecological and evolutionary data often exhibit complex dependence structures, such as shared sources of variation within studies, phylogenetic relationships and hierarchical sampling designs. Recent statistical advancements offer approaches for handling such complexities in dependence, yet these methods remain under-utilised or unfamiliar to ecologists and evolutionary biologists. We conducted extensive simulations to evaluate modelling approaches for handling dependence in effect sizes and sampling errors in ecological and evolutionary meta-analyses. We assessed the performance of multilevel models, incorporating an assumed sampling error variance–covariance (VCV) matrix (which accounts for within-study correlation), cluster robust variance estimation (CRVE) methods and their combination across different true within-study correlations. Finally, we showcased the applications of these models in two case studies of published meta-analyses. Multilevel models produced unbiased regression coefficient estimates, and when a sampling VCV matrix was used, it provided accurate random effect variance components estimates within and among studies. However, the latter had no impact on regression coefficient estimates if the model was misspecified. In simulations involving phylogenetic multilevel meta-analysis, models using CRVE methods generated narrower confidence intervals and lower coverage rates than the nominal expectations. The case study results showed the importance of considering a sampling error VCV matrix to improve the model fit. Our results provide clear modelling recommendations for ecologists and evolutionary biologists conducting meta-analyses. To improve the precision of variance component estimates, we recommend constructing a VCV matrix that accounts for dependencies in sampling errors within studies. Although CRVE methods provide robust inference under certain conditions, we caution against their use with crossed random effects, such as phylogenetic multilevel meta-analyses, as CRVE methods currently do not account for multi-way clustering and may inflate Type I error rates. Finally, we recommend using multilevel meta-analytic models to account for heterogeneity at all relevant hierarchical levels and to follow guidance on inference methods to ensure accurate coverage of the overall mean.
Journal Article
Extension of the modified Poisson regression model to prospective studies with correlated binary data
2013
The Poisson regression model using a sandwich variance estimator has become a viable alternative to the logistic regression model for the analysis of prospective studies with independent binary outcomes. The primary advantage of this approach is that it readily provides covariate-adjusted risk ratios and associated standard errors. In this article, the model is extended to studies with correlated binary outcomes as arise in longitudinal or cluster randomization studies. The key step involves a cluster-level grouping strategy for the computation of the middle term in the sandwich estimator. For a single binary exposure variable without covariate adjustment, this approach results in risk ratio estimates and standard errors that are identical to those found in the survey sampling literature. Simulation results suggest that it is reliable for studies with correlated binary data, provided the total number of clusters is at least 50. Data from observational and cluster randomized studies are used to illustrate the methods.
Journal Article
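The "cluster-level grouping strategy for the computation of the middle term" can be illustrated in the simplest setting: an intercept-only log-link model, where the risk estimate is the overall proportion and the sandwich meat sums score contributions (y − p) within each cluster before squaring. A hedged sketch under those assumptions, not the article's code:

```python
import math

def cluster_robust_log_risk(clusters):
    """Overall risk and cluster-robust sandwich SE of its log.

    clusters: list of lists of 0/1 outcomes, one inner list per cluster.
    For an intercept-only modified Poisson model with beta = log(p), the
    score per observation is (y - p) and the bread is n*p; the meat groups
    scores by cluster, which is the key step described in the abstract.
    """
    n = sum(len(c) for c in clusters)
    p = sum(sum(c) for c in clusters) / n
    # Middle term: square the WITHIN-cluster sums of score contributions.
    meat = sum((sum(c) - len(c) * p) ** 2 for c in clusters)
    se_log_p = math.sqrt(meat) / (n * p)
    return p, se_log_p
```

A confidence interval for the risk then follows from exp(log p ± 1.96 · SE); with a binary exposure, the same grouping applied arm by arm yields the risk-ratio standard error the abstract describes.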
Sample Size Determination for GEE Analyses of Stepped Wedge Cluster Randomized Trials
by Li, Fan; Turner, Elizabeth L.; Preisser, John S.
in Bias; BIOMETRIC PRACTICE: DISCUSSION PAPER; biometry
2018
In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator.
Journal Article
Fast and accurate modelling of longitudinal and repeated measures neuroimaging data
2014
Despite the growing importance of longitudinal data in neuroimaging, the standard analysis methods make restrictive or unrealistic assumptions (e.g., assumption of Compound Symmetry—the state of all equal variances and equal correlations—or spatially homogeneous longitudinal correlations). While some new methods have been proposed to more accurately account for such data, these methods are based on iterative algorithms that are slow and failure-prone. In this article, we propose the use of the Sandwich Estimator method which first estimates the parameters of interest with a simple Ordinary Least Square model and second estimates variances/covariances with the “so-called” Sandwich Estimator (SwE) which accounts for the within-subject correlation existing in longitudinal data. Here, we introduce the SwE method in its classic form, and we review and propose several adjustments to improve its behaviour, specifically in small samples. We use intensive Monte Carlo simulations to compare all considered adjustments and isolate the best combination for neuroimaging data. We also compare the SwE method to other popular methods and demonstrate its strengths and weaknesses. Finally, we analyse a highly unbalanced longitudinal dataset from the Alzheimer's Disease Neuroimaging Initiative and demonstrate the flexibility of the SwE method to fit within- and between-subject effects in a single model. Software implementing this SwE method has been made freely available at http://warwick.ac.uk/tenichols/SwE.
• Standard neuroimaging longitudinal methods may lead to invalid results.
• The Sandwich Estimator (SwE) method is proposed as an alternative approach.
• Adjustments to the standard SwE method are reviewed and proposed.
• Monte Carlo simulations are used to isolate a good combination of adjustments.
• The SwE method is shown to be a fast, easy to specify and accurate approach.
Journal Article
A Note on the Efficiency of Sandwich Covariance Matrix Estimation
by Kauermann, Göran; Carroll, Raymond J
in Analysis of covariance; Applications; Confidence interval
2001
The sandwich estimator, also known as robust covariance matrix estimator, heteroscedasticity-consistent covariance matrix estimate, or empirical covariance matrix estimator, has achieved increasing use in the econometric literature as well as with the growing popularity of generalized estimating equations. Its virtue is that it provides consistent estimates of the covariance matrix for parameter estimates even when the fitted parametric model fails to hold or is not even specified. Surprisingly though, there has been little discussion of properties of the sandwich method other than consistency. We investigate the sandwich estimator in quasi-likelihood models asymptotically, and in the linear case analytically. We show that under certain circumstances when the quasi-likelihood model is correct, the sandwich estimate is often far more variable than the usual parametric variance estimate. The increased variance is a fixed feature of the method and the price that one pays to obtain consistency even when the parametric model fails or when there is heteroscedasticity. We show that the additional variability directly affects the coverage probability of confidence intervals constructed from sandwich variance estimates. In fact, the use of sandwich variance estimates combined with t-distribution quantiles gives confidence intervals with coverage probability falling below the nominal value. We propose an adjustment to compensate for this fact.
Journal Article
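The paper's central point, that the sandwich estimate is more variable than the parametric estimate when the model is correct, can be checked with a small Monte Carlo. The sketch below (arbitrary seed, sample size and replication count) draws homoskedastic linear data repeatedly and compares the spread of the two SE estimates:

```python
import math
import random

def slope_ses(x, y):
    # Classical and HC0 sandwich SEs for the OLS slope of y on x.
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    e = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    se_c = math.sqrt(sum(ei * ei for ei in e) / (n - 2) / sxx)
    se_s = math.sqrt(sum(((xi - xbar) * ei) ** 2 for xi, ei in zip(x, e)) / sxx ** 2)
    return se_c, se_s

# Homoskedastic linear model, so the parametric model is correct and both
# estimators target the same quantity; only their variability differs.
rng = random.Random(1)
x = [i / 20 for i in range(20)]
cls, sw = [], []
for _ in range(1000):
    y = [1 + 2 * xi + rng.gauss(0, 1) for xi in x]
    c, s = slope_ses(x, y)
    cls.append(c)
    sw.append(s)

def sd(v):
    m = sum(v) / len(v)
    return math.sqrt(sum((vi - m) ** 2 for vi in v) / (len(v) - 1))
```

Across replications the sandwich SEs are noticeably more dispersed than the classical ones, which is the source of the undercoverage of sandwich-based t intervals that the paper's proposed adjustment compensates for.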
Does It Matter Where Countries Are? Proximity to Knowledge, Markets and Resources, and MNE Location Choices
by Nachum, Lilach; Zaheer, Srilata; Gross, Shulamith
in Applied sciences; Business studies; Countries
2008
We suggest that the proximity of a country to other countries is a factor that affects its choice as a multinational enterprise (MNE) location. We introduce the concept of a country's proximity to the global distribution of knowledge, markets, and resources, and frame this concept as a function of both geographic distance and the worldwide spatial distribution of these factors. We test our location model on a data set comprising 138,050 investments undertaken by U.S. MNEs worldwide. Our findings show that the proximity of a country to the rest of the world has a positive impact on MNEs choosing that country as a location. Proximity to the world's knowledge and markets are stronger drivers of location choice than is proximity to the world's resources, after accounting for the country's own endowments. Larger firms are able to benefit more from remote locations than smaller firms are.
Journal Article