96 result(s) for "Sandwich estimators"
Models as Approximations I
In the early 1980s, Halbert White inaugurated a "model-robust" form of statistical inference based on the "sandwich estimator" of standard error. This estimator is known to be "heteroskedasticity-consistent," but it is less well known to be "nonlinearity-consistent" as well. Nonlinearity, however, raises fundamental issues because in its presence regressors are not ancillary, hence cannot be treated as fixed. The consequences are deep: (1) population slopes need to be reinterpreted as statistical functionals obtained from OLS fits to largely arbitrary joint 𝑥-𝑦 distributions; (2) the meaning of slope parameters needs to be rethought; (3) the regressor distribution affects the slope parameters; (4) randomness of the regressors becomes a source of sampling variability in slope estimates of order 1/√𝑁; (5) inference needs to be based on model-robust standard errors, including sandwich estimators or the 𝑥-𝑦 bootstrap. In theory, model-robust and model-trusting standard errors can deviate by arbitrary magnitudes either way. In practice, significant deviations between them can be detected with a diagnostic test.
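To make the contrast between model-trusting and model-robust standard errors concrete, here is a minimal sketch, not taken from the paper. It assumes NumPy, a simulated heteroskedastic data set, and the basic (often called HC0) form of the sandwich estimator; the function name `ols_standard_errors` is hypothetical.

```python
import numpy as np

def ols_standard_errors(X, y):
    """Return (beta, classical_se, sandwich_se) for an OLS fit of y on X."""
    n, p = X.shape
    bread = np.linalg.inv(X.T @ X)                 # (X'X)^{-1}
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # Model-trusting covariance: assumes one common error variance.
    sigma2 = resid @ resid / (n - p)
    classical_cov = sigma2 * bread
    # Model-robust covariance: bread * meat * bread, meat = sum r_i^2 x_i x_i'.
    meat = X.T @ (X * resid[:, None] ** 2)
    sandwich_cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(classical_cov)), np.sqrt(np.diag(sandwich_cov))

# Simulated data (an assumption for illustration): noise sd grows with x.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 3, n)
y = 1.0 + 2.0 * x + rng.normal(0, 1, n) * x
X = np.column_stack([np.ones(n), x])

beta, se_classical, se_sandwich = ols_standard_errors(X, y)
print("slope:", beta[1])
print("classical SE:", se_classical[1], "sandwich SE:", se_sandwich[1])
```

The two standard errors need not agree when the error variance depends on the regressor. The abstract's point is that the same holds under nonlinearity.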
AGNOSTIC NOTES ON REGRESSION ADJUSTMENTS TO EXPERIMENTAL DATA: REEXAMINING FREEDMAN'S CRITIQUE
Freedman [Adv. in Appl. Math. 40 (2008) 180-193; Ann. Appl. Stat. 2 (2008) 176-196] critiqued ordinary least squares regression adjustment of estimated treatment effects in randomized experiments, using Neyman's model for randomization inference. Contrary to conventional wisdom, he argued that adjustment can lead to worsened asymptotic precision, invalid measures of precision, and small-sample bias. This paper shows that in sufficiently large samples, those problems are either minor or easily fixed. OLS adjustment cannot hurt asymptotic precision when a full set of treatment-covariate interactions is included. Asymptotically valid confidence intervals can be constructed with the Huber-White sandwich standard error estimator. Checks on the asymptotic approximations are illustrated with data from Angrist, Lang, and Oreopoulos's [Am. Econ. J.: Appl. Econ. 1:1 (2009) 136-163] evaluation of strategies to improve college students' achievement. The strongest reasons to support Freedman's preference for unadjusted estimates are transparency and the dangers of specification search.
Models as Approximations II
We develop a model-free theory of general types of parametric regression for i.i.d. observations. The theory replaces the parameters of parametric models with statistical functionals, to be called "regression functionals," defined on large nonparametric classes of joint 𝑥-𝑦 distributions, without assuming a correct model. Parametric models are reduced to heuristics to suggest plausible objective functions. An example of a regression functional is the vector of slopes of linear equations fitted by OLS to largely arbitrary 𝑥-𝑦 distributions, without assuming a linear model (see Part I). More generally, regression functionals can be defined by minimizing objective functions, solving estimating equations, or with ad hoc constructions. In this framework, it is possible to achieve the following: (1) define a notion of "well-specification" for regression functionals that replaces the notion of correct specification of models, (2) propose a well-specification diagnostic for regression functionals based on reweighting distributions and data, (3) decompose sampling variability of regression functionals into two sources, one due to the conditional response distribution and another due to the regressor distribution interacting with misspecification, both of order 𝑁^(−1/2), (4) exhibit plug-in/sandwich estimators of standard error as limit cases of 𝑥-𝑦 bootstrap estimators, and (5) provide theoretical heuristics to indicate that 𝑥-𝑦 bootstrap standard errors may generally be preferred over sandwich estimators.
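The relationship between 𝑥-𝑦 bootstrap and sandwich standard errors can be illustrated with a small sketch. This is an assumption-laden illustration and not the authors' code: it uses NumPy, a simulated data set in which a linear fit is deliberately misspecified, and hypothetical helper names (`ols_fit`, `sandwich_se`, `xy_bootstrap_se`).

```python
import numpy as np

def ols_fit(X, y):
    """OLS coefficients via the normal equations."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def sandwich_se(X, y):
    """Basic sandwich (plug-in) standard errors for the OLS coefficients."""
    beta = ols_fit(X, y)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)
    return np.sqrt(np.diag(bread @ meat @ bread))

def xy_bootstrap_se(X, y, n_boot=500, seed=0):
    """x-y (pairs) bootstrap: resample whole rows (x_i, y_i) with replacement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = ols_fit(X[idx], y[idx])
    return draws.std(axis=0, ddof=1)

# Simulated data (an assumption): the true mean is quadratic in x,
# so the fitted straight line is only an approximation.
rng = np.random.default_rng(42)
n = 500
x = rng.normal(size=n)
y = x ** 2 + x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

se_sand = sandwich_se(X, y)
se_boot = xy_bootstrap_se(X, y)
print("sandwich SE:", se_sand)
print("x-y bootstrap SE:", se_boot)
```

Resampling whole (𝑥, 𝑦) pairs, not residuals, is what lets the bootstrap reflect the randomness of the regressors.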
Comparison of Two Bias-Corrected Covariance Estimators for Generalized Estimating Equations
Mancl and DeRouen (2001, Biometrics 57, 126-134) and Kauermann and Carroll (2001, JASA 96, 1387-1398) proposed alternative bias-corrected covariance estimators for generalized estimating equations parameter estimates of regression models for marginal means. The finite sample properties of these estimators are compared to those of the uncorrected sandwich estimator that underestimates variances in small samples. Although the formula of Mancl and DeRouen generally overestimates variances, it often leads to coverage of 95% confidence intervals near the nominal level even in some situations with as few as 10 clusters. An explanation for these seemingly contradictory results is that the tendency to undercoverage resulting from the substantial variability of sandwich estimators counteracts the impact of overcorrecting the bias. However, these positive results do not generally hold; for small cluster sizes (e.g., <10) their estimator often results in overcoverage, and the bias-corrected covariance estimator of Kauermann and Carroll may be preferred. The methods are illustrated using data from a nested cross-sectional cluster intervention trial on reducing underage drinking.
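The ordering described above (uncorrected below Kauermann-Carroll below Mancl-DeRouen) can be seen in a simplified setting. The sketch below is not the GEE estimators from the paper. It assumes independent observations and a linear model, where, as we understand it, the analogous leverage corrections are the ones commonly labelled HC2 (residuals scaled by (1 − h)^(−1/2)) and HC3 (residuals scaled by (1 − h)^(−1)). The function name `hc_standard_errors` and the simulated data are hypothetical.

```python
import numpy as np

def hc_standard_errors(X, y):
    """Uncorrected (HC0) and leverage-corrected (HC2, HC3) sandwich SEs for OLS."""
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # Leverage h_ii = x_i' (X'X)^{-1} x_i for each observation.
    leverage = np.einsum("ij,jk,ik->i", X, bread, X)
    weights = {
        "HC0": 1.0,                           # no correction
        "HC2": 1.0 / (1.0 - leverage),        # squared residual / (1 - h)
        "HC3": 1.0 / (1.0 - leverage) ** 2,   # squared residual / (1 - h)^2
    }
    out = {}
    for name, w in weights.items():
        meat = X.T @ (X * (w * resid ** 2)[:, None])
        out[name] = np.sqrt(np.diag(bread @ meat @ bread))
    return out

# Small simulated sample (an assumption), where corrections matter most.
rng = np.random.default_rng(7)
n = 30
x = rng.normal(size=n)
y = 0.5 + 1.5 * x + rng.normal(size=n) * (1 + np.abs(x))
X = np.column_stack([np.ones(n), x])

ses = hc_standard_errors(X, y)
for name, se in ses.items():
    print(name, se)
```

Because every leverage lies between 0 and 1, the corrected standard errors are never smaller than the uncorrected ones, and the HC3-style correction is the largest of the three.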
Extension of the modified Poisson regression model to prospective studies with correlated binary data
The Poisson regression model using a sandwich variance estimator has become a viable alternative to the logistic regression model for the analysis of prospective studies with independent binary outcomes. The primary advantage of this approach is that it readily provides covariate-adjusted risk ratios and associated standard errors. In this article, the model is extended to studies with correlated binary outcomes as arise in longitudinal or cluster randomization studies. The key step involves a cluster-level grouping strategy for the computation of the middle term in the sandwich estimator. For a single binary exposure variable without covariate adjustment, this approach results in risk ratio estimates and standard errors that are identical to those found in the survey sampling literature. Simulation results suggest that it is reliable for studies with correlated binary data, provided the total number of clusters is at least 50. Data from observational and cluster randomized studies are used to illustrate the methods.
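The cluster-level grouping of the middle term can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes NumPy, a log-link Poisson fit by Newton's method with an independence working structure, and simulated cluster-randomized data. The names `fit_poisson` and `cluster_sandwich_cov` are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50, tol=1e-10):
    """Log-link Poisson regression by Newton's method (column 0 = intercept)."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        step = np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def cluster_sandwich_cov(X, y, beta, cluster_ids):
    """Sandwich covariance whose middle term sums scores within each cluster."""
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    scores = X * (y - mu)[:, None]          # per-observation score contributions
    meat = np.zeros((X.shape[1], X.shape[1]))
    for c in np.unique(cluster_ids):
        s_c = scores[cluster_ids == c].sum(axis=0)   # group at cluster level
        meat += np.outer(s_c, s_c)
    return bread @ meat @ bread

# Simulated cluster-randomized data (an assumption for illustration).
rng = np.random.default_rng(1)
n_clusters, cluster_size = 60, 20
cluster_ids = np.repeat(np.arange(n_clusters), cluster_size)
exposure = np.repeat(np.arange(n_clusters) % 2, cluster_size).astype(float)
cluster_shift = np.repeat(rng.normal(0, 0.05, n_clusters), cluster_size)
p = np.clip(0.2 + 0.1 * exposure + cluster_shift, 0.01, 0.99)
y = rng.binomial(1, p).astype(float)
X = np.column_stack([np.ones_like(exposure), exposure])

beta = fit_poisson(X, y)
cov = cluster_sandwich_cov(X, y, beta, cluster_ids)
risk_ratio = np.exp(beta[1])
se_log_rr = np.sqrt(cov[1, 1])
print("risk ratio:", risk_ratio, "SE of log RR:", se_log_rr)
```

With a single binary exposure and no covariates, the fitted risk ratio is simply the ratio of the two observed proportions. Only the standard error changes when the scores are grouped by cluster.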
Sample Size Determination for GEE Analyses of Stepped Wedge Cluster Randomized Trials
In stepped wedge cluster randomized trials, intact clusters of individuals switch from control to intervention from a randomly assigned period onwards. Such trials are becoming increasingly popular in health services research. When a closed cohort is recruited from each cluster for longitudinal follow-up, proper sample size calculation should account for three distinct types of intraclass correlations: the within-period, the inter-period, and the within-individual correlations. Setting the latter two correlation parameters to be equal accommodates cross-sectional designs. We propose sample size procedures for continuous and binary responses within the framework of generalized estimating equations that employ a block exchangeable within-cluster correlation structure defined from the distinct correlation types. For continuous responses, we show that the intraclass correlations affect power only through two eigenvalues of the correlation matrix. We demonstrate that analytical power agrees well with simulated power for as few as eight clusters, when data are analyzed using bias-corrected estimating equations for the correlation parameters concurrently with a bias-corrected sandwich variance estimator.
Fast and accurate modelling of longitudinal and repeated measures neuroimaging data
Despite the growing importance of longitudinal data in neuroimaging, the standard analysis methods make restrictive or unrealistic assumptions (e.g., assumption of Compound Symmetry, the state of all equal variances and equal correlations, or spatially homogeneous longitudinal correlations). While some new methods have been proposed to more accurately account for such data, these methods are based on iterative algorithms that are slow and failure-prone. In this article, we propose the use of the Sandwich Estimator method which first estimates the parameters of interest with a simple Ordinary Least Squares model and second estimates variances/covariances with the “so-called” Sandwich Estimator (SwE) which accounts for the within-subject correlation existing in longitudinal data. Here, we introduce the SwE method in its classic form, and we review and propose several adjustments to improve its behaviour, specifically in small samples. We use intensive Monte Carlo simulations to compare all considered adjustments and isolate the best combination for neuroimaging data. We also compare the SwE method to other popular methods and demonstrate its strengths and weaknesses. Finally, we analyse a highly unbalanced longitudinal dataset from the Alzheimer's Disease Neuroimaging Initiative and demonstrate the flexibility of the SwE method to fit within- and between-subject effects in a single model. Software implementing this SwE method has been made freely available at http://warwick.ac.uk/tenichols/SwE.
• Standard neuroimaging longitudinal methods may lead to invalid results.
• The Sandwich Estimator (SwE) method is proposed as an alternative approach.
• Adjustments to the standard SwE method are reviewed and proposed.
• Monte Carlo simulations are used to isolate a good combination of adjustments.
• The SwE method is shown to be a fast, easy to specify and accurate approach.
When the sandwich makes you hesitate, replicate: on sampling variance estimation of multilevel models under complex sample design
Large-scale surveys routinely rely on complex sample designs, necessitating special consideration of sampling variance estimation in multilevel models (MLM). While the sandwich estimator is widely used for this purpose, its implementation, particularly regarding stratification and weighting, remains challenging. Alternatively, the lesser-known replication methods provide a valid alternative; but they are often misunderstood as being only suitable for single-level models and are not widely supported by software packages. This paper clarifies key aspects of implementing both methods under two-level MLM common in large-scale surveys. We provide practical guidance on incorporating sample weights, correctly identifying variance strata for sandwich estimation, and applying replication-based variance estimation in MLM. Two simulation studies evaluate the performance of each method under correct and incorrect specifications, including omission of informative level-1 weights. Results demonstrate that the sandwich estimator and replication methods yield comparable variance estimates when implemented correctly and highlight the consequences of common misapplications. An empirical example using TIMSS 2015 Australia data is used to illustrate these issues in practice. This work contributes to improved methodological soundness in multilevel modeling and calls for expanded software support for replication methods in MLM.
A Note on the Efficiency of Sandwich Covariance Matrix Estimation
The sandwich estimator, also known as robust covariance matrix estimator, heteroscedasticity-consistent covariance matrix estimate, or empirical covariance matrix estimator, has achieved increasing use in the econometric literature as well as with the growing popularity of generalized estimating equations. Its virtue is that it provides consistent estimates of the covariance matrix for parameter estimates even when the fitted parametric model fails to hold or is not even specified. Surprisingly though, there has been little discussion of properties of the sandwich method other than consistency. We investigate the sandwich estimator in quasi-likelihood models asymptotically, and in the linear case analytically. We show that under certain circumstances when the quasi-likelihood model is correct, the sandwich estimate is often far more variable than the usual parametric variance estimate. The increased variance is a fixed feature of the method and the price that one pays to obtain consistency even when the parametric model fails or when there is heteroscedasticity. We show that the additional variability directly affects the coverage probability of confidence intervals constructed from sandwich variance estimates. In fact, the use of sandwich variance estimates combined with t-distribution quantiles gives confidence intervals with coverage probability falling below the nominal value. We propose an adjustment to compensate for this fact.
Multilevel modelling of complex survey data
Multilevel modelling is sometimes used for data from complex surveys involving multistage sampling, unequal sampling probabilities and stratification. We consider generalized linear mixed models and particularly the case of dichotomous responses. A pseudolikelihood approach for accommodating inverse probability weights in multilevel models with an arbitrary number of levels is implemented by using adaptive quadrature. A sandwich estimator is used to obtain standard errors that account for stratification and clustering. When level 1 weights are used that vary between elementary units in clusters, the scaling of the weights becomes important. We point out that not only variance components but also regression coefficients can be severely biased when the response is dichotomous. The pseudolikelihood methodology is applied to complex survey data on reading proficiency from the American sample of the 'Program for international student assessment' 2000 study, using the Stata program gllamm which can estimate a wide range of multilevel and latent variable models. Performance of pseudo-maximum-likelihood with different methods for handling level 1 weights is investigated in a Monte Carlo experiment. Pseudo-maximum-likelihood estimators of (conditional) regression coefficients perform well for large cluster sizes but are biased for small cluster sizes. In contrast, estimators of marginal effects perform well in both situations. We conclude that caution must be exercised in pseudo-maximum-likelihood estimation for small cluster sizes when level 1 weights are used.