Catalogue Search | MBRL
Search Results
1,436 results for "Inference after model selection"
Inference in High-Dimensional Panel Models With an Application to Gun Control
By Belloni, Alexandre; Kozbur, Damian; Chernozhukov, Victor
In Clustered standard errors; Crime prevention; Firearm laws & regulations
2016
We consider estimation and inference in panel data models with additive unobserved individual specific heterogeneity in a high-dimensional setting. The setting allows the number of time-varying regressors to be larger than the sample size. To make informative estimation and inference feasible, we require that the overall contribution of the time-varying variables after eliminating the individual specific heterogeneity can be captured by a relatively small number of the available variables whose identities are unknown. This restriction allows the problem of estimation to proceed as a variable selection problem. Importantly, we treat the individual specific heterogeneity as fixed effects which allows this heterogeneity to be related to the observed time-varying variables in an unspecified way and allows that this heterogeneity may differ for all individuals. Within this framework, we provide procedures that give uniformly valid inference over a fixed subset of parameters in the canonical linear fixed effects model and over coefficients on a fixed vector of endogenous variables in panel data instrumental variable models with fixed effects and many instruments. We present simulation results in support of the theoretical developments and illustrate the use of the methods in an application aimed at estimating the effect of gun prevalence on crime rates.
Journal Article
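The abstract above describes estimating a single coefficient in a fixed-effects panel with many time-varying controls by treating estimation as a variable-selection problem. Below is a minimal, hedged sketch of that general idea (within-demeaning to remove the fixed effects, then Lasso-based double selection and a final OLS step). The simulated data, the use of scikit-learn's cross-validated Lasso in place of the paper's data-driven penalty, and all names are illustrative assumptions, not the authors' code.

```python
# Hedged sketch: within transformation + Lasso double selection in a panel.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(0)
N, T, p = 100, 5, 200                       # assumed panel dimensions
alpha_i = rng.normal(size=N)                # individual fixed effects
W = rng.normal(size=(N, T, p))              # many time-varying controls
d = W[:, :, 0] + 0.5 * W[:, :, 1] + rng.normal(size=(N, T))   # treatment, related to controls
y = 1.0 * d + W[:, :, 0] - 0.5 * W[:, :, 1] + alpha_i[:, None] + rng.normal(size=(N, T))

def within(a):
    """Subtract each unit's time average (removes the fixed effect)."""
    return a - a.mean(axis=1, keepdims=True)

yd = within(y).ravel()
dd = within(d).ravel()
Wd = within(W).reshape(N * T, p)

# Double selection: Lasso of y on controls and Lasso of d on controls, union of supports
sel_y = np.flatnonzero(LassoCV(cv=5).fit(Wd, yd).coef_)
sel_d = np.flatnonzero(LassoCV(cv=5).fit(Wd, dd).coef_)
keep = np.union1d(sel_y, sel_d)

# Final OLS of y on d plus the selected controls; the coefficient on d is the target
Z = np.column_stack([dd, Wd[:, keep]])
theta_hat = LinearRegression().fit(Z, yd).coef_[0]
print("post-double-selection estimate of the coefficient on d:", theta_hat)
```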
PROGRAM EVALUATION AND CAUSAL INFERENCE WITH HIGH-DIMENSIONAL DATA
2017
In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data-rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized control trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced-form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post-regularization and post-selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced-form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets. The results on program evaluation are obtained as a consequence of more general results on honest inference in a general moment-condition framework, which arises from structural equation models in econometrics. Here, too, the crucial ingredient is the use of orthogonal moment conditions, which can be constructed from the initial moment conditions. We provide results on honest inference for (function-valued) parameters within this general framework where any high-quality machine learning methods (e.g., boosted trees, deep neural networks, random forest, and their aggregated and hybrid versions) can be used to learn the nonparametric/high-dimensional components of the model. These include a number of supporting auxiliary results that are of major independent interest: namely, we (1) prove uniform validity of a multiplier bootstrap, (2) offer a uniformly valid functional delta method, and (3) provide results for sparsity-based estimation of regression functions for function-valued outcomes.
Journal Article
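The key ingredient emphasized in the abstract above is the use of orthogonal (doubly robust) moment conditions with machine-learning fits of the nuisance functions. The following is a minimal, hedged sketch of that idea for the ATE, using cross-fitting and the AIPW score; the simulated data, the random-forest learners, and every name are assumptions for illustration rather than the paper's implementation.

```python
# Hedged sketch: doubly robust / orthogonal-moment ATE estimate with cross-fitting.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
n, p = 2000, 20
X = rng.normal(size=(n, p))
propensity = 1 / (1 + np.exp(-X[:, 0]))            # assumed true propensity
D = rng.binomial(1, propensity)                     # treatment indicator
Y = 2.0 * D + X[:, 1] + rng.normal(size=n)          # outcome, true ATE = 2

psi = np.zeros(n)                                   # orthogonal score per observation
for train, test in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    # nuisance functions are fit on the training fold only (cross-fitting)
    m1 = RandomForestRegressor(n_estimators=100).fit(X[train][D[train] == 1], Y[train][D[train] == 1])
    m0 = RandomForestRegressor(n_estimators=100).fit(X[train][D[train] == 0], Y[train][D[train] == 0])
    e = RandomForestClassifier(n_estimators=100).fit(X[train], D[train])
    e_hat = np.clip(e.predict_proba(X[test])[:, 1], 0.01, 0.99)
    mu1, mu0 = m1.predict(X[test]), m0.predict(X[test])
    # AIPW (doubly robust) score evaluated on the held-out fold
    psi[test] = (mu1 - mu0
                 + D[test] * (Y[test] - mu1) / e_hat
                 - (1 - D[test]) * (Y[test] - mu0) / (1 - e_hat))

ate_hat = psi.mean()
se_hat = psi.std(ddof=1) / np.sqrt(n)
print(f"ATE estimate {ate_hat:.3f} +/- {1.96 * se_hat:.3f}")
```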
UNIFORMLY VALID POST-REGULARIZATION CONFIDENCE REGIONS FOR MANY FUNCTIONAL PARAMETERS IN Z-ESTIMATION FRAMEWORK
By Wei, Ying; Belloni, Alexandre; Chetverikov, Denis
In Band theory; Computer simulation; Data analysis
2018
In this paper, we develop procedures to construct simultaneous confidence bands for p̃ potentially infinite-dimensional parameters after model selection for general moment condition models where p̃ is potentially much larger than the sample size of available data, n. This allows us to cover settings with functional response data where each of the p̃ parameters is a function. The procedure is based on the construction of score functions that satisfy the Neyman orthogonality condition approximately. The proposed simultaneous confidence bands rely on uniform central limit theorems for high-dimensional vectors (and not on Donsker arguments as we allow for p̃ ≫ n). To construct the bands, we employ a multiplier bootstrap procedure which is computationally efficient as it only involves resampling the estimated score functions (and does not require resolving the high-dimensional optimization problems). We formally apply the general theory to inference on the regression coefficient process in the distribution regression model with a logistic link, where two implementations are analyzed in detail. Simulations and an application to real data are provided to help illustrate the applicability of the results.
Journal Article
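The computational point the abstract above stresses is that the simultaneous bands are calibrated by resampling only the estimated score functions with random multipliers. Here is a minimal, hedged sketch of such a Gaussian multiplier bootstrap; the "scores" are simulated stand-ins and nothing in the snippet reproduces the paper's actual construction.

```python
# Hedged sketch: Gaussian multiplier bootstrap for a simultaneous critical value.
import numpy as np

rng = np.random.default_rng(2)
n, p_tilde = 500, 50
# psi[i, j] plays the role of the estimated (orthogonalized) score of
# observation i for parameter j; here it is simply simulated noise.
psi = rng.normal(size=(n, p_tilde))
theta_hat = psi.mean(axis=0)                        # toy stand-ins for the estimators
sigma_hat = psi.std(axis=0, ddof=1)

B = 2000
max_stats = np.empty(B)
for b in range(B):
    xi = rng.normal(size=n)                         # one multiplier per observation
    boot = (xi[:, None] * (psi - theta_hat)).mean(axis=0)   # resampled score averages
    max_stats[b] = np.max(np.abs(np.sqrt(n) * boot / sigma_hat))

c_alpha = np.quantile(max_stats, 0.95)              # simultaneous 95% critical value
half_width = c_alpha * sigma_hat / np.sqrt(n)
lower, upper = theta_hat - half_width, theta_hat + half_width
print("simultaneous 95% band half-width (first 3 coordinates):", half_width[:3])
```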
Can One Estimate the Conditional Distribution of Post-Model-Selection Estimators?
2006
We consider the problem of estimating the conditional distribution of a post-model-selection estimator where the conditioning is on the selected model. The notion of a post-model-selection estimator here refers to the combined procedure resulting from first selecting a model (e.g., by a model selection criterion such as AIC or by a hypothesis testing procedure) and then estimating the parameters in the selected model (e.g., by least-squares or maximum likelihood), all based on the same data set. We show that it is impossible to estimate this distribution with reasonable accuracy even asymptotically. In particular, we show that no estimator for this distribution can be uniformly consistent (not even locally). This follows as a corollary to (local) minimax lower bounds on the performance of estimators for this distribution. Similar impossibility results are also obtained for the conditional distribution of linear functions (e.g., predictors) of the post-model-selection estimator.
Journal Article
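The impossibility result above concerns the distribution of the estimator conditional on which model was selected. As a purely illustrative, hedged Monte Carlo (an assumed two-regressor toy design, not taken from the paper), the snippet below shows how that conditional distribution shifts depending on whether a borderline regressor survives a pretest.

```python
# Hedged illustration: conditional behavior of a post-model-selection estimator.
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 10000
beta1, beta2 = 1.0, 0.15                 # beta2 near zero: selection is unstable
kept, dropped = [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.7 * x1 + np.sqrt(1 - 0.49) * rng.normal(size=n)   # correlated regressors
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    X_full = np.column_stack([x1, x2])
    b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
    resid = y - X_full @ b_full
    s2 = resid @ resid / (n - 2)
    se2 = np.sqrt(s2 * np.linalg.inv(X_full.T @ X_full)[1, 1])
    if abs(b_full[1] / se2) > 1.96:      # pretest: keep x2 only if significant
        kept.append(b_full[0])           # estimate of beta1 in the full model
    else:
        dropped.append((x1 @ y) / (x1 @ x1))   # refit without x2

print("mean of beta1-hat given x2 kept:   ", np.mean(kept))
print("mean of beta1-hat given x2 dropped:", np.mean(dropped))
```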
Post-model selection inference and model averaging
2011
Although model selection is routinely used in practice nowadays, little is known about its precise effects on any subsequent inference that is carried out. The same goes for the effects induced by the closely related technique of model averaging. This paper is concerned with the use of the same data first to select a model and then to carry out inference, in particular point estimation and point prediction. The properties of the resulting estimator, called a post-model-selection estimator (PMSE), are hard to derive. Using selection criteria such as hypothesis testing, AIC, BIC, HQ and Cp, we illustrate that, in terms of risk function, no single PMSE dominates the others. The same conclusion holds more generally for any penalised likelihood information criterion. We also compare various model averaging schemes and show that no single one dominates the others in terms of risk function. Since PMSEs can be regarded as a special case of model averaging, with 0-1 random weights, we propose a connection between the two theories, in the frequentist approach, by taking account of the selection procedure when performing model averaging. We illustrate the point by simulating a simple linear regression model.
Journal Article
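The claim above that no single post-model-selection estimator dominates in risk is easy to probe numerically. The following is a hedged, illustrative simulation (an assumed two-regressor design, not the paper's setup) comparing the risk of AIC- and BIC-based PMSEs for the coefficient of interest at two values of the nuisance coefficient.

```python
# Hedged sketch: Monte Carlo risk of AIC- vs BIC-based post-model-selection estimators.
import numpy as np

rng = np.random.default_rng(4)
n, reps = 50, 5000

def pmse_risk(beta2, criterion):
    """Monte Carlo risk of the estimator of beta1 after selecting via AIC or BIC."""
    penalty = 2.0 if criterion == "AIC" else np.log(n)
    losses = []
    for _ in range(reps):
        x1, x2 = rng.normal(size=n), rng.normal(size=n)
        x2 = 0.6 * x1 + 0.8 * x2                    # correlated regressors
        y = 1.0 * x1 + beta2 * x2 + rng.normal(size=n)
        X_full = np.column_stack([x1, x2])
        b_full, *_ = np.linalg.lstsq(X_full, y, rcond=None)
        rss_full = np.sum((y - X_full @ b_full) ** 2)
        b_small = (x1 @ y) / (x1 @ x1)
        rss_small = np.sum((y - b_small * x1) ** 2)
        # information criterion: n*log(RSS/n) + penalty * (number of regressors)
        ic_full = n * np.log(rss_full / n) + penalty * 2
        ic_small = n * np.log(rss_small / n) + penalty * 1
        b_hat = b_full[0] if ic_full < ic_small else b_small
        losses.append((b_hat - 1.0) ** 2)
    return np.mean(losses)

for beta2 in (0.0, 0.3):
    print(f"beta2={beta2}: AIC risk {pmse_risk(beta2, 'AIC'):.4f}, "
          f"BIC risk {pmse_risk(beta2, 'BIC'):.4f}")
```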
Variance Estimation in a Model With Gaussian Submodels
2005
This article considers the problem of estimating the dispersion parameter in a Gaussian model that is intermediate between a model where the mean parameter is fully known (fixed) and a model where the mean parameter is completely unknown. One of the goals is to understand the implications of the two-step process of first selecting a model among a finite number of submodels, then estimating a parameter of interest after the model selection, but using the same sample data. The estimators are classified into global, two-step, and weighted estimators. Whereas the global-type estimators ignore the model space structure, the two-step estimators explore the structure adaptively and can be related to pretest estimators, and the weighted estimators are motivated by the Bayesian approach. Their performances are compared theoretically and through simulations using their risk functions based on a scale-invariant quadratic loss function. It is shown that in the variance estimation problem, efficiency gains arise by exploiting the submodel structure through the use of two-step and weighted estimators, especially when the number of competing submodels is few, but that this advantage may deteriorate or be lost altogether for some two-step estimators as the number of submodels increases or the distance between them decreases. Furthermore, it is demonstrated that weighted estimators, arising from properly chosen priors, outperform two-step estimators when there are many competing submodels or when the submodels are close to each other, whereas two-step estimators are preferred when the submodels are highly distinguishable. The results have implications for model averaging and model selection issues.
Journal Article
ROCKET
By Kolar, Mladen; Barber, Rina Foygel
In Complex variables; Computer simulation; Confidence intervals
2018
Understanding complex relationships between random variables is of fundamental importance in high-dimensional statistics, with numerous applications in biological and social sciences. Undirected graphical models are often used to represent dependencies between random variables, where an edge between two random variables is drawn if they are conditionally dependent given all the other measured variables. A large body of literature exists on methods that estimate the structure of an undirected graphical model; however, little is known about the distributional properties of the estimators beyond the Gaussian setting. In this paper, we focus on inference for edge parameters in a high-dimensional transelliptical model, which generalizes Gaussian and nonparanormal graphical models. We propose ROCKET, a novel procedure for estimating parameters in the latent inverse covariance matrix. We establish asymptotic normality of ROCKET in an ultra high-dimensional setting under mild assumptions, without relying on oracle model selection results. ROCKET requires the same number of samples that are known to be necessary for obtaining a √n-consistent estimator of an element in the precision matrix under a Gaussian model. Hence, it is an optimal estimator under a much larger family of distributions. The result hinges on a tight control of the sparse spectral norm of the nonparametric Kendall’s tau estimator of the correlation matrix, which is of independent interest. Empirically, ROCKET outperforms the nonparanormal and Gaussian models in terms of achieving accurate inference on simulated data. We also compare the three methods on real data (daily stock returns), and find that the ROCKET estimator is the only method whose behavior across subsamples agrees with the distribution predicted by the theory.
Journal Article
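ROCKET builds on a rank-based estimate of the latent correlation matrix, which is invariant to unknown monotone marginal transformations. A minimal, hedged sketch of that building block (the sine-transformed Kendall's tau, not the full ROCKET procedure or its inference step) follows; the data, dimensions, and names are assumptions.

```python
# Hedged sketch: latent correlation via the sine transform of Kendall's tau.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(5)
n, d = 300, 5
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(d), np.arange(d)))  # AR(1) truth
Z = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
X = np.exp(Z)                        # monotone marginal transform: ranks are unchanged

tau = np.eye(d)
for j in range(d):
    for k in range(j + 1, d):
        tau[j, k] = tau[k, j] = kendalltau(X[:, j], X[:, k])[0]
R_hat = np.sin(np.pi / 2 * tau)      # latent correlation estimate, robust to the transform

print("true latent correlation (1,2):", Sigma[0, 1])
print("sine-transformed Kendall tau :", R_hat[0, 1])
```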
SPARSE MODELS AND METHODS FOR OPTIMAL INSTRUMENTS WITH AN APPLICATION TO EMINENT DOMAIN
2012
We develop results for the use of Lasso and post-Lasso methods to form first-stage predictions and estimate optimal instruments in linear instrumental variables (IV) models with many instruments, p. Our results apply even when p is much larger than the sample size, n. We show that the IV estimator based on using Lasso or post-Lasso in the first stage is root-n consistent and asymptotically normal when the first stage is approximately sparse, that is, when the conditional expectation of the endogenous variables given the instruments can be well-approximated by a relatively small set of variables whose identities may be unknown. We also show that the estimator is semiparametrically efficient when the structural error is homoscedastic. Notably, our results allow for imperfect model selection, and do not rely upon the unrealistic "beta-min" conditions that are widely used to establish validity of inference following model selection (see also Belloni, Chernozhukov, and Hansen (2011b)). In simulation experiments, the Lasso-based IV estimator with a data-driven penalty performs well compared to recently advocated many-instrument robust procedures. In an empirical example dealing with the effect of judicial eminent domain decisions on economic outcomes, the Lasso-based IV estimator outperforms an intuitive benchmark. Optimal instruments are conditional expectations. In developing the IV results, we establish a series of new results for Lasso and post-Lasso estimators of nonparametric conditional expectation functions which are of independent theoretical and practical interest. We construct a modification of Lasso designed to deal with non-Gaussian, heteroscedastic disturbances that uses a data-weighted 𝓁₁-penalty function. By innovatively using moderate deviation theory for self-normalized sums, we provide convergence rates for the resulting Lasso and post-Lasso estimators that are as sharp as the corresponding rates in the homoscedastic Gaussian case under the condition that log p = o(n^{1/3}). We also provide a data-driven method for choosing the penalty level that must be specified in obtaining Lasso and post-Lasso estimates and establish its asymptotic validity under non-Gaussian, heteroscedastic disturbances.
Journal Article
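The abstract above describes forming first-stage predictions with Lasso (or post-Lasso) and then using those predictions as estimated optimal instruments. Below is a minimal, hedged sketch of that pipeline on simulated data; the cross-validated penalty and every name in the snippet are assumptions standing in for the paper's heteroscedasticity-robust, data-driven penalty choice.

```python
# Hedged sketch: Lasso/post-Lasso first stage, then a simple IV second stage.
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(6)
n, p = 500, 200
Z = rng.normal(size=(n, p))                         # many instruments
pi = np.zeros(p)
pi[:4] = 0.8                                        # assumed sparse first stage
u = rng.normal(size=n)                              # structural error
d = Z @ pi + 0.7 * u + rng.normal(size=n)           # endogenous regressor
y = 1.5 * d + u                                     # true structural coefficient 1.5

# First stage: Lasso selects instruments, then OLS on the selected ones (post-Lasso)
sel = np.flatnonzero(LassoCV(cv=5).fit(Z, d).coef_)
d_hat = LinearRegression().fit(Z[:, sel], d).predict(Z[:, sel])

# Second stage: IV estimator using d_hat as the estimated optimal instrument
beta_iv = (d_hat @ y) / (d_hat @ d)
print("Lasso-based IV estimate:", beta_iv)
```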
VALID CONFIDENCE INTERVALS FOR POST-MODEL-SELECTION PREDICTORS
By Pötscher, Benedikt M.; Bachoc, François; Leeb, Hannes
In Confidence intervals; Interval arithmetic; Mathematical models
2019
We consider inference post-model-selection in linear regression. In this setting, Berk et al. [Ann. Statist. 41 (2013a) 802–837] recently introduced a class of confidence sets, the so-called PoSI intervals, that cover a certain non-standard quantity of interest with a user-specified minimal coverage probability, irrespective of the model selection procedure that is being used. In this paper, we generalize the PoSI intervals to confidence intervals for post-model-selection predictors.
Journal Article
A nondegenerate Vuong test and post selection confidence intervals for semi/nonparametric models
2020
This paper proposes a new model selection test for the statistical comparison of semi/nonparametric models based on a general quasi-likelihood ratio criterion. An important feature of the new test is its uniformly exact asymptotic size in the overlapping nonnested case, as well as in the easier nested and strictly nonnested cases. The uniform size control is achieved without using pretesting, sample-splitting, or simulated critical values. We also show that the test has nontrivial power against all √n-local alternatives and against some local alternatives that converge to the null faster than √n. Finally, we provide a framework for conducting uniformly valid post-model-selection inference for model parameters. The finite sample performance of the nondegenerate test and that of the post-model-selection inference procedure are illustrated in a mean-regression example by Monte Carlo.
Journal Article