Catalogue Search | MBRL
1,017 result(s) for "Simultaneous inference"
ONLY CLOSED TESTING PROCEDURES ARE ADMISSIBLE FOR CONTROLLING FALSE DISCOVERY PROPORTIONS
2021
We consider the class of all multiple testing methods controlling tail probabilities of the false discovery proportion, either for one random set or simultaneously for many such sets. This class encompasses methods controlling familywise error rate, generalized familywise error rate, false discovery exceedance, joint error rate, simultaneous control of all false discovery proportions, and others, as well as gene set testing in genomics and cluster inference in neuroimaging. We show that all such methods are either equivalent to a closed testing procedure, or are uniformly improved by one. Moreover, we show that a closed testing method is admissible if and only if all its local tests are admissible. This implies that, when designing methods, it is sufficient to restrict attention to closed testing. We demonstrate the practical usefulness of this design principle by obtaining more informative inferences from the method of higher criticism, and by constructing a uniform improvement of a recently proposed method.
Journal Article
Simultaneous Inference for High-Dimensional Linear Models
2017
This article proposes a bootstrap-assisted procedure to conduct simultaneous inference for high-dimensional sparse linear models based on the recent desparsifying Lasso estimator. Our procedure allows the dimension of the parameter vector of interest to be exponentially larger than sample size, and it automatically accounts for the dependence within the desparsifying Lasso estimator. Moreover, our simultaneous testing method can be naturally coupled with the margin screening to enhance its power in sparse testing with a reduced computational cost, or with the step-down method to provide a strong control for the family-wise error rate. In theory, we prove that our simultaneous testing procedure asymptotically achieves the prespecified significance level, and enjoys certain optimality in terms of its power even when the model errors are non-Gaussian. Our general theory is also useful in studying the support recovery problem. To broaden the applicability, we further extend our main results to generalized linear models with convex loss functions. The effectiveness of our methods is demonstrated via simulation studies. Supplementary materials for this article are available online.
Journal Article
GAUSSIAN APPROXIMATION FOR HIGH DIMENSIONAL TIME SERIES
2017
We consider the problem of approximating sums of high dimensional stationary time series by Gaussian vectors, using the framework of functional dependence measure. The validity of the Gaussian approximation depends on the sample size n, the dimension p, the moment condition and the dependence of the underlying processes. We also consider an estimator for long-run covariance matrices and study its convergence properties. Our results allow constructing simultaneous confidence intervals for mean vectors of high-dimensional time series with asymptotically correct coverage probabilities. As an application, we propose a Kolmogorov–Smirnov-type statistic for testing distributions of high-dimensional time series.
Journal Article
LASSO-DRIVEN INFERENCE IN TIME AND SPACE
by Huang, Chen; Wang, Weining; Chernozhukov, Victor
in Bootstrap method, Estimating techniques, Inference
2021
We consider the estimation and inference in a system of high-dimensional regression equations allowing for temporal and cross-sectional dependency in covariates and error processes, covering rather general forms of weak temporal dependence. A sequence of regressions with many regressors using LASSO (Least Absolute Shrinkage and Selection Operator) is applied for variable selection purpose, and an overall penalty level is carefully chosen by a block multiplier bootstrap procedure to account for multiplicity of the equations and dependencies in the data. Correspondingly, oracle properties with a jointly selected tuning parameter are derived. We further provide high-quality de-biased simultaneous inference on the many target parameters of the system. We provide bootstrap consistency results of the test procedure, which are based on a general Bahadur representation for the Z-estimators with dependent data. Simulations demonstrate good performance of the proposed inference procedure. Finally, we apply the method to quantify spillover effects of textual sentiment indices in a financial market and to test the connectedness among sectors.
Journal Article
Extremal Depth for Functional Data and Applications
2016
We propose a new notion called "extremal depth" (ED) for functional data, discuss its properties, and compare its performance with existing concepts. The proposed notion is based on a measure of extreme "outlyingness." ED has several desirable properties that are not shared by other notions and is especially well suited for obtaining central regions of functional data and function spaces. In particular: (a) the central region achieves the nominal (desired) simultaneous coverage probability; (b) there is a correspondence between ED-based (simultaneous) central regions and appropriate pointwise central regions; and (c) the method is resistant to certain classes of functional outliers. The article examines the performance of ED and compares it with other depth notions. Its usefulness is demonstrated through applications to constructing central regions, functional boxplots, outlier detection, and simultaneous confidence bands in regression problems. Supplementary materials for this article are available online.
Journal Article
A general framework for multiple testing dependence
2008
We develop a general framework for performing large-scale significance testing in the presence of arbitrarily strong dependence. We derive a low-dimensional set of random vectors, called a dependence kernel, that fully captures the dependence structure in an observed high-dimensional dataset. This result shows a surprising reversal of the "curse of dimensionality" in the high-dimensional hypothesis testing setting. We show theoretically that conditioning on a dependence kernel is sufficient to render statistical tests independent regardless of the level of dependence in the observed data. This framework for multiple testing dependence has implications in a variety of common multiple testing problems, such as in gene expression studies, brain imaging, and spatial epidemiology.
Journal Article
VALID POST-SELECTION INFERENCE IN MODEL-FREE LINEAR REGRESSION
by Buja, Andreas; Kuchibhotla, Arun K.; Brown, Lawrence D.
in Asymptotic properties, Computational efficiency, Confidence
2020
Modern data-driven approaches to modeling make extensive use of covariate/model selection. Such selection incurs a cost: it invalidates classical statistical inference. A conservative remedy to the problem was proposed by Berk et al. (Ann. Statist. 41 (2013) 802–837) and further extended by Bachoc, Preinerstorfer and Steinberger (2016). These proposals, labeled “PoSI methods,” provide valid inference after arbitrary model selection. They are computationally NP-hard and have limitations in their theoretical justifications. We therefore propose computationally efficient confidence regions, named “UPoSI,” and prove large-p asymptotics for them. We do this for linear OLS regression allowing misspecification of the normal linear model, for both fixed and random covariates, and for independent as well as some types of dependent data. We start by proving a general equivalence result for the post-selection inference problem and a simultaneous inference problem in a setting that strips inessential features still present in a related result of Berk et al. (Ann. Statist. 41 (2013) 802–837). We then construct valid PoSI confidence regions that are the first to have vastly improved computational efficiency in that the required computation times grow only quadratically rather than exponentially with the total number p of covariates. These are also the first PoSI confidence regions with guaranteed asymptotic validity when the total number of covariates p diverges (almost exponentially) with the sample size n. Under standard tail assumptions, we only require (log p)⁷ = o(n) and k = o(n / log p), where k (≤ p) is the largest number of covariates (model size) considered for selection. We study various properties of these confidence regions, including their Lebesgue measures, and compare them theoretically with those proposed previously.
Journal Article
Simultaneous confidence bands for multiple comparisons of several percentile lines
2025
In practice, it is often necessary to compare several percentile lines. To that end, a set of simultaneous confidence bands has been constructed. The contributions of this research are as follows: (1) the proposed bands are constructed and used for multiple comparisons of several percentile lines for the first time; (2) they allow various comparisons to be drawn: pairwise, successive and many-to-one; and (3) the comparisons can be drawn on any intervals of interest, and provide more information on both the magnitude and the direction of the difference. In addition, practical applications are presented.
Journal Article
Functional Data Analysis for Sparse Longitudinal Data
by Wang, Jane-Ling; Yao, Fang; Müller, Hans-Georg
in Acquired immune deficiency syndrome, AIDS, Analysis
2005
We propose a nonparametric method to perform functional principal components analysis for the case of sparse longitudinal data. The method aims at irregularly spaced longitudinal data, where the number of repeated measurements available per subject is small. In contrast, classical functional data analysis requires a large number of regularly spaced measurements per subject. We assume that the repeated measurements are located randomly with a random number of repetitions for each subject and are determined by an underlying smooth random (subject-specific) trajectory plus measurement errors. Basic elements of our approach are the parsimonious estimation of the covariance structure and mean function of the trajectories, and the estimation of the variance of the measurement errors. The eigenfunction basis is estimated from the data, and functional principal components score estimates are obtained by a conditioning step. This conditional estimation method is conceptually simple and straightforward to implement. A key step is the derivation of asymptotic consistency and distribution results under mild conditions, using tools from functional analysis. Functional data analysis for sparse longitudinal data enables prediction of individual smooth trajectories even if only one or few measurements are available for a subject. Asymptotic pointwise and simultaneous confidence bands are obtained for predicted individual trajectories, based on asymptotic distributions, for simultaneous bands under the assumption of a finite number of components. Model selection techniques, such as the Akaike information criterion, are used to choose the model dimension corresponding to the number of eigenfunctions in the model. The methods are illustrated with a simulation study, longitudinal CD4 data for a sample of AIDS patients, and time-course gene expression data for the yeast cell cycle.
Journal Article
A Bayesian credible subgroups approach to identifying patient subgroups with positive treatment effects
by Offen, Walter W.; Schnell, Patrick M.; Carlin, Bradley P.
in Alzheimer disease, Alzheimer Disease - drug therapy, Alzheimer's disease
2016
Many new experimental treatments benefit only a subset of the population. Identifying the baseline covariate profiles of patients who benefit from such a treatment, rather than determining whether or not the treatment has a population-level effect, can substantially lessen the risk in undertaking a clinical trial and expose fewer patients to treatments that do not benefit them. The standard analyses for identifying patient subgroups that benefit from an experimental treatment either do not account for multiplicity, or focus on testing for the presence of treatment-covariate interactions rather than the resulting individualized treatment effects. We propose a Bayesian credible subgroups method to identify two bounding subgroups for the benefiting subgroup: one for which it is likely that all members simultaneously have a treatment effect exceeding a specified threshold, and another for which it is likely that no members do. We examine frequentist properties of the credible subgroups method via simulations and illustrate the approach using data from an Alzheimer's disease treatment trial. We conclude with a discussion of the advantages and limitations of this approach to identifying patients for whom the treatment is beneficial.
Journal Article