131,888 results for "Confidence interval"
EXACT POST-SELECTION INFERENCE, WITH APPLICATION TO THE LASSO
We develop a general approach to valid inference after model selection. At the core of our framework is a result that characterizes the distribution of a post-selection estimator conditioned on the selection event. We specialize the approach to model selection by the lasso to form valid confidence intervals for the selected coefficients and test whether all relevant variables have been included in the model.
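The problem the abstract targets can be seen in a small sketch. This is not the paper's exact conditional method; it uses data splitting, a simpler alternative that also yields valid post-selection inference, on entirely simulated null data (all names hypothetical):

```python
import math
import random

random.seed(0)

def norm_cdf(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def corr(xs, ys):
    # Pearson correlation of two equal-length sequences
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    sy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (sx * sy)

# Null data: 5 candidate features, none actually related to y.
n, p = 200, 5
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [random.gauss(0, 1) for _ in range(n)]

# Select the most correlated feature on the first half of the data ...
half = n // 2
sel = max(range(p),
          key=lambda j: abs(corr([row[j] for row in X[:half]], y[:half])))

# ... and test it on the held-out half, so the selection event is
# independent of the data used for inference.
r = corr([row[sel] for row in X[half:]], y[half:])
z = math.atanh(r) * math.sqrt(half - 3)   # Fisher transformation
p_value = 2.0 * (1.0 - norm_cdf(abs(z)))
print(f"selected feature {sel}, honest p-value = {p_value:.3f}")
```

Testing the selected feature on the same data used to select it would bias the p-value toward zero; the paper's contribution is to avoid both that bias and the sample-size cost of splitting, by conditioning on the selection event directly.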
Estimation methods for the variance of Birnbaum-Saunders distribution containing zero values with application to wind speed data in Thailand
Thailand is currently grappling with a severe problem of air pollution, especially from small particulate matter (PM), which poses considerable threats to public health. Wind speed is pivotal in spreading these harmful particles across the atmosphere. Given the inherently unpredictable behavior of wind speed, our focus lies in establishing the confidence interval (CI) for the variance of wind speed data. To achieve this, we employ the delta-Birnbaum-Saunders (delta-BirSau) distribution. This statistical model allows for analyzing wind speed data and offers valuable insights into its variability and potential implications for air quality. The intervals are derived from ten different methods built on four approaches: generalized confidence interval (GCI), bootstrap confidence interval (BCI), generalized fiducial confidence interval (GFCI), and normal approximation (NA). Specifically, GCI, BCI, and GFCI are each applied with the proportion of zeros estimated by the variance-stabilized transformation (VST), Wilson, and Hannig methods. To evaluate the performance of these methods, we conduct a simulation study using Monte Carlo simulations in the R statistical software. The study assesses the coverage probabilities and average widths of the proposed confidence intervals. The simulation results reveal that GFCI based on the Wilson method is optimal for small sample sizes, GFCI based on the Hannig method excels for medium sample sizes, and GFCI based on the VST method stands out for large sample sizes. To further validate the practical application of these methods, we employ daily wind speed data from an industrial area in Prachin Buri and Rayong provinces, Thailand.
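Of the four approaches above, the bootstrap is the easiest to sketch. The following is a minimal percentile-bootstrap CI for a variance, on hypothetical zero-inflated "wind speed" data standing in for the delta-type samples the abstract describes (not the paper's delta-BirSau machinery):

```python
import random
import statistics

random.seed(42)

# Hypothetical daily wind speeds (m/s): mostly positive values with
# some exact zeros, mimicking a zero-inflated (delta-type) sample.
speeds = [0.0 if random.random() < 0.1 else random.lognormvariate(1.0, 0.5)
          for _ in range(150)]

def percentile_ci(data, stat, n_boot=2000, alpha=0.05):
    """Percentile bootstrap: resample with replacement, take the
    empirical alpha/2 and 1-alpha/2 quantiles of the statistic."""
    n = len(data)
    boot = sorted(stat([random.choice(data) for _ in range(n)])
                  for _ in range(n_boot))
    return boot[int(alpha / 2 * n_boot)], boot[int((1 - alpha / 2) * n_boot) - 1]

point = statistics.pvariance(speeds)
lo, hi = percentile_ci(speeds, statistics.pvariance)
print(f"variance = {point:.3f}, 95% bootstrap CI = ({lo:.3f}, {hi:.3f})")
```

The generalized and fiducial intervals the paper compares replace this resampling step with pivotal quantities derived from the assumed delta-BirSau model.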
The Bayesian New Statistics: Hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective
In the practice of data analysis, there is a conceptual distinction between hypothesis testing, on the one hand, and estimation with quantified uncertainty on the other. Among frequentists in psychology, a shift of emphasis from hypothesis testing to estimation has been dubbed “the New Statistics” (Cumming, 2014). A second conceptual distinction is between frequentist methods and Bayesian methods. Our main goal in this article is to explain how Bayesian methods achieve the goals of the New Statistics better than frequentist methods. The article reviews frequentist and Bayesian approaches to hypothesis testing and to estimation with confidence or credible intervals. The article also describes Bayesian approaches to meta-analysis, randomized controlled trials, and power analysis.
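The credible intervals the abstract contrasts with confidence intervals can be illustrated with the simplest conjugate case: a Beta posterior for a proportion, summarized by Monte Carlo quantiles (all numbers here are invented for illustration):

```python
import random

random.seed(1)

# Hypothetical data: 18 successes in 25 trials, uniform Beta(1, 1) prior.
successes, trials = 18, 25
a_post, b_post = 1 + successes, 1 + (trials - successes)  # Beta posterior

# Draws from the posterior give an equal-tailed 95% credible interval:
n_draws = 20_000
draws = sorted(random.betavariate(a_post, b_post) for _ in range(n_draws))
lo, hi = draws[int(0.025 * n_draws)], draws[int(0.975 * n_draws)]
print(f"95% credible interval for the proportion: ({lo:.3f}, {hi:.3f})")
```

Unlike a frequentist confidence interval, this interval admits the direct reading the authors advocate: given the prior and the model, the parameter lies in it with 95% posterior probability.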
Twelve-Year Analysis of NO2 Concentration Measurements at Belisario Station (Quito, Ecuador) Using Statistical Inference Techniques
In this paper, a robust analysis of nitrogen dioxide (NO2) concentration measurements taken at Belisario station (Quito, Ecuador) was performed. The data used for the analysis constitute a set of measurements taken from 1 January 2008 to 31 December 2019. Furthermore, the analysis was carried out in a robust way, defining variables that represent years, months, days and hours, and classifying these variables based on estimates of the central tendency and dispersion of the data. The estimators used here were classic, nonparametric, based on a bootstrap method, and robust. Additionally, confidence intervals based on these estimators were built, and these intervals were used to categorize the variables under study. The results of this research showed that the NO2 concentration at Belisario station is not harmful to humans. Moreover, it was shown that this concentration tends to be stable across the years, changes slightly during the days of the week, and varies greatly when analyzed by months and hours of the day. Here, the precision provided by both nonparametric and robust statistical methods served to comprehensively demonstrate the aforementioned findings. Finally, it can be concluded that the city of Quito is progressing on the right path in terms of improving air quality, because it has been shown that there is a decreasing tendency in the NO2 concentration across the years. In addition, according to the Quito Air Quality Index, most of the observations are in either the desirable level or acceptable level of air pollution, and the number of observations that are in the desirable level of air pollution increases across the years.
Exact Post-Selection Inference for Sequential Regression Procedures
We propose new inference tools for forward stepwise regression, least angle regression, and the lasso. Assuming a Gaussian model for the observation vector y, we first describe a general scheme to perform valid inference after any selection event that can be characterized as y falling into a polyhedral set. This framework allows us to derive conditional (post-selection) hypothesis tests at any step of forward stepwise or least angle regression, or any step along the lasso regularization path, because, as it turns out, selection events for these procedures can be expressed as polyhedral constraints on y. The p-values associated with these tests are exactly uniform under the null distribution, in finite samples, yielding exact Type I error control. The tests can also be inverted to produce confidence intervals for appropriate underlying regression parameters. The R package selectiveInference, freely available on the CRAN repository, implements the new inference tools described in this article. Supplementary materials for this article are available online.
Confidence intervals for low dimensional parameters in high dimensional linear models
The purpose of this paper is to propose methodologies for statistical inference of low dimensional parameters with high dimensional data. We focus on constructing confidence intervals for individual coefficients and linear combinations of several of them in a linear regression model, although our ideas are applicable in a much broader context. The theoretical results that are presented provide sufficient conditions for the asymptotic normality of the proposed estimators along with a consistent estimator for their finite dimensional covariance matrices. These sufficient conditions allow the number of variables to exceed the sample size and the presence of many small non‐zero coefficients. Our methods and theory apply to interval estimation of a preconceived regression coefficient or contrast as well as simultaneous interval estimation of many regression coefficients. Moreover, the method proposed turns the regression data into an approximate Gaussian sequence of point estimators of individual regression coefficients, which can be used to select variables after proper thresholding. The simulation results that are presented demonstrate the accuracy of the coverage probability of the confidence intervals proposed as well as other desirable properties, strongly supporting the theoretical results.
Cronbach's alpha reliability: Interval estimation, hypothesis testing, and sample size planning
Cronbach’s alpha is one of the most widely used measures of reliability in the social and organizational sciences. Current practice is to report the sample value of Cronbach’s alpha reliability, but a confidence interval for the population reliability value also should be reported. The traditional confidence interval for the population value of Cronbach’s alpha makes an unnecessarily restrictive assumption that the multiple measurements have equal variances and equal covariances. We propose a confidence interval that does not require equal variances or equal covariances. The results of a simulation study demonstrated that the proposed method performed better than alternative methods. We also present some sample size formulas that approximate the sample size requirements for desired power or desired confidence interval precision. R functions are provided that can be used to implement the proposed confidence interval and sample size methods.
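The point estimate the abstract builds on is a short computation. Below is a minimal, hedged sketch of sample Cronbach's alpha in plain Python (the paper's R functions additionally provide the proposed confidence interval and sample-size planning, which are not reproduced here); the item scores are invented for illustration:

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, each of the same length
    (one entry per respondent). Returns sample Cronbach's alpha:
    k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = len(items)
    totals = [sum(col) for col in zip(*items)]
    item_var_sum = sum(statistics.variance(it) for it in items)
    return k / (k - 1) * (1 - item_var_sum / statistics.variance(totals))

# Hypothetical 4-item scale answered by 6 respondents:
items = [
    [3, 4, 3, 5, 4, 2],
    [3, 5, 3, 4, 4, 2],
    [2, 4, 4, 5, 3, 3],
    [3, 4, 3, 5, 5, 2],
]
alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

The traditional interval criticized in the abstract would wrap this statistic in an F-distribution bound that assumes equal item variances and covariances; the authors' interval drops that assumption.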
High-Dimensional Inference: Confidence Intervals, p-Values and R-Software hdi
We present a (selective) review of recent frequentist high-dimensional inference methods for constructing p-values and confidence intervals in linear and generalized linear models. We include a broad, comparative empirical study which complements the viewpoint from statistical methodology and theory. Furthermore, we introduce and illustrate the R-package hdi which easily allows the use of different methods and supports reproducibility.
Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
Many scientific and engineering challenges, ranging from personalized medicine to customized marketing recommendations, require an understanding of treatment effect heterogeneity. In this article, we develop a nonparametric causal forest for estimating heterogeneous treatment effects that extends Breiman's widely used random forest algorithm. In the potential outcomes framework with unconfoundedness, we show that causal forests are pointwise consistent for the true treatment effect and have an asymptotically Gaussian and centered sampling distribution. We also discuss a practical method for constructing asymptotic confidence intervals for the true treatment effect that are centered at the causal forest estimates. Our theoretical results rely on a generic Gaussian theory for a large family of random forest algorithms. To our knowledge, this is the first set of results that allows any type of random forest, including classification and regression forests, to be used for provably valid statistical inference. In experiments, we find causal forests to be substantially more powerful than classical methods based on nearest-neighbor matching, especially in the presence of irrelevant covariates.
SEMI-SUPERVISED INFERENCE
We propose a general semi-supervised inference framework focused on the estimation of the population mean. As usual in semi-supervised settings, there exists an unlabeled sample of covariate vectors and a labeled sample consisting of covariate vectors along with real-valued responses (“labels”). Otherwise, the formulation is “assumption-lean” in that no major conditions are imposed on the statistical or functional form of the data. We consider both the ideal semi-supervised setting where infinitely many unlabeled samples are available, as well as the ordinary semi-supervised setting in which only a finite number of unlabeled samples is available. Estimators are proposed along with corresponding confidence intervals for the population mean. Theoretical analyses of both the asymptotic distribution and the ℓ₂-risk of the proposed procedures are given. Surprisingly, the proposed estimators, based on a simple form of the least squares method, outperform the ordinary sample mean. The simple, transparent form of the estimator lends confidence to the perception that its asymptotic improvement over the ordinary sample mean also nearly holds even for moderate size samples. The method is further extended to a nonparametric setting, in which the oracle rate can be achieved asymptotically. The proposed estimators are further illustrated by simulation studies and a real data example involving estimation of the homeless population.
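The least-squares idea behind such estimators can be sketched on simulated data (this is a hedged toy version of regression-adjusted mean estimation, not the paper's exact procedure): fit a slope on the small labeled sample, then correct the labeled mean of y by how far the labeled x-mean sits from the x-mean of the much larger pooled sample.

```python
import random
import statistics

random.seed(7)

# Toy setup: small labeled sample (x, y), large unlabeled sample of x.
# True model: y = 2 + 1.5 x + noise, x ~ N(0, 1), so E[y] = 2.
labeled_x = [random.gauss(0, 1) for _ in range(100)]
labeled_y = [2.0 + 1.5 * x + random.gauss(0, 0.5) for x in labeled_x]
unlabeled_x = [random.gauss(0, 1) for _ in range(10_000)]

# Least-squares slope on the labeled data ...
mx, my = statistics.fmean(labeled_x), statistics.fmean(labeled_y)
beta_hat = (sum((x - mx) * (y - my) for x, y in zip(labeled_x, labeled_y))
            / sum((x - mx) ** 2 for x in labeled_x))

# ... then shift the labeled y-mean toward the pooled x-mean, which the
# unlabeled data estimate far more precisely than the labeled data alone.
x_pool = statistics.fmean(labeled_x + unlabeled_x)
semi_sup = my + beta_hat * (x_pool - mx)
print(f"sample mean = {my:.3f}, semi-supervised estimate = {semi_sup:.3f}")
```

When x and y are correlated, the adjustment removes the part of the labeled sample mean's error attributable to an unlucky draw of covariates, which is the intuition behind the improvement over the ordinary sample mean reported in the abstract.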