Catalogue Search | MBRL

Confidence intervals in generalized regression models

by Uusipaikka, Esa I in Regression analysis. , Confidence intervals. , Linear models (Statistics)

Book

BEST SUBSET SELECTION VIA A MODERN OPTIMIZATION LENS

by Bertsimas, Dimitris , Mazumder, Rahul , King, Angela in 62G35 , 62J05 , 62J07

2016

In the period 1991-2015, algorithmic advances in Mixed Integer Optimization (MIO) coupled with hardware improvements have resulted in an astonishing 450 billion factor speedup in solving MIO problems. We present a MIO approach for solving the classical best subset selection problem of choosing k out of p features in linear regression given n observations. We develop a discrete extension of modern first-order continuous optimization methods to find high quality feasible solutions that we use as warm starts to a MIO solver that finds provably optimal solutions. The resulting algorithm (a) provides a solution with a guarantee on its suboptimality even if we terminate the algorithm early, (b) can accommodate side constraints on the coefficients of the linear regression and (c) extends to finding best subset solutions for the least absolute deviation loss function. Using a wide variety of synthetic and real datasets, we demonstrate that our approach solves problems with n in the 1000s and p in the 100s in minutes to provable optimality, and finds near optimal solutions for n in the 100s and p in the 1000s in minutes. We also establish via numerical experiments that the MIO approach performs better than Lasso and other popularly used sparse learning procedures, in terms of achieving sparse solutions with good predictive power.

Journal Article

The coordinate-free approach to linear models

by Wichura, Michael J. (Michael John) author in Linear models (Statistics) , Analysis of variance , Regression analysis

2006

Book

Applications of Gene Expression Programming and Regression Techniques for Estimating Compressive Strength of Bagasse Ash based Concrete

by Javed, Muhammad Faisal , Alyousef, Rayed , Khan, Kaffayatullah in Artificial intelligence , Ashes , Bagasse

2020

Compressive strength is one of the important properties of concrete and depends on many factors. Most of the concrete compressive strength predictive models mainly rely on available literature data, which are too simple to consider all the contributing factors. This study adopted a new approach to predict the compressive strength of sugarcane bagasse ash concrete (SCBAC). A vast amount of data from the literature study and fifteen laboratory tested concrete samples with different dosages of bagasse ash were respectively used to calibrate and validate the models. The novel Gene Expression Programming, Multiple Linear Regression and Multiple Non-Linear Regression were used to model SCBAC compressive strength. The water cement ratio, bagasse ash percent replacement, quantity of fine and coarse aggregate and cement content were used as inputs for model development. Various statistical indicators, i.e., NSE, R2 and RMSE, were used to assess the performance of the models. The results indicated a strong correlation between observed and predicted values with NSE and R2 both above 0.8 during calibration and validation for the Gene Expression Programming (GEP). The outcomes from GEP outclassed all the models to predict SCBAC compressive strength. The validity of the model is further verified using data of fifteen tests conducted in the laboratory. Moreover, the cement content in the mix was revealed as the most sensitive parameter, followed by water cement ratio, from sensitivity analysis. The GEP fulfilled all the criteria for external validity. The simple formulae derived in this study could be used reliably for the prediction of SCBAC compressive strength.

Journal Article

Applied linear statistical models

by Neter, John author , Kutner, Michael H. author , Nachtsheim, Chris author in Regression analysis , Analysis of variance , Experimental design

1996

Book

Estimating Above-Ground Biomass of Maize Using Features Derived from UAV-Based RGB Imagery

by Zhang, Liyuan , Peng, Xingshuo , Niu, Yaxiao in Agricultural management , Agricultural production , Agriculture

2019

The rapid, accurate, and economical estimation of crop above-ground biomass at the farm scale is crucial for precision agricultural management. The unmanned aerial vehicle (UAV) remote-sensing system has a great application potential with the ability to obtain remote-sensing imagery with high temporal-spatial resolution. To verify the application potential of consumer-grade UAV RGB imagery in estimating maize above-ground biomass, vegetation indices and plant height derived from UAV RGB imagery were adopted. To obtain a more accurate observation, plant height was directly derived from UAV RGB point clouds. To search for the optimal estimation method, the estimation performances of the models based on vegetation indices alone, based on plant height alone, and based on both vegetation indices and plant height were compared. The results showed that plant height directly derived from UAV RGB point clouds had a high correlation with ground-truth data with an R2 value of 0.90 and an RMSE value of 0.12 m. The above-ground biomass exponential regression models based on plant height alone had higher correlations for both fresh and dry above-ground biomass with R2 values of 0.77 and 0.76, respectively, compared to the linear regression model (both R2 values were 0.59). The vegetation indices derived from UAV RGB imagery had great potential to estimate maize above-ground biomass with R2 values ranging from 0.63 to 0.73. When estimating the above-ground biomass of maize by using multivariable linear regression based on vegetation indices, a higher correlation was obtained with an R2 value of 0.82. There was no significant improvement of the estimation performance when plant height derived from UAV RGB imagery was added into the multivariable linear regression model based on vegetation indices. When estimating crop above-ground biomass based on a UAV RGB remote-sensing system alone, looking for optimized vegetation indices and establishing estimation models with high performance based on advanced algorithms (e.g., machine learning technology) may be a better way.

Journal Article

Interaction effects in linear and generalized linear models : examples and applications using Stata

by Kaufman, Robert L., author in Stata. , Regression analysis. , Regression analysis Data processing.

Book

Optimal Bandwidth Choice for the Regression Discontinuity Estimator

by IMBENS, GUIDO , KALYANARAMAN, KARTHIK in Algorithms , Approximation , Asymptotic methods

2012

We investigate the choice of the bandwidth for the regression discontinuity estimator. We focus on estimation by local linear regression, which was shown to have attractive properties (Porter, J. 2003, "Estimation in the Regression Discontinuity Model" (unpublished, Department of Economics, University of Wisconsin, Madison)). We derive the asymptotically optimal bandwidth under squared error loss. This optimal bandwidth depends on unknown functionals of the distribution of the data and we propose simple and consistent estimators for these functionals to obtain a fully data-driven bandwidth algorithm. We show that this bandwidth estimator is optimal according to the criterion of Li (1987, "Asymptotic Optimality for C_p, C_L, Cross-validation and Generalized Cross-validation: Discrete Index Set", Annals of Statistics, 15, 958–975), although it is not unique in the sense that alternative consistent estimators for the unknown functionals would lead to bandwidth estimators with the same optimality properties. We illustrate the proposed bandwidth, and the sensitivity to the choices made in our algorithm, by applying the methods to a data set previously analysed by Lee (2008, "Randomized Experiments from Non-random Selection in U.S. House Elections", Journal of Econometrics, 142, 675–697) as well as by conducting a small simulation study.

Journal Article

Regression analysis and linear models : concepts, applications, and implementation

by Darlington, Richard B., author , Hayes, Andrew F in Regression analysis. , Linear models (Statistics) , Psychology Statistical methods.

Book

Linear Regression Machine Learning Algorithms for Estimating Reference Evapotranspiration Using Limited Climate Data

by Kim, Soo-Jin , Jang, Min-Won , Bae, Seung-Jong in Accuracy , Algorithms , Climate

2022

A linear regression machine learning model to estimate the reference evapotranspiration based on temperature data for South Korea is developed in this study. FAO56 Penman–Monteith (FAO56 P–M) reference evapotranspiration calculated with meteorological data (1981–2021) obtained from sixty-two meteorological stations nationwide is used as the label. All study datasets provide daily, monthly, or annual values based on the average temperature, daily temperature difference, and extraterrestrial radiation. Multiple linear regression (MLR) and polynomial regression (PR) are applied as machine learning algorithms, and twelve models are tested using the training data. The results of the performance evaluation of the period from 2017 to 2021 show that the polynomial regression algorithm that learns the amount of extraterrestrial radiation achieves the best performance (the minimum root-mean-square errors of 0.72 mm/day, 11.3 mm/month, and 40.5 mm/year for daily, monthly, and annual scale, respectively). Compared to temperature-based empirical equations, such as Hargreaves, Blaney–Criddle, and Thornthwaite, the model trained using the polynomial regression algorithm achieves the highest coefficient of determination and lowest error with the reference evapotranspiration of the FAO56 Penman–Monteith equation when using all meteorological data. Thus, the proposed method is more effective than the empirical equations under the condition of insufficient meteorological data when estimating reference evapotranspiration.

Journal Article
