Catalogue Search | MBRL
33,372 result(s) for "linear regression"
Best Subset Selection via a Modern Optimization Lens
2016
In the period 1991-2015, algorithmic advances in Mixed Integer Optimization (MIO) coupled with hardware improvements have resulted in an astonishing 450 billion factor speedup in solving MIO problems. We present a MIO approach for solving the classical best subset selection problem of choosing k out of p features in linear regression given n observations. We develop a discrete extension of modern first-order continuous optimization methods to find high quality feasible solutions that we use as warm starts to a MIO solver that finds provably optimal solutions. The resulting algorithm (a) provides a solution with a guarantee on its suboptimality even if we terminate the algorithm early, (b) can accommodate side constraints on the coefficients of the linear regression and (c) extends to finding best subset solutions for the least absolute deviation loss function. Using a wide variety of synthetic and real datasets, we demonstrate that our approach solves problems with n in the 1000s and p in the 100s in minutes to provable optimality, and finds near optimal solutions for n in the 100s and p in the 1000s in minutes. We also establish via numerical experiments that the MIO approach performs better than Lasso and other popularly used sparse learning procedures, in terms of achieving sparse solutions with good predictive power.
Journal Article
Applications of Gene Expression Programming and Regression Techniques for Estimating Compressive Strength of Bagasse Ash based Concrete
by Javed, Muhammad Faisal; Alyousef, Rayed; Khan, Kaffayatullah
in Artificial intelligence, Ashes, Bagasse
2020
Compressive strength is one of the important properties of concrete and depends on many factors. Most concrete compressive strength predictive models rely mainly on available literature data, which are too simple to consider all the contributing factors. This study adopted a new approach to predict the compressive strength of sugarcane bagasse ash concrete (SCBAC). A vast amount of data from the literature and fifteen laboratory-tested concrete samples with different dosages of bagasse ash were used, respectively, to calibrate and validate the models. The novel Gene Expression Programming (GEP), Multiple Linear Regression and Multiple Non-Linear Regression were used to model SCBAC compressive strength. The water-cement ratio, bagasse ash percent replacement, quantity of fine and coarse aggregate and cement content were used as inputs for model development. Various statistical indicators, i.e., NSE, R² and RMSE, were used to assess the performance of the models. The results indicated a strong correlation between observed and predicted values, with NSE and R² both above 0.8 during calibration and validation for GEP. GEP outperformed all the other models in predicting SCBAC compressive strength. The validity of the model was further verified using data from fifteen tests conducted in the laboratory. Moreover, sensitivity analysis revealed the cement content in the mix as the most sensitive parameter, followed by the water-cement ratio. The GEP fulfilled all the criteria for external validity. The simple formulae derived in this study could be used reliably for the prediction of SCBAC compressive strength.
Journal Article
Estimating Above-Ground Biomass of Maize Using Features Derived from UAV-Based RGB Imagery
by Zhang, Liyuan; Peng, Xingshuo; Niu, Yaxiao
in Agricultural management, Agricultural production, Agriculture
2019
The rapid, accurate, and economical estimation of crop above-ground biomass at the farm scale is crucial for precision agricultural management. The unmanned aerial vehicle (UAV) remote-sensing system has great application potential owing to its ability to obtain remote-sensing imagery with high temporal-spatial resolution. To verify the application potential of consumer-grade UAV RGB imagery in estimating maize above-ground biomass, vegetation indices and plant height derived from UAV RGB imagery were adopted. To obtain a more accurate observation, plant height was directly derived from UAV RGB point clouds. To search for the optimal estimation method, the estimation performances of the models based on vegetation indices alone, based on plant height alone, and based on both vegetation indices and plant height were compared. The results showed that plant height directly derived from UAV RGB point clouds had a high correlation with ground-truth data, with an R² value of 0.90 and an RMSE value of 0.12 m. The above-ground biomass exponential regression models based on plant height alone had higher correlations for both fresh and dry above-ground biomass, with R² values of 0.77 and 0.76, respectively, compared to the linear regression model (both R² values were 0.59). The vegetation indices derived from UAV RGB imagery had great potential to estimate maize above-ground biomass, with R² values ranging from 0.63 to 0.73. When estimating the above-ground biomass of maize by using multivariable linear regression based on vegetation indices, a higher correlation was obtained, with an R² value of 0.82. There was no significant improvement of the estimation performance when plant height derived from UAV RGB imagery was added to the multivariable linear regression model based on vegetation indices.
When estimating crop above-ground biomass based on a UAV RGB remote-sensing system alone, looking for optimized vegetation indices and establishing estimation models with high performance based on advanced algorithms (e.g., machine learning technology) may be a better approach.
Journal Article
Optimal Bandwidth Choice for the Regression Discontinuity Estimator
2012
We investigate the choice of the bandwidth for the regression discontinuity estimator. We focus on estimation by local linear regression, which was shown to have attractive properties (Porter, J. 2003, "Estimation in the Regression Discontinuity Model" (unpublished, Department of Economics, University of Wisconsin, Madison)). We derive the asymptotically optimal bandwidth under squared error loss. This optimal bandwidth depends on unknown functionals of the distribution of the data and we propose simple and consistent estimators for these functionals to obtain a fully data-driven bandwidth algorithm. We show that this bandwidth estimator is optimal according to the criterion of Li (1987, "Asymptotic Optimality for C_p, C_L, Cross-validation and Generalized Cross-validation: Discrete Index Set", Annals of Statistics, 15, 958–975), although it is not unique in the sense that alternative consistent estimators for the unknown functionals would lead to bandwidth estimators with the same optimality properties. We illustrate the proposed bandwidth, and the sensitivity to the choices made in our algorithm, by applying the methods to a data set previously analysed by Lee (2008, "Randomized Experiments from Non-random Selection in U.S. House Elections", Journal of Econometrics, 142, 675–697) as well as by conducting a small simulation study.
Journal Article
Linear Regression Machine Learning Algorithms for Estimating Reference Evapotranspiration Using Limited Climate Data
2022
A linear regression machine learning model to estimate the reference evapotranspiration based on temperature data for South Korea is developed in this study. FAO56 Penman–Monteith (FAO56 P–M) reference evapotranspiration calculated with meteorological data (1981–2021) obtained from sixty-two meteorological stations nationwide is used as the label. All study datasets provide daily, monthly, or annual values based on the average temperature, daily temperature difference, and extraterrestrial radiation. Multiple linear regression (MLR) and polynomial regression (PR) are applied as machine learning algorithms, and twelve models are tested using the training data. The results of the performance evaluation of the period from 2017 to 2021 show that the polynomial regression algorithm that learns the amount of extraterrestrial radiation achieves the best performance (the minimum root-mean-square errors of 0.72 mm/day, 11.3 mm/month, and 40.5 mm/year for daily, monthly, and annual scale, respectively). Compared to temperature-based empirical equations, such as Hargreaves, Blaney–Criddle, and Thornthwaite, the model trained using the polynomial regression algorithm achieves the highest coefficient of determination and lowest error with the reference evapotranspiration of the FAO56 Penman–Monteith equation when using all meteorological data. Thus, the proposed method is more effective than the empirical equations under the condition of insufficient meteorological data when estimating reference evapotranspiration.
Journal Article