Catalogue Search | MBRL
306 result(s) for "boosting ensemble learning"
Data-Based Sensing and Stochastic Analysis of Biodiesel Production Process
by Khattak, Mansoor Khan; Ibrahim, Uzair; Ahmad, Iftikhar
in Alternative energy sources; biodiesel; Biodiesel fuels
2019
Biodiesel production is a field of outstanding prospects due to the renewable nature of its feedstock and little to no overall CO2 emissions to the environment. Data-based soft sensors are used to realize stable and efficient operation of biodiesel production. However, conventional data-based soft sensors cannot grasp the effect of process uncertainty on the process outcomes. In this study, a framework of data-based soft sensors was developed using an ensemble learning method, i.e., boosting, for prediction of the composition, quantity, and quality of the product, i.e., fatty acid methyl esters (FAME), in a biodiesel production process from vegetable oil. The ensemble learning method was integrated with the polynomial chaos expansion (PCE) method to quantify the effect of uncertainties in process variables on the target outcomes. The proposed modeling framework is highly accurate in predicting the target outcomes and quantifying the effect of process uncertainty.
Journal Article
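The boosting idea this abstract builds on, fitting weak learners sequentially to the residuals of the current ensemble, can be sketched in a few dependency-free lines. Everything below (the stump learner, the quadratic toy data, the learning rate) is an illustrative stand-in, not the paper's FAME soft-sensor model, and the PCE uncertainty step is not reproduced:

```python
def stump_fit(x, residual):
    """Find the split threshold and leaf means minimising squared error."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residual) if xi <= t]
        right = [r for xi, r in zip(x, residual) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def boost(x, y, rounds=20, lr=0.3):
    """Fit stumps on residuals round by round; return the additive predictor."""
    stumps, pred = [], [0.0] * len(x)
    for _ in range(rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]
        s = stump_fit(x, residual)
        stumps.append(s)
        pred = [pi + lr * s(xi) for pi, xi in zip(pred, x)]
    return lambda xi: sum(lr * s(xi) for s in stumps)

# Toy usage: recover a noiseless quadratic from 20 points.
x = [i / 10 for i in range(20)]
y = [xi * xi for xi in x]
model = boost(x, y)
mse = sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)
```

Each round only corrects what the ensemble so far gets wrong, which is why the training error shrinks steadily as stumps accumulate.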
A boosting ensemble learning based hybrid light gradient boosting machine and extreme gradient boosting model for predicting house prices
by Sibindi, Racheal; Waititu, Anthony Gichuhi; Mwangi, Ronald Waweru
in Accuracy; Algorithms; boosting ensemble learning
2023
The implementation of tree-ensemble models has become increasingly essential in solving classification and prediction problems. Boosting ensemble techniques have been widely used as individual machine learning algorithms in predicting house prices. One such technique is the LGBM algorithm, which employs a leaf-wise growth strategy that reduces loss and improves accuracy during training but can result in overfitting. The XGBoost algorithm, in contrast, uses a level-wise growth strategy that takes longer to compute, resulting in higher computation time. Nevertheless, XGBoost has a regularization parameter and implements column sampling and weight reduction on new trees, which combats overfitting. This study focuses on developing a hybrid LGBM and XGBoost model to prevent overfitting by minimizing variance whilst improving accuracy. A Bayesian hyperparameter optimization technique is applied to the base learners to find the best combination of hyperparameters; this reduced variance (overfitting) in the hybrid model, since the regularization parameter values were optimized. The hybrid model is compared to the LGBM, XGBoost, AdaBoost, and GBM algorithms to evaluate its performance in giving accurate house price predictions using the MSE, MAE, and MAPE evaluation metrics. The hybrid LGBM and XGBoost model outperformed the other models with an MSE, MAE, and MAPE of 0.193, 0.285, and 0.156, respectively. The article proposes an integration of advanced ML algorithms, the LGBM and XGBoost techniques, in predicting house prices. The proposed model is compared to individual boosting ensemble learning algorithms and shows better accuracy in predicting house prices than the individual models.
Journal Article
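One simple way to hybridize two trained models, in the spirit of the LGBM + XGBoost combination above, is a weighted average of their predictions; the abstract does not specify the exact combination rule, so this is a hedged sketch of the blending step only, with hypothetical stand-in predictors in place of the two boosted models:

```python
def hybrid(models, weights=None):
    """Return a predictor that takes a weighted mean of the models' outputs."""
    if weights is None:
        weights = [1.0 / len(models)] * len(models)
    return lambda x: sum(w * m(x) for w, m in zip(weights, models))

# Hypothetical stand-ins: one model overshoots and one undershoots the
# true relation y = 2x, mimicking opposite biases of two base learners.
def over(x):
    return 2.2 * x

def under(x):
    return 1.8 * x

blend = hybrid([over, under])
prediction = blend(10)  # roughly 20.0: the equal weights cancel the biases
```

Averaging reduces variance whenever the base models' errors are not perfectly correlated, which is the overfitting-control argument the paper makes for its hybrid.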
Boosting Ensemble Learning for Freeway Crash Classification under Varying Traffic Conditions: A Hyperparameter Optimization Approach
by Nasayreh, Ahmad; Almahdi, Abdulla; Aljohani, Abeer
in Classification; Fatalities; Mathematical optimization
2023
Freeway crashes represent a significant and persistent threat to road safety, resulting in both loss of life and extensive property damage. Effectively addressing this critical issue requires a comprehensive understanding of the factors contributing to these incidents and the ability to accurately predict crash severity under different traffic conditions. This study aims to improve the accuracy of crash classification by incorporating key traffic-related variables such as braking, weather conditions, and speed. To validate the effectiveness of the proposed model, we utilize real-world crash data from Flint, Michigan. To achieve the objective, we employ an innovative Boosting Ensemble Learning approach, leveraging five advanced ensemble learning models: Gradient Boosting, CatBoost, XGBoost, LightGBM, and SGD. Through the application of hyperparameter optimization techniques, we further enhance the performance of these models, improving their predictive capabilities. Our evaluation results demonstrated the effectiveness of our approach, with Gradient Boosting algorithms achieving an accuracy rate of up to 96% in crash classification. This research provides valuable insights into the potential of using Boosting Ensemble Learning as a tool for accurately and efficiently classifying freeway crashes across a spectrum of traffic conditions. Additionally, it sheds light on the nuanced variations in crash mechanisms observed when employing diverse ensemble learning models. The findings of this study underscore the significance of hyperparameter optimization as a critical factor in elevating the predictive precision of freeway crashes.
Journal Article
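The hyperparameter-optimization step the abstract credits can be illustrated with the simplest such technique, an exhaustive grid search that trains one model per parameter combination and keeps the most accurate. The trainer and labelled data below are hypothetical stand-ins, not the Flint crash dataset or the paper's tuning method:

```python
def accuracy(model, data):
    """Fraction of (x, y) pairs the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

def grid_search(train_fn, grid, data):
    """Try every (learning_rate, n_rounds) pair; keep the best by accuracy."""
    best_params, best_acc = None, -1.0
    for lr in grid["learning_rate"]:
        for n in grid["n_rounds"]:
            model = train_fn(lr, n)
            acc = accuracy(model, data)
            if acc > best_acc:
                best_params, best_acc = {"learning_rate": lr, "n_rounds": n}, acc
    return best_params, best_acc

# Stand-in trainer: a threshold classifier whose cutoff depends on lr,
# so only one grid point reproduces the true decision boundary x > 5.
data = [(x, int(x > 5)) for x in range(11)]
train = lambda lr, n: (lambda x: int(x > 10 * lr))
params, acc = grid_search(
    train, {"learning_rate": [0.2, 0.5, 0.8], "n_rounds": [10]}, data
)
```

In practice a validation split (not the training data) would score each candidate, and randomized or Bayesian search scales better than a full grid; the selection loop itself is the same.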
Boosting and bagging classification for computer science journal
by Nafalski, Andrew; Ar Rasyid, Harits; Wibawa, Aji Prasetya
in ensemble learning; boosting; bagging; decision tree; gaussian naive bayes; scimago journal rank
2023
In recent years, data processing has become an issue across all disciplines. Good data processing can provide decision-making recommendations. Data processing is covered in academic data processing publications, including those in computer science. This topic has grown over the past three years, demonstrating that data processing is expanding and diversifying, and there is a great deal of interest in this area of study. Within the journal, groupings (quartiles) indicate the journal's influence on other similar studies. SCImago provides this category. There are four quartiles, with the highest quartile being 1 and the lowest being 4. There are, however, numerous differences in class quartiles, with different quartile values for the same journal in different disciplines. Therefore, a method of categorization is provided to solve this issue. Classification is a machine-learning technique that groups data based on the supplied label class. Ensemble Boosting and Bagging with Decision Tree (DT) and Gaussian Naïve Bayes (GNB) were utilized in this study. Several modifications were made to the ensemble algorithms' depth and estimator settings to examine the influence of the added values on the resulting precision. In the DT algorithm, both variables are altered, whereas in the GNB algorithm, only the estimator's value is modified. Based on the average accuracy results, the best algorithm for computer science datasets is GNB Bagging, with values of 68.96%, 70.99%, and 69.05%. Second-place XGBDT has 67.75% accuracy, 67.69% precision, and 67.83% recall. The DT Bagging method placed third with 67.31% recall, 68.13% precision, and 67.30% accuracy. Fourth in the sequence is the XGBoost GNB approach, with an accuracy of 67.07%, a precision of 68.85%, and a recall of 67.18%. The AdaBoost DT technique ranks fifth with an accuracy of 63.65%, a precision of 64.21%, and a recall of 63.63%. AdaBoost GNB is the least efficient algorithm for this dataset, since it only achieves 43.19% accuracy, 48.14% precision, and 43.2% recall. These results are still quite far from ideal; hence, the proposed method is not advised for the journal quartile inequality issue.
Journal Article
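The Bagging side of the study, training base learners on bootstrap resamples and voting, can be sketched without any libraries. The threshold-stump learner and toy labels below are illustrative stand-ins, not the SCImago journal dataset or the paper's DT/GNB base learners:

```python
import random

def stump_train(sample):
    """Pick the threshold whose rule (x > t) best fits the labelled sample."""
    best = None
    for t, _ in sample:
        acc = sum((x > t) == bool(y) for x, y in sample) / len(sample)
        if best is None or acc > best[0]:
            best = (acc, t)
    t = best[1]
    return lambda x: int(x > t)

def bagging(data, n_models=15, seed=0):
    """Train each stump on a bootstrap resample; predict by majority vote."""
    rng = random.Random(seed)
    models = [
        stump_train([rng.choice(data) for _ in data]) for _ in range(n_models)
    ]
    return lambda x: int(sum(m(x) for m in models) > n_models / 2)

# Toy binary task: label is 1 for x >= 4.
data = [(x, int(x >= 4)) for x in range(10)]
vote = bagging(data)
```

Where boosting builds learners sequentially on residual errors, bagging trains them independently on resampled data, so its variance reduction comes purely from averaging disagreeing models.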
Price Prediction for Fresh Agricultural Products Based on a Boosting Ensemble Algorithm
by Zhang, Shuai; Ma, Huanhuan; An, Qi
in Accuracy; Agricultural commodities; Agricultural equipment
2025
The time series of agricultural prices exhibit brevity and considerable volatility. Considering that traditional time series models and machine learning models are facing challenges in making predictions with high accuracy and robustness, this paper proposes a Light gradient boosting machine model based on the boosting ensemble learning algorithm to predict prices for three representative types of fresh agricultural products (bananas, beef, crucian carp). The prediction performance of the Light gradient boosting machine model is evaluated by comparing it against multiple benchmark models (ARIMA, decision tree, random forest, support vector machine, XGBoost, and artificial neural network) in terms of accuracy, generalizability, and robustness on different datasets and under different time windows. Among these models, the Light gradient boosting machine model is shown to have the highest prediction accuracy and the most stable performance across three different datasets under both long-term and short-term time windows. As the time window length increases, the Light gradient boosting machine model becomes more advantageous for effectively reducing error fluctuation, demonstrating better robustness. Consequently, the model proposed in this paper holds significant potential for forecasting fresh agricultural product prices, thereby facilitating the advancement of precision and sustainable farming practices.
Journal Article
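Before any boosted regressor can forecast a price series, the series must be framed as supervised (window, next-value) pairs; the window length is the "time window" whose effect the abstract compares. A minimal framing helper, with made-up prices rather than the paper's banana/beef/crucian carp data:

```python
def make_windows(series, window):
    """Return (features, target) pairs: the last `window` prices -> next price."""
    return [(series[i - window:i], series[i]) for i in range(window, len(series))]

# Hypothetical daily prices; each pair is ([p_{t-3}, p_{t-2}, p_{t-1}], p_t).
prices = [10.0, 10.5, 11.0, 10.8, 11.2, 11.5]
pairs = make_windows(prices, window=3)
```

A longer window gives the model more context per example but fewer training pairs from a short series, which is the accuracy/robustness trade-off the paper studies for agricultural prices.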
Detecting refactoring type of software commit messages based on ensemble machine learning algorithms
by Sbaih, Nour; Al-Fraihat, Dimah; Al-Ghuwairi, Abdel-Rahman
in 639/705/117; 639/705/794; Accuracy
2024
Refactoring is a well-established topic in contemporary software engineering, focusing on enhancing software's structural design without altering its external behavior. Commit messages play a vital role in tracking changes to the codebase. However, determining the exact refactoring required in the code can be challenging due to various refactoring types. Prior studies have attempted to classify refactoring documentation by type, achieving acceptable results in accuracy, precision, recall, F1-Score, and other performance metrics. Nevertheless, there is room for improvement. To address this, we propose a novel approach using four ensemble Machine Learning algorithms to detect refactoring types. Our experimentation utilized a dataset containing 573 commits, with text cleaning and preprocessing applied to address data imbalances. Various techniques, including hyperparameter optimization, feature engineering with TF-IDF and bag-of-words, and binary transformation using one-vs-one and one-vs-rest classifiers, were employed to enhance accuracy. Results indicate that the experiment involving feature engineering using the TF-IDF technique outperformed other methods. Notably, the XGBoost algorithm with the same technique achieved superior performance across all metrics, attaining 100% accuracy. Moreover, our results surpass the current state-of-the-art performance using the same dataset. Our proposed approach bears significant implications for software engineering, particularly in enhancing the internal quality of software.
Journal Article
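The TF-IDF feature engineering the abstract credits for the best results weights each term by its frequency in a document, discounted by how many documents contain it. A from-scratch version using the plain idf = log(n / df) form (library implementations such as scikit-learn's add smoothing, so the exact weights differ), over invented commit messages rather than the paper's 573-commit dataset:

```python
import math
from collections import Counter

def tfidf(docs):
    """Map each whitespace-tokenized document to {term: tf * idf} weights."""
    df = Counter(t for doc in docs for t in set(doc.split()))
    n = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(doc.split())
        total = sum(tf.values())
        vectors.append(
            {t: (c / total) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors

# Hypothetical commit messages describing refactoring types.
commits = [
    "extract method from class",
    "rename method",
    "move class to package",
]
vecs = tfidf(commits)
```

Terms that appear in fewer commits ("rename") get higher weight than terms shared across commits ("method"), which is exactly the discrimination a refactoring-type classifier needs from its features.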
Investigating factors influencing injury severity in crashes involving vulnerable road users in Pakistan
2025
Road traffic crashes claim around 1.19 million lives annually worldwide, with over half of the fatalities involving vulnerable road users (VRUs). While several studies have explored the risk factors associated with specific categories of VRUs in Pakistan, research focusing on VRUs collectively, considering all categories and their unique safety challenges, remains limited. This study aims to examine the influence of various risk factors on the severity of injuries resulting from crashes involving VRUs, using a three-year dataset (2021–2023). The study evaluated the effectiveness of six boosting-based ensemble machine learning classifiers across multiple evaluation metrics. The findings indicated that boosting with decision stumps outperformed extreme gradient boosting, light gradient boosting, histogram-based gradient boosting, categorical boosting, and adaptive boosting in terms of recall, F1-score, and accuracy. The partial dependence plots demonstrated that VRUs aged 55 years or older, collisions with other VRU groups, involvement of vans and heavy vehicles, rainy weather, the COVID-19 period, and the existence of painted medians increase the likelihood of severe injury in crashes involving VRUs. The pairwise SHAP interaction plot also supported these findings by illustrating that the interaction between different vehicle types (vans and heavy vehicles), adverse weather conditions, and VRU crashes during the COVID-19 lockdown period elevates the risk of severe crashes. Based on the study findings, several policy recommendations were proposed, including implementing education and awareness programs, developing strategies to manage mixed traffic, and improving road infrastructure to enhance safety for all VRU groups.
Journal Article
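The partial dependence plots this abstract interprets come from a simple procedure: force one feature to a fixed value for every row, average the model's predictions, and sweep that value across a grid. A minimal version, with a hypothetical severity model and rows in place of the Pakistani crash data:

```python
def partial_dependence(model, rows, feature_idx, grid):
    """Average prediction when feature `feature_idx` is forced to each grid value."""
    curve = []
    for v in grid:
        preds = []
        for row in rows:
            patched = list(row)      # copy, so the original row is untouched
            patched[feature_idx] = v
            preds.append(model(patched))
        curve.append(sum(preds) / len(preds))
    return curve

# Stand-in severity model: risk rises with age (feature 0); feature 1 is
# a binary indicator (e.g. heavy-vehicle involvement), averaged out.
model = lambda row: 0.01 * row[0] + 0.1 * row[1]
rows = [[30, 1], [45, 0], [60, 1]]
curve = partial_dependence(model, rows, feature_idx=0, grid=[25, 55, 75])
```

A rising curve, as for age here, is what the paper reads as "VRUs aged 55 or older face higher severe-injury likelihood"; the averaging over rows is what isolates one feature's marginal effect.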
Boosting ensembles and deep vision networks optimized by forensic-based investigation algorithm for financial distress prediction in construction firms
2025
Effective risk management is crucial in the construction industry, which has a substantial economic impact but is vulnerable to high financial risks due to volatile material costs and complex project-based financial structures. This study presents a new hybrid model to improve the prediction of financial distress for Taiwanese-listed construction companies. The research compares four boosting-based ensemble learning models, advanced deep learning models, and improved ensemble models that incorporate a novel approach using the Multi-Criteria Decision-Making (MCDM) technique, the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS), to enhance feature selection. Experimental results show that while TOPSIS-eXtreme Gradient Boosting (TOPSIS-XGBoost) is highly effective at managing imbalanced financial datasets, Light Gradient Boosting Machine (LightGBM) performs better in balanced environments. Both models exhibit substantial performance gains when integrated with the Forensic-Based Investigation (FBI) optimization algorithm, resulting in the optimized hybrids—FBI-TOPSIS-XGBoost and FBI-LightGBM—which achieve marked improvements in predictive accuracy. These optimized models consistently outperform benchmark approaches, including the Altman Z-score, Zmijewski X-score, Logistic Regression, and Random Forest, across multiple evaluation metrics. To enhance transparency and interpretability, a global SHapley Additive exPlanations (SHAP) analysis was conducted, revealing that profitability and per-share index indicators are the primary determinants driving model predictions. Additionally, an expert system interface has been developed to enhance the practical usability of these models. These findings strengthen the methodological foundation for predicting financial distress and provide stakeholders with valuable tools for mitigating risk in Taiwan’s construction industry.
Journal Article
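The TOPSIS step the abstract uses for feature selection ranks alternatives by closeness to an ideal solution. A compact dependency-free version assuming all criteria are benefit criteria (higher is better) with standard vector normalization; the score matrix and weights are illustrative only, not the paper's financial indicators:

```python
import math

def topsis(matrix, weights):
    """Score each row of `matrix` by relative closeness to the ideal solution."""
    cols = list(zip(*matrix))
    norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
    weighted = [
        [w * v / n for v, n, w in zip(row, norms, weights)] for row in matrix
    ]
    wcols = list(zip(*weighted))
    ideal = [max(c) for c in wcols]   # best value per criterion
    anti = [min(c) for c in wcols]    # worst value per criterion
    scores = []
    for row in weighted:
        d_plus = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_minus = math.sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        scores.append(d_minus / (d_plus + d_minus))
    return scores

# Three candidate features scored on two hypothetical benefit criteria.
scores = topsis([[0.9, 0.8], [0.5, 0.4], [0.2, 0.9]], weights=[0.6, 0.4])
```

The closeness score lies in [0, 1]; for feature selection, the top-scoring candidates are kept before the boosted model is trained, which is the TOPSIS-XGBoost pipeline the study evaluates.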
Feature Selection for Ensemble Learning and Its Application
by
Li, Guo‐Zheng
,
Yang, Jack Y.
in
ensemble learning such as bagging and boosting ‐ for single learning machines
,
feature selection for ensemble learning and its application
,
multitask learning
2008
This chapter contains sections titled:
Introduction
Feature Selection for Individuals of Bagging
Selective Ensemble Learning
Multitask Learning
The Brain Glioma Case
Summary
Acknowledgments
References
Book Chapter