Catalogue Search | MBRL

Predicting stock market direction in South African banking sector using ensemble machine learning techniques

by Mba, Jules Clement , Mcwera, Angelica in Banks , Banks (Finance) , Computational linguistics

2023

The ability to accurately predict stock price direction is important for investors and policymakers. We aim to predict the direction of daily stock returns for five major South African banks using ensemble machine learning techniques. Financial ratios were used as predictors in single classifier and ensemble models. The key findings were that the support vector machine performed best among single classifiers, with the highest accuracy for 4 banks, ranging from 54% to 99%, and produced fewer wrong classifications than its peer single classifiers. More importantly, the heterogeneous ensemble classifier, combining support vector machines, decision trees and k-nearest neighbors (KNN), achieved average accuracy rates above 95% and outperformed all other models. This confirms that ensemble methods that combine multiple models can generate more accurate predictions compared to single classifiers. The results suggest that the heterogeneous ensemble is a suitable approach for predicting stock price direction in the South African banking sector. The findings imply that investing in banks may be a good decision and can assist investors. However, further research could expand the models to incorporate macroeconomic and other external factors that influence stock prices. Overall, we demonstrate the value of ensemble learning for a complex forecasting problem. The heterogeneous ensemble approach achieved high accuracy and outperformed single classifiers. However, future research incorporating additional factors and policy implications could build on these findings.

Journal Article

Multi-Source Data Fusion Based on Ensemble Learning for Rapid Building Damage Mapping during the 2018 Sulawesi Earthquake and Tsunami in Palu, Indonesia

by Adriano, Bruno , Koshimura, Shunichi , Yokoya, Naoto in 2018 Sulawesi earthquake-tsunami , Algorithms , Annotations

2019

This work presents a detailed analysis of building damage recognition, employing multi-source data fusion and ensemble learning algorithms for rapid damage mapping tasks. A damage classification framework is introduced and tested to categorize the building damage following the recent 2018 Sulawesi earthquake and tsunami. Three robust ensemble learning classifiers were investigated for recognizing building damage from Synthetic Aperture Radar (SAR) and optical remote sensing datasets and their derived features. The contribution of each feature dataset was also explored, considering different combinations of sensors as well as their temporal information. SAR scenes acquired by the ALOS-2 PALSAR-2 and Sentinel-1 sensors were used. The optical Sentinel-2 and PlanetScope sensors were also included in this study. A non-local filter in the preprocessing phase was used to enhance the SAR features. Our results demonstrated that the canonical correlation forests classifier performs better in comparison to the other classifiers. In the data fusion analysis, Digital Elevation Model (DEM)- and SAR-derived features contributed the most in the overall damage classification. Our proposed mapping framework successfully classifies four levels of building damage (with overall accuracy >90%, average accuracy >67%). The proposed framework learned the damage patterns from a limited set of available human-interpreted building damage annotations and expanded this information to map a larger affected area. This process, including pre- and post-processing phases, was completed in about 3 h after acquiring all raw datasets.

Journal Article

Ensemble Classifier Design Based on Perturbation Binary Salp Swarm Algorithm for Classification

by Zhu, Xuhui , He, Qizhi , Ni, Zhiwei in Algorithms , Classification , Ensemble learning

2023

Multiple classifier systems exhibit strong classification capacity compared with single classifiers, but they require significant computational resources. Selective ensemble systems aim to attain equivalent or better classification accuracy with fewer classifiers. However, current methods fail to identify precise solutions for constructing an ensemble classifier. In this study, we propose an ensemble classifier design technique based on the perturbation binary salp swarm algorithm (ECDPB). Considering that extreme learning machines (ELMs) have rapid learning rates and good generalization ability, they can serve as the basic classifier for creating multiple candidates while using fewer computational resources. Meanwhile, we introduce a combined diversity measure by taking the complementarity and accuracy of ELMs into account; it is used to identify the ELMs that have good diversity and low error. In addition, we propose an ECDPB with powerful optimizing ability; it is employed to find the optimal subset of ELMs. The selected ELMs can then be used to form an ensemble classifier. Experiments on 10 benchmark datasets have been conducted, and the results demonstrate that the proposed ECDPB delivers superior classification capacity when compared with alternative methods.

Journal Article

Bagging and Boosting Ensemble Classifiers for Classification of Multispectral, Hyperspectral and PolSAR Data: A Comparative Evaluation

by Jafarzadeh, Hamid , Homayouni, Saeid , Mahdianpari, Masoud in Accuracy , Adaptive algorithms , Algorithms

2021

In recent years, several powerful machine learning (ML) algorithms have been developed for image classification, especially those based on ensemble learning (EL). In particular, Extreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM) methods have attracted researchers’ attention in data science due to their superior results compared to other commonly used ML algorithms. Despite their popularity within the computer science community, they have not yet been examined in detail in the field of Earth Observation (EO) for satellite image classification. As such, this study investigates the capability of different EL algorithms, generally known as bagging and boosting algorithms, including Adaptive Boosting (AdaBoost), Gradient Boosting Machine (GBM), XGBoost, LightGBM, and Random Forest (RF), for the classification of Remote Sensing (RS) data. In particular, different classification scenarios were designed to compare the performance of these algorithms on three different types of RS data, namely high-resolution multispectral, hyperspectral, and Polarimetric Synthetic Aperture Radar (PolSAR) data. Moreover, the Decision Tree (DT) single classifier, as a base classifier, is considered to evaluate the classification’s accuracy. The experimental results demonstrated that the RF and XGBoost methods for the multispectral image, the LightGBM and XGBoost methods for hyperspectral data, and the XGBoost and RF algorithms for PolSAR data produced higher classification accuracies compared to other ML techniques. This demonstrates the great capability of the XGBoost method for the classification of different types of RS data.

Journal Article

An Ensemble Approach for the Prediction of Diabetes Mellitus Using a Soft Voting Classifier with an Explainable AI

by Ahsan, Mominul , Haider, Julfikar , Goni, Md. Omaer Faruq in Accuracy , Algorithms , Artificial intelligence

2022

Diabetes is a chronic disease that continues to be a primary and worldwide health concern since the health of the entire population has been affected by it. Over the years, many academics have attempted to develop a reliable diabetes prediction model using machine learning (ML) algorithms. However, these research investigations have had a minimal impact on clinical practice as the current studies focus mainly on improving the performance of complicated ML models while ignoring their explainability in clinical situations. Therefore, physicians find it difficult to understand these models and rarely trust them for clinical use. In this study, a carefully constructed, efficient, and interpretable diabetes detection method using an explainable AI has been proposed. The Pima Indian diabetes dataset was used, containing a total of 768 instances, of which 268 are diabetic and 500 are non-diabetic, with several diabetic attributes. Here, six machine learning algorithms (artificial neural network (ANN), random forest (RF), support vector machine (SVM), logistic regression (LR), AdaBoost, XGBoost) have been used along with an ensemble classifier to diagnose diabetes. For each machine learning model, global and local explanations have been produced using the Shapley additive explanations (SHAP), which are represented in different types of graphs to help physicians understand the model predictions. The balanced accuracy of the developed weighted ensemble model was 90% with an F1 score of 89% using five-fold cross-validation (CV). The median values were used for the imputation of the missing values and the synthetic minority oversampling technique (SMOTETomek) was used to balance the classes of the dataset. The proposed approach can improve the clinical understanding of a diabetes diagnosis and help in taking necessary action at the very early stages of the disease.
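The abstract above reports balanced accuracy and F1 score. As a reading aid, the following minimal sketch shows how these two metrics are conventionally computed from binary confusion-matrix counts. The function names and the counts are hypothetical, chosen only for illustration; they are not the paper's code or results.

```python
def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (recall on positives) and specificity (recall on negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical confusion-matrix counts (not taken from the study).
tp, fp, tn, fn = 80, 10, 90, 20

print(f"balanced accuracy: {balanced_accuracy(tp, fp, tn, fn):.3f}")
print(f"F1 score:          {f1_score(tp, fp, fn):.3f}")
```

Balanced accuracy weights both classes equally, which is why it is often preferred over plain accuracy on datasets with unequal class sizes such as the one described.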

Journal Article

Tweets Classification on the Base of Sentiments for US Airline Companies

by Rustam, Furqan , Ullah, Saleem , Mehmood, Arif in Airlines , Classification , Classifiers

2019

The use of data from social networks such as Twitter has increased during the last few years to improve political campaigns, quality of products and services, sentiment analysis, etc. Tweets classification based on user sentiments is a collaborative and important task for many organizations. This paper proposes a voting classifier (VC) to help sentiment analysis for such organizations. The VC is based on logistic regression (LR) and stochastic gradient descent classifier (SGDC) and uses a soft voting mechanism to make the final prediction. Tweets were classified into positive, negative and neutral classes based on the sentiments they contain. In addition, a variety of machine learning classifiers were evaluated using accuracy, precision, recall and F1 score as the performance metrics. The impact of feature extraction techniques, including term frequency (TF), term frequency-inverse document frequency (TF-IDF), and word2vec, on classification accuracy was investigated as well. Moreover, the performance of a deep long short-term memory (LSTM) network was analyzed on the selected dataset. The results show that the proposed VC performs better than the other classifiers. The VC achieves accuracies of 0.789 and 0.791 with TF and TF-IDF feature extraction, respectively. The results demonstrate that ensemble classifiers achieve higher accuracy than non-ensemble classifiers. Experiments further showed that the performance of machine learning classifiers is better when TF-IDF is used as the feature extraction method. Word2vec feature extraction performs worse than TF and TF-IDF feature extraction. The LSTM achieves a lower accuracy than machine learning classifiers.
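The soft voting mechanism mentioned in this abstract can be illustrated with a small sketch: each classifier outputs class probabilities, the probabilities are averaged, and the class with the highest average wins. The probability values and the `soft_vote` helper below are hypothetical and stand in for the outputs of trained LR and SGDC models; this is not the authors' implementation.

```python
def soft_vote(prob_rows, labels):
    """Average per-class probabilities across classifiers and pick the top class.

    prob_rows: one list of class probabilities per classifier, aligned with labels.
    Returns the winning label and the averaged probabilities.
    """
    n = len(prob_rows)
    avg = [sum(row[i] for row in prob_rows) / n for i in range(len(labels))]
    best = max(range(len(labels)), key=lambda i: avg[i])
    return labels[best], avg


LABELS = ["negative", "neutral", "positive"]

# Hypothetical probability outputs for one tweet (assumed values).
lr_probs = [0.50, 0.30, 0.20]    # stand-in for a logistic regression model
sgd_probs = [0.20, 0.25, 0.55]   # stand-in for an SGD classifier

label, avg = soft_vote([lr_probs, sgd_probs], LABELS)
print(label, [round(p, 3) for p in avg])
```

In this example the two models disagree on the top class, and the averaged probabilities settle the disagreement in favour of the more confident model. A hard (majority) vote over two classifiers would have no way to break such a tie.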

Journal Article

Customer sentiment analysis for Arabic social media using a novel ensemble machine learning approach

by Hicham, Nouri , Karim, Sabri , Habbat, Nassera

2023

Arabic’s complex morphology, orthography, and dialects make sentiment analysis difficult. This activity makes it harder to extract text attributes from short conversations to evaluate tone. Analyzing and judging a person’s emotional state is complex. Due to these issues, interpreting sentiments accurately and identifying polarity may take much work. Sentiment analysis extracts subjective information from text. This research evaluates machine learning (ML) techniques for understanding Arabic emotions. Sentiment analysis (SA) uses a support vector machine (SVM), AdaBoost classifier (AC), maximum entropy (ME), k-nearest neighbors (KNN), decision tree (DT), random forest (RF), logistic regression (LR), and naive Bayes (NB). An ensemble-based sentiment model was developed. Ensemble classifiers (ECs) with 10-fold cross-validation outperformed other machine learning classifiers in accuracy (A), specificity (S), precision (P), F1 score (FS), and sensitivity (S).

Journal Article

EEG-Based Emotion Classification Using Stacking Ensemble Approach

by Chatterjee, Subhajit , Byun, Yung-Cheol in Accuracy , Brain research , Classification

2022

Rapid advancements in the medical field have drawn much attention to automatic emotion classification from EEG data. People’s emotional states are crucial factors in how they behave and interact physiologically. The diagnosis of patients’ mental disorders is one potential medical use. When feeling well, people work and communicate more effectively. Negative emotions can be detrimental to both physical and mental health. Many earlier studies that investigated the use of the electroencephalogram (EEG) for emotion classification have focused on collecting data from the whole brain because of the rapidly developing science of machine learning. However, researchers cannot understand how various emotional states and EEG traits are related. This work seeks to classify EEG signals’ positive, negative, and neutral emotional states by using a stacking-ensemble-based classification model that boosts accuracy to increase the efficacy of emotion classification using EEG. The selected features are used to train a model that was created using a random forest, light gradient boosting machine, and gradient-boosting-based stacking ensemble classifier (RLGB-SE), where the base classifiers random forest (RF), light gradient boosting machine (LightGBM), and gradient boosting classifier (GBC) were used at level 0. The meta classifier (RF) at level 1 is trained using the results from each base classifier to acquire the final predictions. The suggested ensemble model achieves a greater classification accuracy of 99.55%. Additionally, when comparing performance indices, the suggested technique outperforms the base classifiers. Comparing the proposed stacking strategy to state-of-the-art techniques, it can be seen that the performance for emotion categorization is promising.

Journal Article

A Novel Bearing Multi-Fault Diagnosis Approach Based on Weighted Permutation Entropy and an Improved SVM Ensemble Classifier

by Qian, Silin , Zhou, Shenghan , Xiao, Yiyong in Decomposition , Fault diagnosis , hybrid voting strategy

2018

Timely and accurate state detection and fault diagnosis of rolling element bearings are very critical to ensuring the reliability of rotating machinery. This paper proposes a novel method of rolling bearing fault diagnosis based on a combination of ensemble empirical mode decomposition (EEMD), weighted permutation entropy (WPE) and an improved support vector machine (SVM) ensemble classifier. A hybrid voting (HV) strategy that combines SVM-based classifiers and cloud similarity measurement (CSM) was employed to improve the classification accuracy. First, the WPE value of the bearing vibration signal was calculated to detect the fault. Secondly, if a bearing fault occurred, the vibration signal was decomposed into a set of intrinsic mode functions (IMFs) by EEMD. The WPE values of the first several IMFs were calculated to form the fault feature vectors. Then, the SVM ensemble classifier was composed of binary SVM and the HV strategy to identify the bearing multi-fault types. Finally, the proposed model was fully evaluated by experiments and comparative studies. The results demonstrate that the proposed method can effectively detect bearing faults and maintain a high accuracy rate of fault recognition when a small number of training samples are available.

Journal Article

Object-Based Change Detection in Urban Areas from High Spatial Resolution Images Based on Multiple Features and Ensemble Learning

by Liang, Hao , Liu, Sicong , Xia, Junshi in Change detection , Classifiers , Ensemble learning

2018

To improve the accuracy of change detection in urban areas using bi-temporal high-resolution remote sensing images, a novel object-based change detection scheme combining multiple features and ensemble learning is proposed in this paper. Image segmentation is conducted to determine the objects in bi-temporal images separately. Subsequently, three kinds of object features, i.e., spectral, shape and texture, are extracted. Using the image differencing process, a difference image is generated and used as the input for nonlinear supervised classifiers, including k-nearest neighbor, support vector machine, extreme learning machine and random forest. Finally, the results of multiple classifiers are integrated using an ensemble rule called weighted voting to generate the final change detection result. Experimental results of two pairs of real high-resolution remote sensing datasets demonstrate that the proposed approach outperforms the traditional methods in terms of overall accuracy and generates change detection maps with a higher number of homogeneous regions in urban areas. Moreover, the influences of segmentation scale and the feature selection strategy on the change detection performance are also analyzed and discussed.
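The weighted voting rule named in this abstract can be sketched as follows: each classifier casts a vote for its predicted label, votes are scaled by a per-classifier weight, and the label with the largest total wins. The abstract does not say how the weights are chosen; the sketch assumes, purely for illustration, that each weight is a validation accuracy. The predictions, weights, and the `weighted_vote` helper are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Hypothetical per-object predictions from four classifiers
# (in the order KNN, SVM, ELM, RF) and assumed accuracy-based weights.
predictions = ["changed", "unchanged", "unchanged", "changed"]
weights = [0.80, 0.90, 0.70, 0.95]

print(weighted_vote(predictions, weights))
```

Here the unweighted vote is split two against two, and the weights tip the decision toward the label backed by the classifiers assumed to be more reliable.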

Journal Article
