Catalogue Search | MBRL
58 result(s) for "stochastic gradient boosting"
Prognostication of advanced CO2 capture using tunable solvents with an ensemble learning-based decision tree model
by Soleimani, Reza; Saeedi Dehaghani, Amir Hossein; Farahani, Hamidreza
in 639/166, 639/166/898, 639/301/1034/1037
2025
This study presents a robust method for predicting CO₂ solubility in Deep Eutectic Solvents (DESs) using the stochastic gradient boosting (SGB) algorithm. DESs, promising green solvents for CO₂ capture, require precise solubility data for practical applications in industrial and environmental settings. The model incorporates key parameters such as temperature, pressure, mole percent of salt and hydrogen bond donor (HBD) compounds, HBD melting points, molecular weights of salts and HBDs, and other critical factors. Using a dataset of 1951 experimental data points spanning temperatures (293.15–343.15 K) and pressures (26.3–12,730 kPa), the SGB model demonstrated excellent predictive accuracy, achieving an R² of 0.9928 and an AARD% of 2.3107. Variable importance analysis identified pressure as the most influential factor. The model’s applicability, confirmed through William’s plot, encompassed 97.5% of data points within a safety margin, ensuring reliability, versatility, and broad applicability. Moreover, the SGB model outperformed previous methods, including ANN, RF, and thermodynamic models like PR-EoS and COSMO-RS, as validated by statistical metrics. This research highlights the SGB model’s potential as a superior and practical tool for evaluating CO₂ solubility in DESs, advancing the field of green solvent development for sustainable and efficient CO₂ capture technologies.
Journal Article
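A minimal sketch of how the two fit statistics quoted in the record above, R² and AARD%, are conventionally computed. The formulas are the standard definitions (coefficient of determination and average absolute relative deviation) and are assumed here, since the abstract does not spell them out. The function names and sample numbers are hypothetical and are not taken from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def aard_percent(observed, predicted):
    """Average absolute relative deviation, expressed in percent."""
    rel = [abs(p - o) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(rel) / len(rel)


if __name__ == "__main__":
    # Hypothetical solubility values, for illustration only.
    observed = [1.0, 2.0, 4.0]
    predicted = [1.1, 1.8, 4.0]
    print(f"R2    = {r_squared(observed, predicted):.4f}")
    print(f"AARD% = {aard_percent(observed, predicted):.2f}")
```

AARD% divides by each observed value, so it is undefined when an observed value is zero; R² has no such restriction.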
Fusion and Classification of SAR and Optical Data Using Multi-Image Color Components with Differential Gradients
2023
This paper proposes a gradient-based data fusion and classification approach for Synthetic Aperture Radar (SAR) and optical images. This method is used to intuitively reflect the boundaries and edges of land cover classes present in the dataset. For the fusion of SAR and optical images, Sentinel 1A and Sentinel 2B data covering Central State Farm in Hissar (India) were used. The major agricultural crops grown in this area include paddy, maize, cotton, and pulses during kharif (summer) and wheat, sugarcane, mustard, gram, and peas during rabi (winter) seasons. The gradient method, using a Sobel operator and color components for three directions (i.e., x, y, and z), is used for image fusion. To judge the quality of the fused image, several fusion metrics are calculated. After obtaining the resultant fused image, gradient-based classification methods, including Stochastic Gradient Descent Classifier, Stochastic Gradient Boosting Classifier, and Extreme Gradient Boosting Classifier, are used for the final classification. The classification accuracy is represented using overall classification accuracy and kappa value. A comparison of classification results indicates a better performance by the Extreme Gradient Boosting Classifier.
Journal Article
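The record above reports results as overall classification accuracy and a kappa value. As a hedged illustration, the sketch below computes both from a confusion matrix using the standard Cohen's kappa definition, which is assumed here because the abstract does not say which variant was used. The function names and the example matrix are hypothetical.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in confusion)
    p_observed = overall_accuracy(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement expected from the row and column marginals.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical two-class matrix: rows = reference, columns = predicted.
    matrix = [[45, 5],
              [10, 40]]
    print(overall_accuracy(matrix), cohen_kappa(matrix))
```

Kappa is lower than overall accuracy whenever some of the agreement could have arisen by chance, which is why the two are usually reported together.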
Combining US and Canadian forest inventories to assess habitat suitability and migration potential of 25 tree species under climate change
by McKenney, Dan; Prasad, Anantha; Matthews, Steve
in Alaska, Assisted migration, BIODIVERSITY RESEARCH
2020
Aim
To evaluate current and future dynamics of 25 tree species spanning the United States and Canada.
Location
United States and Canada.
Methods
We combine, for the first time, the species compositions from relative importance derived from the USA’s Forest Inventory Analysis (FIA) with gridded estimates based on Canada's National Forest Inventory (NFI‐kNN)‐based photo plot data to evaluate future habitats and colonization potentials for 25 tree species. Using 21 climatic variables under RCP 4.5 and RCP 8.5, we model climatic habitat suitability (HQ) within a consensus‐based multimodel ensemble regression approach. A migration model is used to assess colonization likelihoods (CL) for ~100 years and combined with HQ to evaluate the various combinations of HQ + CL outcomes for the 25 species.
Results
At a continental scale, many species in the conterminous United States lose suitable climatic habitat (especially under RCP 8.5) while Canada and USA’s Alaska gain climate habitat. For most species, even under optimistic migration rates, only a small portion of overall future suitable habitat is projected to be naturally colonized in ~100 years, although considerable variation exists among species.
Main conclusions
For the species examined here, habitat losses were primarily experienced along southern range limits, while habitat gains were associated with northern range limits (especially under RCP 8.5). However, for many species, southern range limits are projected to remain relatively intact, albeit with reduced habitat quality. Our models predict that only a small portion of the climatic habitat generated by climate change will be colonized naturally by the end of the current century, even with optimistic tree migration rates. However, considerable variation among species points to the need for significant management efforts, including assisted migration, for economic or ecological reasons. Our work highlights the need to employ range‐wide data, evaluate colonization potentials and enhance cross‐border collaborations.
Journal Article
Estimating Aboveground Biomass of Two Different Forest Types in Myanmar from Sentinel-2 Data with Machine Learning and Geostatistical Algorithms
2022
The accurate estimation of spatially explicit forest aboveground biomass (AGB) provides an essential basis for sustainable forest management and carbon sequestration accounting, especially in Myanmar, where there is a lack of data for forest conservation due to operational limitations. This study mapped the forest AGB using Sentinel-2 (S-2) images and Shuttle Radar Topographic Mission (SRTM) based on random forest (RF), stochastic gradient boosting (SGB) and Kriging algorithms in two forest reserves (Namhton and Yinmar) in Myanmar, and compared their performance against AGB measured by the traditional methods. Specifically, a suite of forest sample plots was deployed in the two forest reserves, and forest attributes were measured to calculate the plot-level AGB based on allometric equations. The spectral bands, vegetation indices (VIs) and textures derived from processed S-2 data and topographic parameters from SRTM were utilized to statistically link with field-based AGB by implementing random forest (RF) and stochastic gradient boosting (SGB) algorithms. Following an evaluation of the algorithmic performances, RF-based Kriging (RFK) models were employed to determine the spatial distribution of AGB as an improvement of accuracy against RF models. The study’s results showed that textural measures produced from wavelet analysis (WA) and vegetation indices (VIs) from Sentinel-2 were the strongest predictors for evergreen forest reserve (Namhton) AGB prediction and spectral bands and vegetation indices (VIs) showed the highest sensitivity to the deciduous forest reserve (Yinmar) AGB prediction. The fitted models were RF-based ordinary Kriging (RFOK) for Namhton forest reserve and RF-based co-Kriging (RFCK) for Yinmar forest reserve because their respective R², whilst the RMSE values were validated as 0.47 and 24.91 AGB t/ha and 0.52 and 34.72 AGB t/ha, respectively.
The proposed random forest Kriging framework provides robust AGB maps, which are essential to estimate the carbon sequestration potential in the context of REDD+. From this particular study, we suggest that the protection/disturbance status of forests affects AGB values directly in the study area; thus, community-participated or engaged forest utilization and conservation initiatives are recommended to promote sustainable forest management.
Journal Article
Boosting for high-dimensional two-class prediction
2015
Background
In clinical research, prediction models are used to accurately predict the outcome of the patients based on some of their characteristics. For high-dimensional prediction models (the number of variables greatly exceeds the number of samples) the choice of an appropriate classifier is crucial as it was observed that no single classification algorithm performs optimally for all types of data. Boosting was proposed as a method that combines the classification results obtained using base classifiers, where the sample weights are sequentially adjusted based on the performance in previous iterations. Generally boosting outperforms any individual classifier, but studies with high-dimensional data showed that the most standard boosting algorithm, AdaBoost.M1, cannot significantly improve the performance of its base classifier. Recently other boosting algorithms were proposed (Gradient boosting, Stochastic Gradient boosting, LogitBoost); they were shown to perform better than AdaBoost.M1 but their performance was not evaluated for high-dimensional data.
Results
In this paper we use simulation studies and real gene-expression data sets to evaluate the performance of boosting algorithms when data are high-dimensional. Our results confirm that AdaBoost.M1 can perform poorly in this setting, often failing to improve the performance of its base classifier. We provide the explanation for this and propose a modification, AdaBoost.M1.ICV, which uses cross-validated estimates of the prediction errors and outperforms the original algorithm when data are high-dimensional. The use of AdaBoost.M1.ICV is advisable when the base classifier overfits the training data: the number of variables is large, the number of samples is small, and/or the difference between the classes is large. To a lesser extent, Gradient boosting also suffers from similar problems. Contrary to the findings for the low-dimensional data, shrinkage does not improve the performance of Gradient boosting when data are high-dimensional; however, it is beneficial for Stochastic Gradient boosting, which outperformed the other boosting algorithms in our analyses. LogitBoost suffers from overfitting and generally performs poorly.
Conclusions
The results show that boosting can substantially improve the performance of its base classifier also when data are high-dimensional. However, not all boosting algorithms perform equally well. LogitBoost, AdaBoost.M1 and Gradient boosting seem less useful for this type of data. Overall, Stochastic Gradient boosting with shrinkage and AdaBoost.M1.ICV seem to be the preferable choices for high-dimensional class-prediction.
Journal Article
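The two ingredients the record above singles out for Stochastic Gradient boosting, random subsampling at each iteration and shrinkage, can be sketched as follows. This is a toy illustration under stated assumptions and not the paper's method: it uses squared-error regression on a single feature with one-split stumps (the paper studies two-class prediction), and every function name and parameter value is hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Fit a one-split regression stump by minimising squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all sampled x identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_sgb(xs, ys, n_rounds=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Boost stumps on residuals, each fitted to a random subsample."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)            # initial constant prediction
    preds = [base] * len(ys)
    stumps = []
    n_sub = max(2, int(subsample * len(xs)))
    for _ in range(n_rounds):
        # "Stochastic": draw a subsample without replacement each round.
        idx = rng.sample(range(len(xs)), n_sub)
        sub_x = [xs[i] for i in idx]
        # For squared error, the negative gradient is the residual.
        sub_r = [ys[i] - preds[i] for i in idx]
        stump = fit_stump(sub_x, sub_r)
        stumps.append(stump)
        # "Shrinkage": add only a fraction of each stump's correction.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_sgb(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict_sgb(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    xs = [i / 10 for i in range(40)]
    ys = [x * x for x in xs]            # hypothetical target
    model = fit_sgb(xs, ys)
    print(mean_squared_error(model, xs, ys))
```

Setting `subsample=1.0` recovers plain gradient boosting in this sketch, and `learning_rate=1.0` switches shrinkage off, which makes it easy to compare the variants the abstract discusses.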
Early prediction of heart disease with data analysis using supervised learning with stochastic gradient boosting
by Aishwarya, Samudrala; Swetcha, Pandla; Anjani, Pendem
in Accuracy, Algorithms, Cardiovascular disease
2023
Heart diseases are consistently ranked among the top causes of mortality on a global scale. Early detection and accurate heart disease prediction can help effectively manage and prevent the disease. However, the traditional methods have failed to improve heart disease classification performance. So, this article proposes a machine learning approach for heart disease prediction (HDP) using a decision tree-based random forest (DTRF) classifier with loss optimization. Initially, a dataset of patient records with known labels for the presence or absence of heart disease is preprocessed. Then, a DTRF classifier is trained on the dataset using the stochastic gradient boosting (SGB) loss optimization technique, and its performance is evaluated using a separate test dataset. The results demonstrate that the proposed HDP-DTRF approach achieved 86% precision, 86% recall, an 85% F1-score, and 96% accuracy on publicly available real-world datasets, which are higher than traditional methods.
Journal Article
R-GCN: a residual-gated recurrent unit convolution network model for anomaly detection in blockchain transactions
by Sandhya, S. G.; Hu, Yu-Chen; Kumar, T. Ananth
in Anomalies, Artificial intelligence, Blockchain
2024
The domain of deep learning has provided an exemplary paradigm for how Artificial Intelligence (AI) can be a disruptive technological paragon through Blockchain Technology (BT). Data experts have recently strived to find the quality of a dataset high enough for machine learning by an AI entity to be effective and efficient. Blockchain technology has become a special, innovative, and fashionable technological development. It also guarantees that the data is reliable and valid through its consensus process. However, new protection creates problems like data anonymity and confidentiality. Deep Learning (DL)-based blockchain data security is needed to deal with the problems mentioned above. This paper proposes an integration of the DL and BT systems, which produces highly reliable performance in enhancing data durability and dissemination. Moreover, a new convolution model called Residual-Gated recurrent unit Convolution Network (R-GCN) is proposed to analyze transactions in a blockchain-based platform using the Stochastic Gradient Boosting (SGB) technique. The proposed framework is implemented in the Ethereum environment using Anaconda and Python packages. Also, an analogy of how these models can be applied in a range of smart technologies, such as the Unmanned Aerial Vehicle (UAV), Smart Grid, healthcare, and green infrastructure, is illustrated.
Journal Article
Woody cover in African savannas: the role of resources, fire and herbivory
by Ratnam, Jayashree; Hanan, Niall; Sankaran, Mahesh
in Animal and plant ecology; Animal, plant and microbial ecology; Biological and medical sciences
2008
To determine the functional relationships between, and the relative importance of, different driver variables (mean annual precipitation, soil properties, fire and herbivory) in regulating woody plant cover across broad environmental gradients in African savannas. Savanna grasslands of East, West and Southern Africa. The dependence of woody cover on mean annual precipitation (MAP), soil properties (texture, nitrogen mineralization potential and total phosphorus), fire regimes, and herbivory (grazer, browser + mixed feeder, and elephant biomass) was determined for 161 savanna sites across Africa using stochastic gradient boosting, a refinement of the regression tree analysis technique. All variables were significant predictors of woody cover, collectively explaining 71% of the variance in our data set. However, their relative importance as regulators of woody cover varied. MAP was the most important predictor, followed by fire return periods, soil characteristics and herbivory regimes. Woody cover showed a strong positive dependence on MAP between 200 and 700 mm, but no dependence on MAP above this threshold when the effects of other predictors were accounted for. Fires served to reduce woody cover below rainfall-determined levels. Woody cover showed a complex, non-linear relationship with total soil phosphorus, and was negatively correlated with clay content. There was a strong negative dependence of woody cover on soil nitrogen (N) availability, suggesting that increased N-deposition may cause shifts in savannas towards more grassy states. Elephants, mixed feeders and browsers had negative effects on woody cover. Grazers, on the other hand, depressed woody cover at low biomass, but favoured woody vegetation when their biomass exceeded a certain threshold. 
Our results indicate complex and contrasting relationships between woody cover, rainfall, soil properties and disturbance regimes in savannas, and suggest that future environmental changes such as altered precipitation regimes, N-enrichment and elevated levels of CO₂ are likely to have opposing, and potentially interacting, influences on the tree-grass balance in savannas.
Journal Article
Machine learning algorithms accurately identify free-living marine nematode species
by Brito de Jesus, Simone; Vieira, Danilo; Cunha, Beatriz P.
in Acantholaimus, Algorithms, Artificial intelligence
2023
Background
Identifying species, particularly small metazoans, remains a daunting challenge and the phylum Nematoda is no exception. Typically, nematode species are differentiated based on morphometry and the presence or absence of certain characters. However, recent advances in artificial intelligence, particularly machine learning (ML) algorithms, offer promising solutions for automating species identification, mostly in taxonomically complex groups. By training ML models with extensive datasets of accurately identified specimens, the models can learn to recognize patterns in nematodes’ morphological and morphometric features. This enables them to make precise identifications of newly encountered individuals. Implementing ML algorithms can improve the speed and accuracy of species identification and allow researchers to efficiently process vast amounts of data. Furthermore, it empowers non-taxonomists to make reliable identifications. The objective of this study is to evaluate the performance of ML algorithms in identifying species of free-living marine nematodes, focusing on two well-known genera: Acantholaimus Allgén, 1933 and Sabatieria Rouville, 1903.
Methods
A total of 40 species of Acantholaimus and 60 species of Sabatieria were considered. The measurements and identifications were obtained from the original publications of species for both genera; this compilation included information regarding the presence or absence of specific characters, as well as morphometric data. To assess the performance of the species identification, four ML algorithms were employed: Random Forest (RF), Stochastic Gradient Boosting (SGBoost), Support Vector Machine (SVM) with both linear and radial kernels, and K-nearest neighbor (KNN) algorithms.
Results
For both genera, the random forest (RF) algorithm demonstrated the highest accuracy in correctly classifying specimens into their respective species, achieving an accuracy rate of 93% for Acantholaimus and 100% for Sabatieria; only a single individual from Acantholaimus of the test data was misclassified.
Conclusion
These results highlight the overall effectiveness of ML algorithms in species identification. Moreover, they demonstrate that the identification of marine nematodes can be automated, optimizing biodiversity and ecological studies, as well as making species identification more accessible, efficient, and scalable. Ultimately it will contribute to our understanding and conservation of biodiversity.
Journal Article
Using machine learning involving diagnoses and medications as a risk prediction tool for post-acute sequelae of COVID-19 (PASC) in primary care
2025
Background
The aim of our study was to determine whether the application of machine learning could predict PASC by using diagnoses from primary care and prescribed medication 1 year prior to PASC diagnosis.
Methods
This population-based case–control study included subjects aged 18–65 years from Sweden. Stochastic gradient boosting was used to develop a predictive model using diagnoses received in primary care, hospitalization due to acute COVID-19, and prescribed medication. Variables with a normalized relative influence (NRI) ≥ 1% were considered predictive. Odds ratios of marginal effects (OR_ME) were calculated.
Results
The study included 47,568 PASC cases and controls. More females (n = 5113) than males (n = 2815) were diagnosed with PASC. Key predictive factors identified in both sexes included prior hospitalization due to acute COVID-19 (NRI 16.1%, OR_ME 18.8 for females; NRI 41.7%, OR_ME 31.6 for males), malaise and fatigue (NRI 14.5%, OR_ME 4.6 for females; NRI 11.5%, OR_ME 7.9 for males), and post-viral and related fatigue syndromes (NRI 10.1%, OR_ME 21.1 for females; NRI 6.4%, OR_ME 28.4 for males).
Conclusions
Machine learning can predict PASC based on previous diagnoses and medications. Use of this AI method could support diagnostics of PASC in primary care and provide insight into PASC etiology.
Journal Article