Catalogue Search | MBRL
1,154 result(s) for "stacking algorithm"
Mapping the Forest Canopy Height in Northern China by Synergizing ICESat-2 with Sentinel-2 Using a Stacking Algorithm
2021
The forest canopy height (FCH) plays a critical role in forest quality evaluation and resource management. The accurate and rapid estimation and mapping of the regional forest canopy height is crucial for understanding vegetation growth processes and the internal structure of the ecosystem. A stacking algorithm consisting of multiple linear regression (MLR), support vector machine (SVM), k-nearest neighbor (kNN), and random forest (RF) was used in this paper and demonstrated optimal performance in predicting the forest canopy height by synergizing Sentinel-2 images acquired from the cloud-based computation platform Google Earth Engine (GEE) with data from ICESat-2 (Ice, Cloud, and Land Elevation Satellite-2). This research was conducted to achieve continuous mapping of the canopy height of plantations in Saihanba Mechanical Forest Plantation, which is located in Chengde City, northern Hebei province, China. The results show that stacking achieved the best prediction accuracy for the forest canopy height, with an R2 of 0.77 and a root mean square error (RMSE) of 1.96 m. Compared with MLR, SVM, kNN, and RF, the RMSE obtained by stacking was reduced by 25.2%, 24.9%, 22.8%, and 18.7%, respectively. Since Sentinel-2 images and ICESat-2 data are publicly available, this opens the door for the accurate mapping of the continuous distribution of the forest canopy height globally in the future.
Journal Article
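The stacking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single input feature, uses only two stand-in base learners (ordinary least squares and kNN) instead of the four named above, and replaces the meta-learner with a simple blend weight searched on out-of-fold predictions. All function names and the synthetic data are hypothetical.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares for one feature; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regressor for one feature; returns a predict function."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def out_of_fold(fit, xs, ys, folds=5):
    """Predict every sample with a model that never saw it during fitting."""
    preds = [0.0] * len(xs)
    for f in range(folds):
        train = [i for i in range(len(xs)) if i % folds != f]
        test = [i for i in range(len(xs)) if i % folds == f]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = model(xs[i])
    return preds

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def fit_stack(xs, ys, base_fits, steps=100):
    """Blend exactly two base learners; the weight is chosen on out-of-fold predictions."""
    p1, p2 = (out_of_fold(fit, xs, ys) for fit in base_fits)
    weights = [s / steps for s in range(steps + 1)]
    best_w = min(weights, key=lambda w: rmse(ys, [w * a + (1 - w) * b for a, b in zip(p1, p2)]))
    m1, m2 = (fit(xs, ys) for fit in base_fits)  # refit both base learners on all data
    return (lambda x: best_w * m1(x) + (1 - best_w) * m2(x)), best_w

# Synthetic data, purely illustrative (not canopy heights).
xs = [i / 10 for i in range(40)]
ys = [math.sin(x) + 0.5 * x for x in xs]
stacked, weight = fit_stack(xs, ys, [fit_linear, fit_knn])
print(weight, stacked(2.05))
```

Because the weights 0 and 1 are part of the search grid, the blend can never score worse than either base learner on the out-of-fold predictions, which is the basic motivation for stacking.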
A novel method for determining postmortem interval based on the metabolomics of multiple organs combined with ensemble learning techniques
2023
Determining postmortem interval (PMI) is one of the most challenging and essential endeavors in forensic science. Developments in PMI estimation can take advantage of machine learning techniques. Currently, applying an algorithm to obtain information on multiple organs and conducting joint analysis to accurately estimate PMI are still in the early stages. This study aimed to establish a multi-organ stacking model that estimates PMI by analyzing differential compounds of four organs in rats. In a total of 140 rats, skeletal muscle, liver, lung, and kidney tissue samples were collected at each time point after death. Ultra-performance liquid chromatography coupled with high-resolution mass spectrometry was used to determine the compound profiles of the samples. The original data were preprocessed using multivariate statistical analysis to determine discriminant compounds. In addition, three interrelated and increasingly complex patterns (single organ optimal model, single organ stacking model, multi-organ stacking model) were established to estimate PMI. The accuracy and generalized area under the receiver operating characteristic curve of the multi-organ stacking model were the highest at 93% and 0.96, respectively. Only 1 of the 14 external validation samples was misclassified by the multi-organ stacking model. The results demonstrate that the application of the multi-organ combination to the stacking algorithm is a potential forensic tool for the accurate estimation of PMI.
Journal Article
Exploring the Microteaching model of English courses in the digital era
2024
Applying microteaching to English teaching can not only create a good atmosphere for learning English but also deepen reform of the teaching mode and promote the further development of English teaching. In this paper, the output-oriented model is combined with microclasses to establish a blended teaching model for English. Students’ online learning behavior is grouped with the K-means clustering algorithm, and the GBDT algorithm is combined with the ensemble learning Stacking algorithm to predict their performance. Taking the online learning behavior data under the blended teaching mode as an example, a practical analysis was carried out in three aspects: the classification of students’ English online learning behavior, the prediction of academic performance, and satisfaction with the blended teaching mode. The results show that the number of clusters of English online learning behavior is 5, and the comprehensive score of cluster 1 reaches 91.52. As the blended English course progressed from 10% to 100%, the accuracy of students’ learning performance prediction increased by 20.97 percentage points. Student satisfaction with the “output-oriented method + micro-teaching” blended teaching mode reached 86.34% at grade B and above. The blended teaching mode of combining microteaching and English courses in the digital era can improve students’ independent learning ability and enhance their English application ability.
Journal Article
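The K-means grouping step mentioned in this abstract can be sketched as follows. This is a plain Lloyd's-algorithm illustration, not the paper's code: the two behaviour columns, the data values, and the choice of the first k points as initial centres are all assumptions made for the example.

```python
def kmeans(points, k, iterations=50):
    """Lloyd's algorithm on tuples of floats; initial centres are the first k points."""
    centres = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centre (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centres.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(centres[c])  # keep an empty cluster's centre unchanged
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres

# Hypothetical rows: (hours of video watched, exercises completed) per student.
behaviour = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.1), (0.9, 1.1), (9.2, 8.9)]
labels, centres = kmeans(behaviour, k=2)
print(labels, centres)
```

The resulting cluster labels could then serve as an extra input to a performance predictor, although the abstract does not say how the clustering and prediction stages are connected.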
Soil Moisture Inversion Using Multi-Sensor Remote Sensing Data Based on Feature Selection Method and Adaptive Stacking Algorithm
2025
Soil moisture (SM) profoundly influences crop growth, yield, soil temperature regulation, and ecological balance maintenance and plays a pivotal role in water resources management and regulation. The focal objective of this investigation is to identify feature parameters closely associated with soil moisture through the implementation of feature selection methods on multi-source remote sensing data. Specifically, three feature selection methods, namely SHApley Additive exPlanations (SHAP), information gain (Info_gain), and Info_gain ∩ SHAP, were validated in this study. The multi-source remote sensing data collected from Sentinel-1, Landsat-8, and Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTGTM DEM) enabled the derivation of 25 characteristic parameters through sound computational approaches. Subsequently, a stacking algorithm integrating multiple machine-learning (ML) algorithms based on adaptive learning was engineered to accomplish soil moisture prediction. The attained prediction outcomes were then compared against those of single models, including Random Forest (RF), Adaptive Boosting (AdaBoost), Gradient Boosting Decision Tree (GBDT), Light Gradient Boosting Machine (LightGBM), Extreme Gradient Boosting (XGBoost), and Categorical Boosting (CatBoost). Notably, the adoption of feature factors selected by the Info_gain algorithm in combination with the adaptive stacking (Ada-Stacking) algorithm yielded the best soil moisture prediction results. Specifically, the Mean Absolute Error (MAE) was determined to be 1.86 Vol. %, the Root Mean Square Error (RMSE) amounted to 2.68 Vol. %, and the R-squared (R2) reached 0.95.
The multifactor integrated model that harnessed optical remote sensing data, radar backscatter coefficients, and topographic data exhibited remarkable accuracy in soil surface moisture retrieval, thus providing valuable insights for soil moisture inversion studies in the designated study area. Furthermore, the Ada-Stacking algorithm demonstrated its potency in integrating multiple models, thereby elevating retrieval accuracy and overcoming the limitations inherent in a single ML model.
Journal Article
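The "Info_gain ∩ SHAP" selection named in this abstract can be read as the intersection of two shortlists. The sketch below assumes that reading. It computes information gain for discretised features against a discretised target (soil moisture is continuous, so binning is an assumption here) and intersects the top-ranked features with a shortlist taken as given from a separate SHAP analysis. Feature names and values are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical binned data, for illustration only.
moisture_class = ["dry", "dry", "wet", "wet", "dry", "wet"]
features = {
    "backscatter_bin": ["low", "low", "high", "high", "low", "high"],
    "slope_bin": ["flat", "steep", "flat", "steep", "flat", "steep"],
    "ndvi_bin": ["low", "low", "high", "high", "high", "high"],
}
gains = {name: information_gain(col, moisture_class) for name, col in features.items()}
top_by_gain = set(sorted(gains, key=gains.get, reverse=True)[:2])
top_by_shap = {"backscatter_bin", "slope_bin"}  # assumed output of a separate SHAP ranking
selected = top_by_gain & top_by_shap
print(gains, selected)
```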
A Multi-Clustering Algorithm to Solve Driving Cycle Prediction Problems Based on Unbalanced Data Sets: A Chinese Case Study
by Zhang, Wutong; Yang, Jie; Wu, Yuewei
in driving cycle, multi-clustering algorithm, stacking algorithm
2020
Vehicle evaluation parameters, which are increasingly of concern for governments and consumers, quantify performance indicators, such as vehicle performance, emissions, and driving experience, to help guide consumers in purchasing cars. While past approaches for driving cycle prediction have been proven effective and used in many countries, these algorithms are difficult to use in China with its complex traffic environment and increasingly high frequency of traffic jams. Meanwhile, we found that the vehicle dataset used by the driving cycle prediction problem is usually unbalanced in real cases, which means that there are more medium and high speed samples and very few samples at low and ultra-high speeds. If an ordinary clustering algorithm is directly applied to the unbalanced data, it will have a huge impact on the performance of building driving cycle maps, and the parameters of the map will deviate considerably from actual ones. In order to address these issues, this paper proposes a novel driving cycle map algorithm framework based on an ensemble learning method named multi-clustering algorithm, to improve the performance of traditional clustering algorithms on unbalanced data sets. It is noteworthy that our model framework can be easily extended to other complicated structure areas due to its flexible modular design and parameter configuration. Finally, we tested our method based on actual traffic data generated in Fujian Province in China. The results show that the multi-clustering algorithm has excellent performance on our dataset.
Journal Article
Optimization of product marketing and management path of cross-border e-commerce enterprises relying on big data technology
2024
Precision marketing is an intrinsic motivation for enterprises as a strategic consideration and guiding principle. In this paper, we first constructed a cross-border e-commerce precision marketing and management strategy based on big data technology. We mined user data and analyzed user behavior and characteristics. Moreover, the initial group division of users is carried out to realize the construction of user segmentation and user portrait, followed by the prediction of user purchasing behavior based on the Stacking algorithm and the use of a collaborative filtering algorithm to carry out accurate recommendations for different user groups. The effectiveness of the marketing management strategy is evaluated based on the level of customer value perception of Enterprise H. By using regression modeling, the impact of the marketing management method on customer loyalty and user purchase intention is investigated. The results show that the perceived level of each dimension of precision marketing is between (4,5.6), and the perceived risk is 3.657. The degree of explanation of the precision marketing model on the customer’s willingness to buy is 78.8%, and the t-value significance is 0.005, which reaches a significant level, indicating that the marketing management model is effective. The purpose of this study is to provide practical marketing management suggestions for enterprises that can obtain and maintain competitive advantages in fierce market competition, which will promote enterprise performance improvement and stable growth.
Journal Article
Optimization of used price assessment model for new energy vehicles using machine learning
2025
The rapid growth of the new energy vehicle industry has spurred a market for used new energy vehicles. The study introduces machine learning into the price evaluation of used new energy vehicles to reduce transaction costs and improve transaction efficiency. After collecting and pre-processing transaction data from the used new energy vehicle market, a valuation index system is established by selecting candidate valuation indicators. The mean absolute error (MAE), root mean square error (RMSE), R-squared value (R2) and mean absolute percentage error (MAPE) are used as the indicators to measure valuation performance. Other machine learning algorithms are fused into the Stacking algorithm to construct a used new energy vehicle valuation model based on Stacking fusion. Empirical analysis is conducted to test the effectiveness of the Stacking valuation model for price assessment. Among the first 100 samples valued, the maximum difference between the appraisal price of the Stacking fusion valuation model in this paper and the actual price of the samples is 45,800 yuan. The absolute percentage error of the traditional market approach for the appraised cases reaches 14.70%, which is much higher than that of the Stacking fusion valuation model. The goodness of fit of the Stacking fusion valuation model and the SVM model in this paper is 0.989 and 0.875, respectively. The valuation error of the traditional market comparison method fluctuates widely between −8~80,000 yuan, and the valuation error of the Stacking fusion valuation model in this paper is concentrated in the range of [−2~2]. The Stacking fusion valuation model in this paper has clear advantages in the valuation of used new energy vehicles.
Journal Article
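The four valuation metrics named in this abstract follow standard definitions, sketched below. The function name and the sample prices are hypothetical and are not taken from the study.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE, R2 and MAPE (in percent) for paired actual/predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot  # assumes the actual values are not all identical
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n  # assumes no zero actuals
    return {"MAE": mae, "RMSE": rmse, "R2": r2, "MAPE": mape}

# Made-up prices in units of 10,000 yuan, for illustration only.
actual_prices = [10.0, 20.0, 30.0, 40.0]
appraised_prices = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(actual_prices, appraised_prices)
print(metrics)
```

MAE and RMSE are in the same unit as the prices, while R2 and MAPE are unit-free, which is why abstracts like this one usually report them side by side.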
Estimating the Growing Stem Volume of Chinese Pine and Larch Plantations based on Fused Optical Data Using an Improved Variable Screening Method and Stacking Algorithm
2020
Accurately estimating growing stem volume (GSV) is very important for forest resource management. The GSV estimation is affected by remote sensing images, variable selection methods, and estimation algorithms. Optical images have been widely used for modeling key attributes of forest stands, including GSV and aboveground biomass (AGB), because of their easy availability, large coverage and related mature data processing and analysis technologies. However, the low data saturation level and the difficulty of selecting feature variables from optical images often impede the improvement of estimation accuracy. In this research, two GaoFen-2 (GF-2) images, a Landsat 8 image, and fused images created by integrating GF-2 bands with the Landsat multispectral image using the Gram–Schmidt method were first used to derive various feature variables and obtain various datasets or data scenarios. A DC-FSCK approach that integrates feature variable screening and a combination optimization procedure based on the distance correlation coefficient and k-nearest neighbors (kNN) algorithm was proposed and compared with the stepwise regression analysis (SRA) and random forest (RF) for feature variable selection. The DC-FSCK considers the self-correlation and combination effect among feature variables so that the selected variables can improve the accuracy and saturation level of GSV estimation. To validate the proposed approach, six estimation algorithms were examined and compared, including Multiple Linear Regression (MLR), kNN, Support Vector Regression (SVR), RF, eXtreme Gradient Boosting (XGBoost) and Stacking. The results showed that compared with GF-2 and Landsat 8 images, overall, the fused image (Red_Landsat) of GF-2 red band with Landsat 8 multispectral image improved the GSV estimation accuracy of Chinese pine and larch plantations. The Red_Landsat image also performed better than other fused images (Pan_Landsat, Blue_Landsat, Green_Landsat and Nir_Landsat). 
For most of the combinations of the datasets and estimation models, the proposed variable selection method DC-FSCK led to more accurate GSV estimates compared with SRA and RF. In addition, in most of the combinations obtained by the datasets and variable selection methods, the Stacking algorithm performed better than other estimation models. More importantly, the combination of the fused image Red_Landsat with the DC-FSCK and Stacking algorithm led to the best performance of GSV estimation with the greatest adjusted coefficients of determination, 0.8127 and 0.6047, and the smallest relative root mean square errors of 17.1% and 20.7% for Chinese pine and larch, respectively. This study provided new insights on how to choose suitable optical images, variable selection methods and optimal modeling algorithms for the GSV estimation of Chinese pine and larch plantations.
Journal Article
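The distance correlation coefficient that DC-FSCK builds on can be computed as in the sketch below, using the standard sample definition for two one-dimensional variables. The abstract does not detail the screening and combination procedure itself, so the ranking shown at the end is only an assumed illustration, and the variable names and values are hypothetical.

```python
import math

def _centred_distances(values):
    """Pairwise |v_i - v_j| matrix with row, column and grand means removed."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]  # equal to the column means by symmetry
    grand = sum(row_means) / n
    return [[d[i][j] - row_means[i] - row_means[j] + grand for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation in [0, 1]; returns 0.0 if either input is constant."""
    n = len(x)
    a, b = _centred_distances(x), _centred_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2, dvar_x, dvar_y = mean_product(a, b), mean_product(a, a), mean_product(b, b)
    if dvar_x <= 0 or dvar_y <= 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))

# Hypothetical feature variables and stem-volume values, for illustration only.
gsv = [110.0, 150.0, 90.0, 200.0, 170.0]
candidates = {
    "red_band": [0.21, 0.17, 0.24, 0.11, 0.15],
    "texture": [3.0, 1.0, 4.0, 1.5, 5.0],
}
ranking = sorted(candidates, key=lambda name: distance_correlation(candidates[name], gsv), reverse=True)
print(ranking)
```

Unlike Pearson correlation, distance correlation also responds to nonlinear dependence, which may be why it suits screening remote-sensing variables that saturate at high stem volumes.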
Financial distress prediction based on ensemble feature selection and improved stacking algorithm
2025
Purpose: While the Chinese securities market is booming, the phenomenon of listed companies falling into financial distress is also emerging, which affects the operation and development of enterprises and also jeopardizes the interests of investors. Therefore, it is important to understand how to accurately and reasonably predict the financial distress of enterprises. Design/methodology/approach: In the present study, ensemble feature selection (EFS) and improved stacking were used for financial distress prediction (FDP). Mutual information, analysis of variance (ANOVA), random forest (RF), genetic algorithms, and recursive feature elimination (RFE) were chosen for EFS to select features. Since there may be missing information when feeding the results of the base learner directly into the meta-learner, the features with high importance were fed into the meta-learner together. A screening layer was added to select the meta-learner with better performance. Finally, Optima hyperparameters were used for parameter tuning by the learners. Findings: An empirical study was conducted with a sample of A-share listed companies in China. The F1-score of the model constructed using the features screened by EFS reached 84.55%, representing an improvement of 4.37% compared to the original features. To verify the effectiveness of improved stacking, benchmark model comparison experiments were conducted. Compared to the original stacking model, the accuracy of the improved stacking model was improved by 0.44%, and the F1-score was improved by 0.51%. In addition, the improved stacking model had the highest area under the curve (AUC) value (0.905) among all the compared models. Originality/value: Compared to previous models, the proposed FDP model has better performance, thus bridging the research gap of feature selection. The present study provides new ideas for stacking improvement research and a reference for subsequent research in this field.
Journal Article
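The "improved stacking" step described in this abstract, where high-importance features are passed to the meta-learner alongside the base-learner outputs, might look like the sketch below. How importance is scored and how many features are kept are not stated in the abstract, so `importances` and `top_k` are assumptions, as are the function name and the sample values.

```python
def build_meta_features(base_predictions, features, importances, top_k=2):
    """Build meta-learner rows: base-learner outputs plus the top_k most important original features.

    base_predictions: one list of predictions per base learner (ideally out-of-fold).
    features: one row of original feature values per sample.
    importances: one score per original feature column.
    """
    ranked = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
    keep = sorted(ranked[:top_k])  # restore the original column order
    rows = []
    for i, row in enumerate(features):
        stacked = [preds[i] for preds in base_predictions]
        rows.append(stacked + [row[j] for j in keep])
    return rows, keep

# Two hypothetical base learners scoring two companies, with three original features each.
base_predictions = [[0.9, 0.2], [0.8, 0.4]]
features = [[1.0, 5.0, 3.0], [2.0, 6.0, 4.0]]
importances = [0.1, 0.7, 0.2]
meta_rows, kept_columns = build_meta_features(base_predictions, features, importances)
print(meta_rows, kept_columns)
```

In plain stacking the meta-learner would see only the first two values of each row; the appended columns are what lets it recover information the base learners may have discarded.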