Catalogue Search | MBRL

Mapping the Growing Stem Volume of the Coniferous Plantations in North China Using Multispectral Data from Integrated GF-2 and Sentinel-2 Images and an Optimized Feature Variable Selection Method

by Li, Xinyu , Long, Jiangping , Xu, Xiaodong in Accuracy , adaptive feature variable selection , Algorithms

2021

Accurate measurement of forest growing stem volume (GSV) is important for forest resource management and ecosystem dynamics monitoring. Optical remote sensing imagery has great application prospects in forest GSV estimation on regional and global scales because it is easily accessible, offers wide coverage, and relies on mature technology. However, its application is limited by cloud coverage, data stripes, atmospheric effects, and satellite sensor errors. Combining multi-sensor data can reduce such limitations by increasing data availability, but it also introduces a high-dimensionality problem that makes feature selection more difficult. In this study, GaoFen-2 (GF-2) and Sentinel-2 images were integrated, and feature variables and data scenarios were derived by a proposed adaptive feature variable combination optimization (AFCO) program for estimating the GSV of coniferous plantations. The AFCO algorithm was compared to four traditional feature variable selection methods, namely, random forest (RF), stepwise random forest (SRF), fast iterative feature selection method for k-nearest neighbors (KNN-FIFS), and the feature variable screening and combination optimization procedure based on the distance correlation coefficient and k-nearest neighbors (DC-FSCK). The comparison indicated that the AFCO program not only considered the combination effect of feature variables, but also optimized the selection of the first feature variable, error threshold, and selection of the estimation model. Furthermore, we selected feature variables from three datasets (GF-2, Sentinel-2, and the integrated data) following the AFCO and four other feature selection methods and used the k-nearest neighbors (KNN) and random forest regression (RFR) to estimate the GSV of coniferous plantations in northern China.
The results indicated that the integrated data improved the GSV estimation accuracy of coniferous plantations, with relative root mean square errors (RMSErs) of 15.0% and 19.6%, which were lower than those of GF-2 and Sentinel-2 data, respectively. In particular, the texture feature variables derived from GF-2 red band image have a significant impact on GSV estimation performance of the integrated dataset. For most data scenarios, the AFCO algorithm gained more accurate GSV estimates, as the RMSErs were 30.0%, 23.7%, 17.7%, and 17.5% lower than those of RF, SRF, KNN-FIFS, and DC-FSCK, respectively. The GSV distribution map obtained by the AFCO method and RFR model matched the field observations well. This study provides some insight into the application of optical images, optimization of the feature variable combination, and modeling algorithm selection for estimating the GSV of coniferous plantations.
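The RMSEr values quoted above are not defined in this abstract. A minimal sketch, assuming the common definition (RMSE divided by the mean of the field observations, expressed as a percentage); the function names and the sample numbers are illustrative and do not come from the study:

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(observed, predicted):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


# Made-up plot-level GSV values (m^3/ha), for illustration only.
field_gsv = [100.0, 200.0, 300.0]
mapped_gsv = [110.0, 190.0, 310.0]
print(f"RMSEr = {relative_rmse(field_gsv, mapped_gsv):.1f}%")
```

Normalizing by the mean observation makes errors comparable across datasets whose GSV ranges differ, which is presumably why the abstract reports relative rather than absolute errors.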

Journal Article

Evaluating Calibration and Spectral Variable Selection Methods for Predicting Three Soil Nutrients Using Vis-NIR Spectroscopy

by Li, Ting , Gao, Han , Huang, Yanru in Accuracy , Adaptive sampling , Agricultural land

2021

Soil nutrients, including soil available potassium (SAK), soil available phosphorous (SAP), and soil organic matter (SOM), play an important role in farmland soil productivity, food security, and agricultural management. Spectroscopic analysis has proven to be a rapid, nondestructive, and effective technique for predicting soil properties in general and potassium, phosphorous, and organic matter in particular. However, the successful estimation of soil nutrient content by visible and near-infrared (Vis-NIR) reflectance spectroscopy depends on proper calibration methods (including preprocessing transformation methods and multivariate methods for regression analysis) and on appropriate variable selection techniques. In this study, the raw spectrum and 13 preprocessing transformations combined with 2 variable selection methods (competitive adaptive reweighted sampling (CARS) and the successive projections algorithm (SPA)) and 2 regression algorithms (support vector machine (SVM) and partial least squares regression (PLSR)), for a total of 56 calibration methods, were investigated for modeling and predicting the above three soil nutrients using hyperspectral Vis-NIR data (400–2450 nm). The results show that first-order derivatives based on logarithmic and inverse transformations (FD-LGRs) can provide better predictions of soil available potassium and phosphorous, and the best form of soil organic matter transformation is SG+MSC. CARS was superior to the SPA in selecting effective variables, and the PLSR model outperformed the SVM models. The best estimation accuracies (R², RMSE) for soil available potassium, phosphorous, and organic matter were 0.7532, 32.3090 mg/kg; 0.7440, 6.6910 mg/kg; and 0.9009, 3.2103 g/kg, respectively, and their corresponding calibration methods were (FD-LGR)/SPA/PLSR, (FD-LGR)/SPA/PLSR, and SG+MSC/CARS/SVM, respectively.
Overall, for the prediction of the soil nutrient content, organic matter was superior to available phosphorous, followed by available potassium. It was concluded that the application of hyperspectral images (Vis-NIR data) was an efficient method for mapping and monitoring soil nutrients at the regional scale, thus contributing to the development of precision agriculture.

Journal Article

Towards Optimal Variable Selection Methods for Soil Property Prediction Using a Regional Soil Vis-NIR Spectral Library

by Chen, Songchao , Zhang, Xianglin , Shi, Zhou in Accuracy , Adaptive algorithms , Adaptive sampling

2023

Soil visible and near-infrared (Vis-NIR, 350–2500 nm) spectroscopy has been proven as an alternative to conventional laboratory analysis because it is rapid, cost-effective, non-destructive, and environmentally friendly. Different variable selection methods have been used to deal with the high redundancy, heavy computation, and model complexity of using full spectra in spectral modelling. However, most previous studies used a linear algorithm in the variable selection, and the application of a non-linear algorithm remains poorly explored. To address the current knowledge gap, based on a regional soil Vis-NIR spectral library (1430 soil samples), we evaluated seven variable selection algorithms together with three predictive algorithms in predicting seven soil properties. Our results showed that Cubist outperformed partial least squares regression (PLSR) and random forests (RF) for most soil properties (R² > 0.75 for soil organic matter, total nitrogen and pH) when using the full spectra. Most variable selection algorithms greatly reduced the number of spectral bands and therefore simplified the predictive models without losing accuracy. The results also showed that there was no silver bullet for the optimal variable selection algorithm among different predictive algorithms: (1) competitive adaptive reweighted sampling (CARS) always performed best for the PLSR algorithm, followed by forward recursive feature selection (FRFS); (2) recursive feature elimination (RFE) and genetic algorithm (GA) generally had better accuracy than others for the Cubist algorithm; and (3) FRFS had the best model performance for the RF algorithm. In addition, the performance was generally better when the algorithm used in the variable selection matched the predictive algorithm. The outcome of this study provides a valuable reference for predicting soil information using spectroscopic techniques together with variable selection algorithms.

Journal Article

Non-destructive prediction and visualization of anthocyanin content in mulberry fruits using hyperspectral imaging

by Li, Xunlan , Han, Guohui , Liu, Jianfei in Adaptive algorithms , Adaptive sampling , Algorithms

2023

Being rich in anthocyanin is one of the most important physiological traits of mulberry fruits. Efficient and non-destructive detection of anthocyanin content and distribution in the fruits is important for their breeding, cultivation, harvesting, and sale. This study aims at building a fast, non-destructive, and high-precision method for detecting and visualizing the anthocyanin content of mulberry fruit by using hyperspectral imaging. Visible near-infrared hyperspectral images of the fruits of two varieties at three maturity stages are collected. The successive projections algorithm (SPA), competitive adaptive reweighted sampling (CARS), and a stacked auto-encoder (SAE) are used to reduce the dimension of the high-dimensional hyperspectral data. The least squares-support vector machine and extreme learning machine (ELM) are used to build models for predicting the anthocyanin content of mulberry fruit, and a genetic algorithm (GA) is used to optimize the major parameters of the models. The results show that the higher the anthocyanin content is, the lower the spectral reflectance is. CARS, SPA, and SAE extract 15, 7, and 13 characteristic variables, respectively. The model based on SAE-GA-ELM achieved the best performance, with an R² of 0.97 and an RMSE of 0.22 mg/g in both the training set and the testing set, and it is applied to retrieve the distribution of anthocyanin content in mulberry fruits. By applying the SAE-GA-ELM model to each pixel of the mulberry fruit images, distribution maps are created to visualize the changes in anthocyanin content of mulberry fruits at three maturity stages. The overall results indicate that hyperspectral imaging, in combination with SAE-GA-ELM, can help achieve rapid, non-destructive and high-precision detection and visualization of anthocyanin content in mulberry fruits.

Journal Article

Bayesian Neural Networks for Selection of Drug Sensitive Genes

by Li, Qizhai , Zhou, Lei , Liang, Faming in Adaptive algorithms , Algorithms , antineoplastic agents

2018

Recent advances in high-throughput biotechnologies have provided an unprecedented opportunity for biomarker discovery, which, from a statistical point of view, can be cast as a variable selection problem. This problem is challenging due to the high-dimensional and nonlinear nature of omics data and, in general, it suffers from three difficulties: (i) an unknown functional form of the nonlinear system, (ii) variable selection consistency, and (iii) highly demanding computation. To circumvent the first difficulty, we employ a feed-forward neural network to approximate the unknown nonlinear function, motivated by its universal approximation ability. To circumvent the second difficulty, we conduct structure selection for the neural network, which induces variable selection, by choosing appropriate prior distributions that lead to the consistency of variable selection. To circumvent the third difficulty, we implement the population stochastic approximation Monte Carlo algorithm, a parallel adaptive Markov Chain Monte Carlo algorithm, on the OpenMP platform, which provides a linear speedup for the simulation with the number of cores of the computer. The numerical results indicate that the proposed method can work very well for identification of relevant variables for high-dimensional nonlinear systems. The proposed method is successfully applied to identification of the genes that are associated with anticancer drug sensitivities based on the data collected in the cancer cell line encyclopedia study. Supplementary materials for this article are available online.

Journal Article

Estimation of Organic Carbon in Anthropogenic Soil by VIS-NIR Spectroscopy: Effect of Variable Selection

by Hong, Yongsheng , Fei, Teng , Xu, Lu in Accuracy , Adaptive algorithms , Adaptive sampling

2020

Visible and near-infrared reflectance (VIS-NIR) spectroscopy is widely applied to estimate soil organic carbon (SOC). Intense and diverse human activities increase the heterogeneity in the relationships between SOC and VIS-NIR spectra in anthropogenic soil, which results in poor performance of SOC estimation models. To improve model accuracy and parsimony, we investigated the performance of two variable selection algorithms, namely competitive adaptive reweighted sampling (CARS) and random frog (RF), coupled with five spectral pretreatments. A total of 108 samples were collected from Jianghan Plain, China, with the SOC content and VIS-NIR spectra measured in the laboratory. Results showed that both CARS and RF coupled with partial least squares regression (PLSR) outperformed PLSR alone, with higher model accuracy and fewer spectral variables. This revealed that spectral variable selection could identify important spectral variables that account for the relationships between SOC and VIS-NIR spectra, thereby improving the accuracy and parsimony of PLSR models in anthropogenic soil. Our findings are of significant practical value to the SOC estimation in anthropogenic soil by VIS-NIR spectroscopy.

Journal Article

Adaptive deep SVM for detecting early heart disease among cardiac patients

by Netra, S. N. , Srinidhi, N. N. , Naresh, E. in 631/114/1305 , 631/114/2413 , 692/700/228

2025

Heart attack is one of the most common heart diseases and causes many deaths worldwide. Early detection and continuous monitoring are essential in reducing the death rate caused by heart diseases. Machine learning offers a promising solution for early and accurate heart disease detection by analyzing the data from healthcare devices. Although existing studies have employed various machine learning techniques to detect heart disease, most of the techniques still face challenges in handling large healthcare datasets, which affects the prediction outcomes. To solve this issue, the research work focuses on developing a novel framework for detecting heart disease in its early stages by using machine learning techniques. In the initial phase, the significant data required for the validation is collected from benchmark resources, and it is subjected to the weighted optimal features selection phase. Here, from the input data, the features are selected optimally and their weights are tuned using Enhanced Arbitrary Variable-based Ship Rescue Optimization (EAVSRO). Further, the optimally selected weighted features are fed into the detection phase. In this phase, an Adaptive Deep Support Vector Machine (AD-SVM) is employed to detect heart diseases. Once heart disease is detected, the Atrial Fibrillation (AF) rate is determined using the Adaptive Multiscale Convolution Capsule Network (AMCCNet). Finally, the AF rate is obtained from the developed AMCCNet, and its parameters are tuned using the same EAVSRO. Various experiments then compare the proposed heart disease detection model with existing models to verify its effectiveness. The accuracy of the designed framework is 96.07%, which is higher than that of existing frameworks such as CNN-LSTM, DCNN, AdaBoost, and SVM. Thus, the results show that the developed model can effectively detect heart disease at the early stages and identify the AF rate, enabling timely treatment.

Journal Article

An Improved Robust Thermal Error Prediction Approach for CNC Machine Tools

by Ye, Honghan , Miao, Enming , Zhuang, Xindong in Accuracy , Adaptive algorithms , adaptive LASSO

2022

Thermal errors significantly affect the accurate performance of computer numerical control (CNC) machine tools. In this paper, an improved robust thermal error prediction approach is proposed for CNC machine tools based on the adaptive Least Absolute Shrinkage and Selection Operator (LASSO) and eXtreme Gradient Boosting (XGBoost) algorithms. Specifically, the adaptive LASSO method enjoys the oracle property of selecting temperature-sensitive variables. After the temperature-sensitive variable selection, the XGBoost algorithm is further adopted to model and predict thermal errors. Since the XGBoost algorithm is decision tree based, it has natural advantages to address the multicollinearity and provide interpretable results. Furthermore, based on the experimental data from the Vcenter-55 type 3-axis vertical machining center, the proposed algorithm is compared with benchmark methods to demonstrate its superior performance on prediction accuracy with 7.05 μm (over 14.5% improvement), robustness with 5.61 μm (over 12.9% improvement), worst-case scenario predictions with 16.49 μm (over 25.0% improvement), and percentage errors with 13.33% (over 10.7% improvement). Finally, the real-world applicability of the proposed model is verified through thermal error compensation experiments.

Journal Article

An Enhanced Multivariate EWMA Approach with Variable Selection and Adaptive Sampling for Efficient Process Monitoring

by Tang, Anan , Xu, Juncheng , Ma, Yuanman in Adaptive sampling , Automotive bodies , Control charts

2026

Due to the curse of dimensionality in modern industrial processes, high-dimensional Statistical Process Control (SPC) faces significant challenges in detecting small and sparse process shifts. Traditional multivariate control charts often suffer from noise accumulation and fail at timely identification of anomalies that affect only a small subset of variables. To address this issue, this study proposes an enhanced Multivariate Exponentially Weighted Moving Average (MEWMA) approach with variable selection and adaptive sampling for efficient process monitoring. The proposed approach works in two ways: first, it automatically focuses on the variables that are most likely to have changed (variable selection); second, it takes samples more frequently when things look uncertain, and less frequently when everything appears stable (variable sampling interval). This combination allows problems to be detected earlier. A Monte Carlo approach is used to calculate the Average Time to Signal (ATS) values of the proposed scheme, and comparative results show that the proposed scheme outperforms standard charts like the Fixed Sampling Intervals (FSI) VSME, VSI-T2, and VSI-MEWMA schemes in terms of detection speed for small-to-moderate sparse shifts. Finally, a real example from car body manufacturing is provided as an illustration for the implementation of the proposed scheme.
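As a rough illustration of the variable sampling interval idea described above, the sketch below runs a plain MEWMA recursion and picks the waiting time until the next sample from the current chart statistic. It assumes standardized, independent variables (identity covariance) and the asymptotic MEWMA covariance. The smoothing constant, the warning and control limits, and the two interval lengths are hypothetical values, and the variable selection step of the proposed scheme is not implemented here.

```python
def mewma_vsi(samples, lam=0.2, warning_limit=6.0, control_limit=12.0,
              short_interval=0.5, long_interval=2.0):
    """Plain MEWMA chart with a two-level variable sampling interval.

    samples: list of equally long lists of standardized measurements.
    Returns a list of (sampling_time, statistic, action) tuples, where
    action is "long", "short", or "signal".
    """
    z = [0.0] * len(samples[0])      # EWMA vector, started at the target
    scale = (2.0 - lam) / lam        # inverse asymptotic variance of each z_i
    clock = 0.0
    log = []
    for x in samples:
        # EWMA update: z_t = lam * x_t + (1 - lam) * z_{t-1}
        z = [lam * xi + (1.0 - lam) * zi for xi, zi in zip(x, z)]
        # T^2 statistic under the identity-covariance assumption
        stat = scale * sum(zi * zi for zi in z)
        if stat > control_limit:
            log.append((clock, stat, "signal"))
            break
        # Sample sooner when the statistic enters the warning region
        if stat > warning_limit:
            log.append((clock, stat, "short"))
            clock += short_interval
        else:
            log.append((clock, stat, "long"))
            clock += long_interval
    return log


# A sustained shift of 3 standard deviations in the first of two variables.
for time, stat, action in mewma_vsi([[3.0, 0.0]] * 5):
    print(f"t={time:.1f}  T2={stat:.2f}  {action}")
```

With these illustrative settings the chart waits the long interval after the first sample, switches to the short interval once the statistic enters the warning region, and signals at the third sample.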

Journal Article

Grafted Composite Decision Tree: Adaptive Online Fault Diagnosis with Automated Robot Measurements

by Kim, Sungmin , Do, Youndo , Zhang, Fan in adaptive fault diagnosis , Algorithms , Analysis

2025

In many industrial facilities, online monitoring systems have improved the reliability of key equipment, reducing the cost of operation and maintenance over recent decades. However, it often requires additional on-site inspection of target facilities due to limited information from installed sensors. To systematically automate such processes, an adaptive online fault diagnosis framework is required, which consecutively selects variables to measure and updates its inference with additional information at each measurement step. In this paper, adaptive online fault detection models—grafted composite decision trees—are proposed for such a framework. While conventional decision trees themselves can serve two required objectives of the framework, information from monitored variables can be less utilized because decision trees do not consider if required input variables are always monitored when the models are trained. On the other hand, the proposed grafted composite decision tree models are designed to fully utilize both monitored and robot-measured variables at any stage in a given measurement sequence by grafting two types of trees together: a prior-tree trained only with observed variables and sub-trees trained with robot-measurable variables. The proposed method was validated on a cooling water system in a nuclear power plant with multiple leak scenarios, in which improved measurement selection and increase in inference confidence in each measurement step are demonstrated. The performance comparison between the proposed models and the conventional decision tree model clearly illustrates how the acquired information is fully utilized for the best inference while providing the best choice of the next variable to measure, maximizing information gain at the same time.
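The abstract refers to choosing the next variable to measure so that information gain is maximized. The sketch below shows only the generic entropy-based information gain criterion used by ordinary decision trees; it is not the grafted composite decision tree itself, and the variable names, categories, and fault labels are invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(rows, labels, variable):
    """Reduction in label entropy after splitting on one categorical variable."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[variable], []).append(label)
    remainder = sum(len(group) / len(labels) * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder


def next_variable(rows, labels, candidates):
    """Pick the candidate measurement with the largest information gain."""
    return max(candidates, key=lambda name: information_gain(rows, labels, name))


# Hypothetical robot-measurable variables and fault labels.
rows = [
    {"flow": "low", "temp": "high"},
    {"flow": "low", "temp": "low"},
    {"flow": "normal", "temp": "high"},
    {"flow": "normal", "temp": "low"},
]
labels = ["leak", "leak", "ok", "ok"]
print(next_variable(rows, labels, ["flow", "temp"]))
```

In this toy data the flow reading separates the two classes completely while the temperature reading carries no information, so the flow measurement would be requested first.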

Journal Article
