Catalogue Search | MBRL

Prediction of air pollutant concentrations based on TCN-BiLSTM-DMAttention with STL decomposition

by Li, Wenlin , Jiang, Xuchu in 639/166 , 639/705 , 704/106

2023

A model with high accuracy and strong generalization performance is conducive to preventing serious pollution incidents and improving the decision-making ability of urban planning. This paper proposes a new neural network structure based on seasonal–trend decomposition using locally weighted scatterplot smoothing (Loess) (STL) and a dependency matrix attention mechanism (DMAttention) based on cosine similarity to predict the concentration of air pollutants. This method uses STL for series decomposition, a temporal convolutional network combined with a bidirectional long short-term memory network (TCN-BiLSTM) for feature learning on the decomposed series, and DMAttention to emphasize features at interdependent moments. In this paper, the long short-term memory network (LSTM) and the gated recurrent unit network (GRU) are set as the baseline models to design experiments. At the same time, to test the generalization performance of the model, short-term hourly forecasts were performed using PM2.5, PM10, SO2, NO2, CO, and O3 data. The experimental results show that the model proposed in this paper is superior to the comparison models in terms of root mean square error (RMSE) and mean absolute percentage error (MAPE). The MAPE values of the 6 kinds of pollutants are 6.800%, 10.492%, 9.900%, 6.299%, 4.178%, and 7.304%, respectively. Compared with the baseline LSTM and GRU models, the average reduction is 49.111% and 43.212%, respectively.
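The abstract above reports results as RMSE, MAPE, and a percentage reduction against baseline models. As a minimal sketch of how such figures are usually computed, the following uses the standard textbook definitions; the function names and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def relative_reduction(baseline_error, proposed_error):
    """Percentage by which the proposed model lowers a baseline error."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error


# Toy values for illustration only (not data from the paper).
observed = [10.0, 20.0, 40.0]
forecast = [11.0, 18.0, 40.0]
print(round(rmse(observed, forecast), 4))
print(round(mape(observed, forecast), 4))
```

A reduction figure such as the ones quoted in the abstract would then come from calling `relative_reduction` with a baseline model's error and the proposed model's error on the same test set.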

Journal Article

Prediction of tide level based on variable weight combination of LightGBM and CNN-BiGRU model

by Su, Ye , Jiang, Xuchu in 639/166 , 639/705 , 704/158

2023

Accurate tide level prediction is crucial to human activities in coastal areas. Many practical applications show that compared with traditional harmonic analysis, long short-term memory (LSTM), gated recurrent units (GRUs) and other neural networks, along with ensemble learning models, such as light gradient boosting machine (LightGBM) and eXtreme gradient boosting (XGBoost), can achieve extremely high prediction accuracy in relatively stationary time series. Therefore, this paper proposes a variable weight combination model based on LightGBM and CNN-BiGRU with relevant research. It uses the variable weight combination method to weight and synthesize the prediction results of the two base models so that the combination model has a stronger ability to capture time series features and fits the data well. The experimental results show that in contrast to the base model LightGBM, the RMSE value and MAE value of the combination model are reduced by 43.2% and 44.7%, respectively; in contrast to the base model CNN-BiGRU, the RMSE value and MAE value of the combination model are reduced by 35.3% and 39.1%, respectively. This means that the variable weight combination model can greatly improve the accuracy of tide level prediction. In addition, we use tidal data from different geographical environments to further verify the good universality of the model. This study provides a new idea and method for tide prediction.
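The abstract does not spell out the weighting rule behind the variable weight combination. One common scheme, assumed here purely for illustration, sets each base model's weight at time t in inverse proportion to its absolute error over the most recent observed steps. The function name `variable_weight_combine`, the `window` parameter, and the toy series are hypothetical.

```python
def variable_weight_combine(actual, pred_a, pred_b, window=3, eps=1e-9):
    """Blend two forecasts with time-varying weights.

    Assumed scheme: at step t, each model is weighted by the inverse of its
    summed absolute error over the previous `window` steps. Only steps before
    t are used, so the target at t never influences its own weight.
    """
    combined = []
    for t in range(len(pred_a)):
        if t == 0:
            weight_a = 0.5  # no error history yet, so weight both models equally
        else:
            start = max(0, t - window)
            err_a = sum(abs(actual[i] - pred_a[i]) for i in range(start, t))
            err_b = sum(abs(actual[i] - pred_b[i]) for i in range(start, t))
            inv_a = 1.0 / (err_a + eps)
            inv_b = 1.0 / (err_b + eps)
            weight_a = inv_a / (inv_a + inv_b)
        combined.append(weight_a * pred_a[t] + (1.0 - weight_a) * pred_b[t])
    return combined


# Toy example: model A is exact, model B is biased by +1.
truth = [1.0, 2.0, 3.0, 4.0]
model_a = [1.0, 2.0, 3.0, 4.0]
model_b = [2.0, 3.0, 4.0, 5.0]
blended = variable_weight_combine(truth, model_a, model_b)
```

In this toy case the blend starts halfway between the two models and then moves almost entirely onto the model with the smaller recent error.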

Journal Article

Feature selection for global tropospheric ozone prediction based on the BO-XGBoost-RFE algorithm

by Zhang, Biao , Zhang, Ying , Jiang, Xuchu in 639/705/1042 , 639/705/117 , 704/106

2022

Ozone is one of the most important air pollutants, with significant impacts on human health, regional air quality and ecosystems. In this study, we use geographic information and environmental information of the monitoring site of 5577 regions in the world from 2010 to 2014 as feature input to predict the long-term average ozone concentration of the site. A Bayesian optimization-based XGBoost-RFE feature selection model BO-XGBoost-RFE is proposed, and a variety of machine learning algorithms are used to predict ozone concentration based on the optimal feature subset. Since the selection of the underlying model hyperparameters is involved in the recursive feature selection process, different hyperparameter combinations will lead to differences in the feature subsets selected by the model, so that the feature subsets obtained by the model may not be optimal solutions. We combine the Bayesian optimization algorithm to adjust the parameters of recursive feature elimination based on XGBoost to obtain the optimal parameter combination and the optimal feature subset under the parameter combination. Experiments on long-term ozone concentration prediction on a global scale show that the prediction accuracy of the model after Bayesian optimized XGBoost-RFE feature selection is higher than that based on all features and on feature selection with Pearson correlation. Among the four prediction models, random forest obtained the highest prediction accuracy. The XGBoost prediction model achieved the greatest improvement in accuracy.

Journal Article

Prediction of air quality index based on the SSA-BiLSTM-LightGBM model

by Zhang, Xiaowen , Jiang, Xuchu , Li, Ying in 639/166 , 639/705 , 704/106

2023

The air quality index (AQI), as an indicator to describe the degree of air pollution and its impact on health, plays an important role in improving the quality of the atmospheric environment. Accurate prediction of the AQI can effectively serve people’s lives, reduce pollution control costs and improve the quality of the environment. In this paper, we constructed a combined prediction model based on real hourly AQI data in Beijing. First, we used singular spectrum analysis (SSA) to decompose the AQI data into different sequences, such as trend, oscillation component and noise. Then, bidirectional long short-term memory (BiLSTM) was introduced to predict the decomposed AQI data, and a light gradient boosting machine (LightGBM) was used to integrate the predicted results. The experimental results show that the prediction effect of SSA-BiLSTM-LightGBM for the AQI data set is good on the test set. The root mean squared error (RMSE) reaches 0.6897, the mean absolute error (MAE) reaches 0.4718, the symmetric mean absolute percentage error (SMAPE) reaches 1.2712%, and the adjusted R² reaches 0.9995.

Journal Article

Forecast and analysis of aircraft passenger satisfaction based on RF-RFE-LR model

by Zhang, Biao , Zhang, Ying , Jiang, Xuchu in 639/166/984 , 639/705/1041 , 639/705/117

2022

Airplanes have always been one of the first choices for people to travel because of their convenience and safety. However, due to the outbreak of the new coronavirus epidemic in 2020, the civil aviation industry of various countries in the world has encountered severe challenges. Predicting aircraft passenger satisfaction and excavating the main influencing factors can help airlines improve their services and gain advantages in difficult situations and competition. This paper proposes a RF-RFE-Logistic feature selection model to extract the influencing factors of passenger satisfaction. First, preliminary feature selection is performed using recursive feature elimination based on random forest (RF-RFE). Second, based on different classification models, KNN, logistic regression, random forest, Gaussian Naive Bayes, and BP neural network, the classification performance of the models before and after feature selection is compared, and the prediction model with the best classification performance is selected. Finally, based on the RF-RFE feature selection, combined with the logistic model, the factors affecting customer satisfaction are further extracted. The experimental results show that the RF-RFE model selects a feature subset containing 17 variables. In the classification prediction model, the random forest after RF-RFE feature selection shows the best classification performance. Finally, combined with the four important variables extracted by RF-RFE and logistic regression, further discussion is carried out, and suggestions are given for airlines to improve passenger satisfaction.

Journal Article

Research on information leakage in time series prediction based on empirical mode decomposition

by Yang, Xinyi , Li, Jingyi , Jiang, Xuchu in 639/705 , 704/172 , Attention mechanism

2024

Time series analysis predicts the future based on existing historical data and has a wide range of applications in finance, economics, meteorology, biology, engineering, and other fields. Although the combination of decomposition techniques and machine learning algorithms can effectively solve the problem of predicting nonstationary sequences, this kind of decomposition-integration-prediction strategy has serious defects. Because the series is decomposed before it is divided into a training set and a test set, information from the test set leaks into the decomposition, which ultimately produces an illusion of high prediction accuracy. This paper proposes three improvement strategies for this type of “information leakage” problem: sliding window decomposition (SW-EMD), single training and multiple decomposition (STMP-EMD), and multiple training and multiple decomposition (MTMP-EMD). They are combined with a bidirectional multiscale temporal convolutional network (MSBTCN), bidirectional long- and short-term memory network (BiLSTM), and attention mechanism (DMAttention), which introduces a dependency matrix based on cosine similarity to be applied to water quality prediction. The experimental results show that the model achieves good performance in the prediction of three water quality indicators (pH, DO and KMnO4), and the accuracies of the three models proposed in this paper are improved by 1.958% and 0.853% in terms of the RMSE and MAPE, respectively, compared with those of the mainstream LSTM models. 
The key contributions of this study include the following: (1) three methods are proposed to improve the class EMD decomposition, which can effectively solve the problem of “information leakage” that exists in the current models via class EMD decomposition; (2) the CEEMDAN-MSBTCN-BiLSTM-DMAttention model structure is innovated by combining improved class EMD decomposition methods; and (3) the three improved decomposition methods proposed in this paper can effectively solve the problem of “information leakage” and optimize the prediction model at the same time. This study provides an effective experimental method for water quality prediction and can effectively address the problem of “overfitting” models via class EMD decompositions during model training and testing.
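The leakage problem described above arises when a decomposition sees values from the test period. A minimal sketch of the sliding-window idea follows, under loud assumptions: a centered moving average stands in for EMD (it is not EMD), and the names `centered_trend_decompose` and `sliding_window_samples` are hypothetical rather than taken from the paper. Each training sample decomposes only the window of values that precede its target.

```python
def centered_trend_decompose(values, half_width=1):
    """Stand-in for EMD: split values into a centered moving-average trend
    and a residual. A centered filter looks at neighbours on both sides, so
    applying it to a whole series would let future values affect the past."""
    n = len(values)
    trend = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        chunk = values[lo:hi]
        trend.append(sum(chunk) / len(chunk))
    residual = [v - t for v, t in zip(values, trend)]
    return trend, residual


def sliding_window_samples(series, window=5, decompose=centered_trend_decompose):
    """Build (components, target) pairs where each decomposition only sees
    the `window` values strictly before the target."""
    samples = []
    for end in range(window, len(series)):
        history = series[end - window:end]  # excludes series[end] and beyond
        samples.append((decompose(history), series[end]))
    return samples


series = [float(v) for v in range(10)]
samples = sliding_window_samples(series, window=5)
```

Because each window is decomposed on its own, changing a later value in `series` leaves the components of earlier samples unchanged, which is the property a leakage-free pipeline needs.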

Journal Article

Short-term power load forecasting based on the CEEMDAN-TCN-ESN model

by Zhang, Xiaowen , Huang, Jiacheng , Jiang, Xuchu in Biology and Life Sciences , China , Computer and Information Sciences

2023

Ensuring an adequate electric power supply while minimizing redundant generation is the main objective of power load forecasting, as this is essential for the power system to operate efficiently. Therefore, accurate power load forecasting is of great significance to save social resources and promote economic development. In the current study, a hybrid CEEMDAN-TCN-ESN forecasting model based on complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and higher-frequency and lower-frequency component reconstruction is proposed for short-term load forecasting research. In this paper, we select the historical national electricity load data of Panama as the research subject and make hourly forecasts of its electricity load data. The results show that the RMSE and MAE predicted by the CEEMDAN-TCN-ESN model on this dataset are 15.081 and 10.944, respectively, and R² is 0.994. Compared to the second-best model (CEEMDAN-TCN), the RMSE is reduced by 9.52%, and the MAE is reduced by 17.39%. The hybrid model proposed in this paper effectively extracts the complex features of short-term power load data and successfully merges subseries according to certain similar features. It learns the complex and varying features of higher-frequency series and the obvious regularity of the lower-frequency-trend series well, which could be applicable to real-world short-term power load forecasting work.

Journal Article

Time series prediction based on the variable weight combination of the T-GCN-Luong attention and GRU models

by Huang, Jiacheng , Guo, Yushu , Jiang, Xuchu in 639/166 , 639/705 , 704/172

2025

Due to the high uncertainties in temperature changes, traditional regression analysis and time series prediction methods fail to provide accurate temperature forecasts to reduce the impact of extreme weather on human society. Considering the spatiotemporal features of temperature changes, this paper proposes a variable weight combination model based on a temporal graph convolutional network (T-GCN), Luong attention network (LUA) and gated recurrent unit (GRU) network, which fully utilizes spatiotemporal information to predict future temperature changes more accurately. The model uses the T-GCN model to capture spatiotemporal features while introducing Luong attention to weight the inputs at different time steps to improve the prediction accuracy and further reduce the prediction error by fusing the outputs of the T-GCN-Luong attention and GRU models through the variable weight combination method. The results revealed that (1) the inclusion of spatial information significantly improved the effectiveness of the temperature predictions. (2) The Luong attention mechanism weights different time steps and improves the prediction accuracy of the T-GCN model. (3) The TGLAG combination model constructed via the variable weight method exhibited good predictive performance at 15 sites. Compared with that of the simple GRU model, the accuracy of the proposed model is improved by approximately 31.949% in terms of the root mean square error (RMSE) and 26.913% in terms of the mean absolute error (MAE). Compared with the second-best model, T-GCN-Luong attention, the TGLAG model yields a 5.946% lower RMSE and 9.535% lower MAE, which indicates that TGLAG has good application prospects in the field of temperature prediction.
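The abstract says Luong attention is used to weight inputs at different time steps. As a rough sketch of that idea, the following implements the dot-product scoring variant of Luong-style attention in plain Python. How the paper wires attention into T-GCN is not stated in the abstract, so the function name `luong_dot_attention`, the choice of the dot score, and the toy vectors are assumptions for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def luong_dot_attention(query, states):
    """Dot-product attention: score each time step's state against the query,
    normalise the scores with softmax, and return the weights together with
    the weighted sum of states (the context vector)."""
    scores = [sum(q * h for q, h in zip(query, state)) for state in states]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * state[d] for w, state in zip(weights, states))
        for d in range(dim)
    ]
    return weights, context


# Toy hidden states for three time steps (illustrative values only).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vector = [1.0, 0.0]
weights, context = luong_dot_attention(query_vector, hidden_states)
```

Time steps whose states align more closely with the query receive larger weights, so they contribute more to the context vector passed on to the prediction layer.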

Journal Article

Combining knowledge distillation and neural networks to predict protein secondary structure

by Zhang, Biao , Zhao, Lufei , Jiang, Xuchu in 631/114 , 631/1647/2258 , 631/61/475

2025

The secondary structure of a protein serves as the foundation for constructing its three-dimensional (3D) structure, which in turn is critical for determining its function and role in biological processes. Therefore, accurately predicting secondary structure not only facilitates the understanding of a protein’s 3D conformation but also provides essential insights into its interactions, functional mechanisms, and potential applications in biomedical research. Deep learning models are particularly effective in protein secondary structure prediction because of their ability to process complex sequence data and extract meaningful patterns, thereby increasing prediction accuracy and efficiency. This study proposes a combined model, ITBM-KD, which integrates an improved temporal convolutional network (TCN), bidirectional recurrent neural network (BiRNN), and multilayer perceptron (MLP) to increase the accuracy of protein secondary structure prediction for octapeptides and tripeptides. By combining one-hot encoding, word vector representation of physicochemical properties, and knowledge distillation with the ProtT5 model, the proposed model achieves excellent performance on multiple datasets. To evaluate its effectiveness, two classic datasets, TS115 and CB513, containing 115 and 513 protein datasets, respectively, were used. In addition, 15,078 protein data points collected from the PDB database from June 6, 2018, to June 6, 2020, were used to further verify the robustness and generalizability of the model. This study improves prediction accuracy and provides an essential model for understanding protein structure and function, especially in resource-limited settings.

Journal Article

Operational modal analysis using symbolic regression for a nonlinear vibration system

by Jiang, Feng , Jiang, Xuchu in Modal analysis , Nonlinear systems , Nonlinearity

2021

Nonlinearity is inevitable in practice and must be considered to accurately identify the modal parameters of a vibration system using operational modal analysis. However, the traditional operational modal analysis method, which is based on linear modal theory, is not applicable to modal parameter identification of vibration systems with nonlinearity. To solve this problem, this paper proposes a new operational modal analysis method for modal parameter identification of a nonlinear vibration system. The new method, based on the forced response and symbolic regression, assumes no pre-existing information and uses only mathematical symbols; it automatically searches for the expression structure and modal parameters of a system in nonlinear normal modes. The simulation result of a three-degrees-of-freedom nonlinear system reveals the high accuracy of the proposed operational modal analysis method in extracting the modal parameters. Then, a rod fastening rotor model is considered, and the capability of the proposed operational modal analysis method to precisely extract its modal parameters is further evaluated.

Journal Article
