173 result(s) for "multi-head self-attention"
Advancing Sophisticated Photochemistry Simulation in Atmospheric Numerical Models With Artificial Intelligence PhotoChemistry (AIPC) Scheme Using the Feature‐Mapping Subspace Self‐Attention Algorithm
Accurate simulation of atmospheric photochemistry is essential for air quality and climate studies but computationally expensive in three‐dimensional atmospheric models. Artificial intelligence (AI) algorithms show promise for accelerating photochemical simulations, but integrating them reliably into numerical models as replacements for complex mechanisms has been challenging, with success mostly limited to simplified schemes (e.g., 12 species). We present a novel AI PhotoChemistry (AIPC) scheme using the Feature‐Mapping Subspace Self‐Attention (FMSSA) algorithm, enabling fast, accurate, and stable online simulation of the full SAPRC‐99 mechanism (79 species, 229 reactions) within WRF‐Chem. Feature‐mapping subspace self‐attention reduces computational cost by 91% versus standard attention architectures via global feature mapping and subspace attention decomposition while maintaining high fidelity to nonlinear chemistry. Offline evaluations show FMSSA's superior accuracy (mean NRMSE = 3.09% for 69 species) over Multi‐Layer Perceptron and Residual Neural Network baselines, especially for ozone. Ablation experiments confirm the critical role of attention and LayerNorm modules for accuracy and generalizability. Monthly‐scale online simulations conducted in August show stable FMSSA‐AIPC performance, accurately reproducing species spatiotemporal distributions with 77% faster computation than the numerical solver. However, simulations conducted in February show performance degradation for all AIPC schemes, with FMSSA‐AIPC exhibiting unique synchronous errors, highlighting generalization challenges across significantly distinct atmospheric regimes. This work advances integrating sophisticated chemical processes in weather and climate models, with future efforts targeting expanded training data sets, architectural refinements and broader spatiotemporal testing. 
Plain Language Summary
Atmospheric photochemistry critically influences air quality and climate change by modulating atmospheric composition, but simulating these processes within three‐dimensional atmospheric models is computationally expensive, particularly for sophisticated mechanisms, hindering high‐resolution studies and integration into Earth system models. While artificial intelligence (AI) algorithms demonstrate potential for accelerating photochemical simulations, the highly nonlinear reaction networks of sophisticated mechanisms restrict the reliable integration of AI PhotoChemistry (AIPC) schemes into numerical models, with successful implementations predominantly limited to oversimplified mechanisms (e.g., 12 species). Here, we developed a novel AIPC scheme using the feature‐mapping subspace self‐attention (FMSSA) algorithm, which enables fast, accurate, and stable monthly‐scale online continuous simulations of the entire SAPRC‐99 mechanism (79 species, 229 reactions) within WRF‐Chem. FMSSA reduces computational time by 77% compared to traditional solvers and outperforms multi‐layer perceptron and residual neural network baselines, particularly for ozone. However, FMSSA exhibits unique synchronization errors during online continuous simulations when atmospheric conditions significantly differ from those in the training phase. This work advances the integration of complex photochemical mechanisms into weather and climate models, but future efforts are needed to extend FMSSA to more mechanisms and improve its stability across broader spatiotemporal conditions.
Key Points
Feature‐mapping subspace self‐attention (FMSSA) surpasses multi‐layer perceptron and residual neural network in modeling photochemistry
The FMSSA‐based scheme enables accurate and stable simulations of the full SAPRC‐99 photochemical mechanism within WRF‐Chem
The FMSSA‐based scheme reduces computation time by 77% versus the SAPRC‐99 numerical scheme
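All of the results below build on the standard scaled dot-product multi-head self-attention mechanism, which FMSSA's subspace decomposition reduces the cost of. A minimal NumPy sketch of that standard mechanism (weights, dimensions, and function names are illustrative, not the AIPC implementation):

```python
import numpy as np

def multi_head_self_attention(x, wq, wk, wv, wo, n_heads):
    """Scaled dot-product multi-head self-attention over a token sequence.

    x: (seq_len, d_model); wq, wk, wv, wo: (d_model, d_model) projections.
    """
    seq_len, d_model = x.shape
    d_head = d_model // n_heads
    # project, then split the model dimension into heads: (heads, seq, d_head)
    q = (x @ wq).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    k = (x @ wk).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    v = (x @ wv).reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)        # row-wise softmax
    merged = (weights @ v).transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ wo, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))                           # 5 tokens, d_model = 8
wq, wk, wv, wo = (rng.standard_normal((8, 8)) * 0.1 for _ in range(4))
out, attn = multi_head_self_attention(x, wq, wk, wv, wo, n_heads=2)
```

The quadratic (seq, seq) score matrix per head is the cost that subspace or feature-mapping variants aim to shrink.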
A Dual Attention Mechanism for Detection and Assessment of Pulmonary Fibrosis in Chest X-Ray Images
Pulmonary fibrosis is a progressive lung disease characterized by interstitial tissue fibrosis, with idiopathic pulmonary fibrosis being the most common subtype. The disease leads to persistent scarring and fibrotic remodeling of lung tissue following injury, and under current medical conditions, it remains incurable and irreversible. Chest X-ray imaging is widely used in clinical diagnosis; however, traditional interpretation methods face challenges in accurately detecting fibrotic lesions and quantifying disease severity. To address these limitations, we propose the Attention Feature Transformation Network (AFTNet) for automated identification and precise quantification of pulmonary fibrosis regions. AFTNet integrates two key modules—Spatial-Based Multi-Head Self-Attention (SB-MSA) and Channel-Based Multi-Head Self-Attention (CB-MSA)—which enhance the model’s ability to capture subtle features in chest X-ray images. In addition, a specially designed skip connection mechanism preserves multi-level details and improves boundary delineation accuracy. Experimental results demonstrate that AFTNet outperforms comparative models, achieving superior performance with IoU (80.34%), Dice (81.76%), and Hausdorff Distance (21.54%). Furthermore, by incorporating data augmentation strategies, the model shows robust adaptability to chest X-ray images under different clinical conditions, enabling accurate boundary detection of fibrotic regions and quantifying fibrosis extent. The proposed AFTNet provides an efficient, accurate, and clinically valuable solution for the quantitative analysis of pulmonary fibrosis, contributing to improved diagnostic efficiency and treatment monitoring in pulmonary diseases.
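The abstract does not give the SB-MSA/CB-MSA equations; a simplified single-head sketch of the spatial-versus-channel split, with learned projections omitted (purely illustrative, not AFTNet's code):

```python
import numpy as np

def _softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def spatial_self_attention(feat):
    """Attend across spatial positions; feat: (n_positions, n_channels)."""
    scores = feat @ feat.T / np.sqrt(feat.shape[1])
    return _softmax(scores) @ feat

def channel_self_attention(feat):
    """Attend across channels by treating each channel as a token."""
    return spatial_self_attention(feat.T).T

rng = np.random.default_rng(1)
feat = rng.standard_normal((16, 4))   # e.g. a flattened 4x4 map with 4 channels
fused = spatial_self_attention(feat) + channel_self_attention(feat)
```

The spatial branch relates positions to positions; the channel branch relates feature maps to feature maps; a dual-attention design combines both views.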
TRS: Transformers for Remote Sensing Scene Classification
Remote sensing scene classification remains challenging due to the complexity and variety of scenes. With the development of attention-based methods, Convolutional Neural Networks (CNNs) have achieved competitive performance in remote sensing scene classification tasks. As an important method of the attention-based model, the Transformer has achieved great success in the field of natural language processing. Recently, the Transformer has been used for computer vision tasks. However, most existing methods divide the original image into multiple patches and encode the patches as the input of the Transformer, which limits the model’s ability to learn the overall features of the image. In this paper, we propose a new remote sensing scene classification method, Remote Sensing Transformer (TRS), a powerful “pure CNNs → Convolution + Transformer → pure Transformers” structure. First, we integrate self-attention into ResNet in a novel way, using our proposed Multi-Head Self-Attention layer instead of 3 × 3 spatial convolutions in the bottleneck. Then we connect multiple pure Transformer encoders to further improve the representation learning performance completely depending on attention. Finally, we use a linear classifier for classification. We train our model on four public remote sensing scene datasets: UC-Merced, AID, NWPU-RESISC45, and OPTIMAL-31. The experimental results show that TRS exceeds the state-of-the-art methods and achieves higher accuracy.
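The core substitution TRS describes can be sketched as follows: replace a 3 × 3 spatial convolution in the bottleneck with self-attention over the flattened feature map (single-head, no learned projections; an assumed simplification, not the TRS layer itself):

```python
import numpy as np

def attention_in_bottleneck(fmap):
    """Stand-in for a 3x3 conv: flatten (H, W, C) into H*W position tokens,
    run self-attention across positions, then restore the spatial shape."""
    h, w, c = fmap.shape
    tokens = fmap.reshape(h * w, c)
    scores = tokens @ tokens.T / np.sqrt(c)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return (weights @ tokens).reshape(h, w, c)

rng = np.random.default_rng(2)
out = attention_in_bottleneck(rng.standard_normal((7, 7, 16)))
```

Unlike a 3 × 3 convolution, every position attends to every other position, giving a global receptive field in a single layer.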
Text Sentiment Classification Based on BERT Embedding and Sliced Multi-Head Self-Attention Bi-GRU
In the task of text sentiment analysis, the main problems we face are that traditional word vector representations lack polysemy, that Recurrent Neural Networks cannot be trained in parallel, and that classification accuracy is not high. We propose a sentiment classification model based on the proposed Sliced Bidirectional Gated Recurrent Unit (Sliced Bi-GRU), Multi-head Self-Attention mechanism, and Bidirectional Encoder Representations from Transformers embedding. First, the word vector representation obtained by the BERT pre-trained language model is used as the embedding layer of the neural network. Then the input sequence is sliced into subsequences of equal length, and the Bidirectional Gated Recurrent Unit is applied to extract feature information from each subsequence. The relationships between words are learned sequentially via the Multi-head Self-attention mechanism. Finally, the emotional tendency of the text is output by the Softmax function. Experiments show that the classification accuracy of this model on the Yelp 2015 dataset and the Amazon dataset is 74.37% and 62.57%, respectively, and that the training speed of the model is better than most existing models, which verifies the effectiveness of the model.
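The slicing step that enables parallel recurrent processing can be sketched as below; the abstract does not specify the paper's padding rule, so the zero-padded split here is an assumption:

```python
import numpy as np

def slice_sequence(x, n_slices):
    """Split an embedded sequence (seq_len, dim) into n_slices equal-length
    subsequences, zero-padding the tail so the split is exact."""
    seq_len, dim = x.shape
    sub_len = -(-seq_len // n_slices)            # ceiling division
    pad = sub_len * n_slices - seq_len
    padded = np.vstack([x, np.zeros((pad, dim))])
    return padded.reshape(n_slices, sub_len, dim)

embedded = np.arange(20.0).reshape(10, 2)        # 10 tokens, embedding dim 2
subseqs = slice_sequence(embedded, n_slices=4)   # 4 subsequences of length 3
```

Each subsequence can then be fed to a separate Bi-GRU instance in parallel, which is the training-speed advantage the abstract claims.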
A Hybrid Deep Learning Approach for Bearing Fault Diagnosis Using Continuous Wavelet Transform and Attention-Enhanced Spatiotemporal Feature Extraction
This study presents a hybrid deep learning approach for bearing fault diagnosis that integrates continuous wavelet transform (CWT) with an attention-enhanced spatiotemporal feature extraction framework. The model combines time-frequency domain analysis using CWT with a classification architecture comprising multi-head self-attention (MHSA), bidirectional long short-term memory (BiLSTM), and a 1D convolutional residual network (1D conv ResNet). This architecture effectively captures both spatial and temporal dependencies, enhances noise resilience, and extracts discriminative features from nonstationary and nonlinear vibration signals. The model is initially trained on a controlled laboratory bearing dataset and further validated on real and artificial subsets of the Paderborn bearing dataset, demonstrating strong generalization across diverse fault conditions. t-SNE visualizations confirm clear separability between fault categories, supporting the model’s capability for precise and reliable feature learning and strong potential for real-time predictive maintenance in complex industrial environments.
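The CWT preprocessing step might look like the following naive Morlet transform; the authors' wavelet choice, scales, and implementation are not given in the abstract, so everything here is an illustrative assumption:

```python
import numpy as np

def morlet_cwt(signal, scales, w0=5.0):
    """Naive continuous wavelet transform with a Morlet mother wavelet.
    Returns the (n_scales, n_samples) magnitude scalogram."""
    scalogram = np.empty((len(scales), len(signal)))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)
        wavelet = np.exp(1j * w0 * t / s) * np.exp(-((t / s) ** 2) / 2) / np.sqrt(s)
        # correlate the signal with the scaled wavelet at every shift
        scalogram[i] = np.abs(np.convolve(signal, np.conj(wavelet[::-1]), mode="same"))
    return scalogram

t = np.linspace(0, 1, 256)
vibration = np.sin(2 * np.pi * 32 * t)          # toy stand-in for a bearing signal
scalogram = morlet_cwt(vibration, scales=np.arange(1, 17))
```

The resulting time-frequency image is what a downstream MHSA/BiLSTM/ResNet stack would consume in place of the raw vibration signal.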
SE-VisionTransformer: Hybrid Network for Diagnosing Sugarcane Leaf Diseases Based on Attention Mechanism
Sugarcane is an important raw material for sugar and chemical production. However, in recent years, various sugarcane diseases have emerged, severely impacting the national economy. To address the issue of identifying diseases in sugarcane leaf sections, this paper proposes the SE-VIT hybrid network. Unlike traditional methods that directly use models for classification, this paper compares threshold, K-means, and support vector machine (SVM) algorithms for extracting leaf lesions from images. Due to SVM’s ability to accurately segment these lesions, it is ultimately selected for the task. The paper introduces the SE attention module into ResNet-18 (CNN), enhancing the learning of inter-channel weights. After the pooling layer, multi-head self-attention (MHSA) is incorporated. Finally, with the inclusion of 2D relative positional encoding, the accuracy is improved by 5.1%, precision by 3.23%, and recall by 5.17%. The SE-VIT hybrid network model achieves an accuracy of 97.26% on the PlantVillage dataset. Additionally, when compared to four existing classical neural network models, SE-VIT demonstrates significantly higher accuracy and precision, reaching 89.57% accuracy. Therefore, the method proposed in this paper can provide technical support for intelligent management of sugarcane plantations and offer insights for addressing plant diseases with limited datasets.
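The SE module referenced here follows the well-known squeeze-and-excitation pattern; a minimal sketch, with weight shapes and the reduction ratio as assumptions rather than the SE-VIT configuration:

```python
import numpy as np

def se_block(fmap, w1, w2):
    """Squeeze-and-Excitation: global-average-pool each channel, pass the
    pooled vector through a two-layer bottleneck, and rescale channels by
    the resulting sigmoid gates. fmap: (H, W, C)."""
    z = fmap.mean(axis=(0, 1))                               # squeeze: (C,)
    gates = 1.0 / (1.0 + np.exp(-(np.maximum(z @ w1, 0.0) @ w2)))
    return fmap * gates                                      # excite: reweight channels

rng = np.random.default_rng(3)
fmap = rng.standard_normal((8, 8, 16))
w1 = rng.standard_normal((16, 4)) * 0.1                      # reduction ratio 4
w2 = rng.standard_normal((4, 16)) * 0.1
out = se_block(fmap, w1, w2)
```

This is the inter-channel weighting the abstract describes adding to ResNet-18 before the MHSA stage.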
SMGformer: integrating STL and multi-head self-attention in deep learning model for multi-step runoff forecasting
Accurate runoff forecasting is of great significance for water resource allocation, flood control, and disaster reduction. However, due to the inherent strong randomness of runoff sequences, this task faces significant challenges. To address this challenge, this study proposes a new SMGformer runoff forecast model. The model integrates Seasonal and Trend decomposition using Loess (STL), Informer’s Encoder layer, the Bidirectional Gated Recurrent Unit (BiGRU), and Multi-head self-attention (MHSA). Firstly, in response to the nonlinear and non-stationary characteristics of the runoff sequence, STL decomposition is used to extract the runoff sequence’s trend, period, and residual terms, and a multi-feature set based on ‘sequence-sequence’ is constructed as the input of the model, providing a foundation for subsequent models to capture the evolution of runoff. The key features of the input set are then captured using the Informer’s Encoder layer. Next, the BiGRU layer is used to learn the temporal information of these features. To further optimize the output of the BiGRU layer, the MHSA mechanism is introduced to emphasize the impact of important information. Finally, accurate runoff forecasting is achieved by transforming the output of the MHSA layer through a fully connected layer. To verify the effectiveness of the proposed model, monthly runoff data from two hydrological stations in China are selected, and eight models are constructed to compare the performance of the proposed model. The results show that, compared with the Informer model, the first-step MAE of the SMGformer model decreases by 42.2% and 36.6%, respectively; RMSE decreases by 37.9% and 43.6%, respectively; and NSE increases from 0.936 to 0.975 and from 0.487 to 0.837, respectively. In addition, the KGE values of the SMGformer model at the third step are 0.960 and 0.805, both remaining above 0.8.
Therefore, the model can accurately capture key information in the monthly runoff sequence and extend the effective forecast period of the model.
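STL itself fits Loess curves; as a self-contained stand-in, a classical additive decomposition shows the trend/period/residual split the SMGformer input set is built from (synthetic data, illustrative only, not the paper's STL settings):

```python
import numpy as np

def additive_decompose(series, period):
    """Classical additive decomposition into trend, seasonal, and residual
    terms (a simple stand-in for STL, which uses Loess smoothing instead)."""
    n = len(series)
    # centered moving average as the trend estimate
    trend = np.convolve(series, np.ones(period) / period, mode="same")
    detrended = series - trend
    # average each phase of the cycle to get the seasonal profile
    seasonal = np.array([detrended[i::period].mean() for i in range(period)])
    seasonal = np.tile(seasonal, n // period + 1)[:n]
    residual = series - trend - seasonal
    return trend, seasonal, residual

months = np.arange(120)
runoff = 10 + 0.02 * months + 3 * np.sin(2 * np.pi * months / 12)  # toy monthly runoff
trend, seasonal, residual = additive_decompose(runoff, period=12)
```

The three components sum back to the original series, so they can be stacked as parallel input features without losing information.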
A semi-supervised approach for the integration of multi-omics data based on transformer multi-head self-attention mechanism and graph convolutional networks
Background and objectives Comprehensive analysis of multi-omics data is crucial for accurately formulating effective treatment plans for complex diseases. Supervised ensemble methods have gained popularity in recent years for multi-omics data analysis. However, existing research based on supervised learning algorithms often fails to fully harness the information from unlabeled nodes and overlooks the latent features within and among different omics, as well as the various associations among features. Here, we present a novel multi-omics integrative method, MOSEGCN, based on the Transformer multi-head self-attention mechanism and Graph Convolutional Networks (GCN), with the aim of enhancing the accuracy of complex disease classification. MOSEGCN first employs the Transformer multi-head self-attention mechanism and Similarity Network Fusion (SNF) to separately learn the inherent correlations of latent features within and among different omics, constructing a comprehensive view of diseases. Subsequently, it feeds the learned crucial information into a self-ensembling Graph Convolutional Network (SEGCN) built upon semi-supervised learning methods for training and testing, facilitating a better analysis and utilization of information from multi-omics data to achieve precise classification of disease subtypes. Results The experimental results show that MOSEGCN outperforms several state-of-the-art multi-omics integrative analysis approaches on three types of omics data: mRNA expression data, microRNA expression data, and DNA methylation data, with accuracy rates of 83.0% for Alzheimer's disease and 86.7% for breast cancer subtyping. Furthermore, MOSEGCN exhibits strong generalizability on the GBM dataset, enabling the identification of important biomarkers for related diseases.
Conclusion MOSEGCN explores the significant relationship information among different omics and within each omics' latent features, effectively leveraging labeled and unlabeled information to further enhance the accuracy of complex disease classification. It also provides a promising approach for identifying reliable biomarkers, paving the way for personalized medicine.
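The GCN propagation step MOSEGCN builds on is standard; a minimal sketch of one layer (toy graph, weights, and shapes are illustrative):

```python
import numpy as np

def gcn_layer(adj, h, w):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 . H . W)."""
    a_hat = adj + np.eye(adj.shape[0])               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h @ w, 0.0)

adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # 3-sample path graph
rng = np.random.default_rng(4)
h = rng.standard_normal((3, 5))                      # per-sample fused omics features
out = gcn_layer(adj, h, rng.standard_normal((5, 2)))
```

In a semi-supervised setting, the graph lets label information propagate from labeled samples to their unlabeled neighbors, which is the unlabeled-node advantage the abstract emphasizes.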
Multi-view knowledge representation learning for personalized news recommendation
In the rapidly evolving field of personalized news recommendation, capturing and effectively utilizing user interests remains a significant challenge due to the vast diversity and dynamic nature of user interactions with news content. Existing recommendation models often fail to fully integrate candidate news items into user interest modeling, which can result in suboptimal recommendation accuracy and relevance. This limitation stems from their insufficient ability to jointly consider user history and the characteristics of candidate news items in the modeling process. To address these challenges, we propose the Multi-view Knowledge Representation Learning (MKRL) framework for personalized news recommendation, which leverages a multi-view news encoder and candidate-aware attention mechanisms to enhance user interest modeling. Unlike traditional methods, MKRL incorporates candidate news articles directly into the user interest modeling process, enabling the model to better understand and predict user preferences based on both historical behavior and potential new content. This is achieved through a sophisticated architecture that blends a multi-view news encoder and candidate-aware attention mechanisms, which together capture a more holistic and dynamic view of user interests. The MKRL framework innovatively integrates convolutional neural networks with multi-head attention modules to capture intricate contextual information from both user history and candidate news, allowing the model to recognize fine-grained patterns. The multi-head attention dynamically weighs user interactions and candidate news based on relevance, enhancing recommendation accuracy. Additionally, MKRL's multi-view approach represents news from different perspectives, enabling richer and more personalized recommendations.
Extensive experiments on three real-world datasets demonstrate that our proposed framework outperforms state-of-the-art baselines in recommendation accuracy, validating its effectiveness.
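The candidate-aware attention idea can be sketched as attention pooling over clicked-news vectors, weighted by each click's relevance to the candidate; all names and shapes here are illustrative assumptions, not MKRL's implementation:

```python
import numpy as np

def candidate_aware_pool(history, candidate):
    """Pool clicked-news vectors into one user vector, weighting each click
    by its scaled dot-product relevance to the candidate news vector."""
    scores = history @ candidate / np.sqrt(candidate.size)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over clicked items
    return weights @ history, weights

rng = np.random.default_rng(5)
history = rng.standard_normal((10, 6))       # 10 clicked-news embeddings
candidate = rng.standard_normal(6)           # candidate news embedding
user_vec, weights = candidate_aware_pool(history, candidate)
score = float(user_vec @ candidate)          # ranking score for this candidate
```

Because the pooling weights depend on the candidate, the same click history yields a different user vector for each candidate article, which is the key difference from candidate-agnostic interest models.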
MCL-DTI: using drug multimodal information and bi-directional cross-attention learning method for predicting drug–target interaction
Background Prediction of drug–target interaction (DTI) is an essential step for drug discovery and drug repositioning. Traditional methods are mostly time-consuming and labor-intensive, and deep learning-based methods address these limitations and are applied to engineering. Most current deep learning methods employ representation learning of unimodal information such as SMILES sequences, molecular graphs, or molecular images of drugs. In addition, most methods focus on feature extraction from the drug and target alone, without fusion learning from the drug–target interacting parties, which may lead to insufficient feature representation. Motivation In order to capture more comprehensive drug features, we utilize both the molecular image and the chemical features of drugs. The image of the drug mainly carries its structural information and spatial features, while the chemical information includes its functions and properties; the two can complement each other, making the drug representation more effective and complete. Meanwhile, to enhance interactive feature learning between drug and target, we introduce a bidirectional multi-head attention mechanism to improve the performance of DTI. Results To enhance feature learning between drugs and targets, we propose a novel deep learning model for the DTI task called MCL-DTI, which uses multimodal information of drugs and learns the representation of drug–target interaction for drug–target prediction. In order to further explore a more comprehensive representation of drug features, this paper first exploits two kinds of multimodal information of drugs, molecular image and chemical text, to represent the drug. We also introduce a bidirectional multi-head cross attention (MCA) method to learn the interrelationships between drugs and targets. Thus, we build two decoders, each including a multi-head self-attention (MSA) block and an MCA block, for cross-information learning.
We use a decoder for the drug and the target separately to obtain the interaction feature maps. Finally, we feed these feature maps generated by the decoders into a fusion block for feature extraction and output the prediction results. Conclusions MCL-DTI achieves the best results on all three datasets: Human, C. elegans, and Davis, including the balanced datasets and an unbalanced dataset. The results on the drug–drug interaction (DDI) task show that MCL-DTI has a strong generalization capability and can be easily applied to other tasks.
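The bidirectional cross-attention (MCA) idea can be sketched as two attention passes, one in each direction: queries from one modality, keys and values from the other. A simplified single-head form with no learned projections (illustrative only, not MCL-DTI's code):

```python
import numpy as np

def _softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(queries, keys_values):
    """Attend from one modality's tokens to another's: each query token
    gathers a relevance-weighted mix of the other side's tokens."""
    scores = queries @ keys_values.T / np.sqrt(queries.shape[1])
    return _softmax(scores) @ keys_values

rng = np.random.default_rng(6)
drug = rng.standard_normal((12, 8))          # e.g. drug image/text tokens
target = rng.standard_normal((20, 8))        # e.g. protein sequence tokens
drug_ctx = cross_attention(drug, target)     # drug attends to target
target_ctx = cross_attention(target, drug)   # target attends to drug
```

Running both directions gives each side an interaction-aware representation, which is what makes the method "bidirectional" rather than fusing features only after independent encoding.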