8,136 results for "Attention mechanism"
Hydroformer: Frequency Domain Enhanced Multi‐Attention Transformer for Monthly Lake Level Reconstruction With Low Data Input Requirements
Lake level changes are critical indicators of hydrological balance and climate change, yet long-term monthly lake level reconstruction is challenging with incomplete or short-term data. Data-driven models, while promising, struggle with nonstationary lake level changes and complex dependencies on meteorological factors, limiting their applicability. Here, we introduce the Hydroformer, a frequency domain enhanced multi-attention Transformer model designed for monthly lake level reconstruction using reanalysis data. This model features two innovative mechanisms: (a) Frequency-Enhanced Attention (FEA) for capturing long-term temporal dependence, and (b) Causality-based Cross-dimensional Attention (CCA) to elucidate how specific meteorological factors influence lake level. Seasonal and trend patterns of catchment meteorological factors and lake level are first identified by a time series decomposition block, then independently learned and refined within the model. Tested across 50 lakes globally, the Hydroformer excelled over reconstruction periods ranging from half to three times the training-test length. The model performed well even when the training data missing rate was below 50%, particularly for lakes with significant seasonal fluctuations. The Hydroformer demonstrated robust generalization across lakes of varying sizes, from 10.11 to 18,135 km², with median R², MAE, MSE, and RMSE values of 0.813, 0.313, 0.215, and 0.4, respectively. Furthermore, the Hydroformer outperformed other data-driven models, improving MSE by 29.2% and MAE by 24.4% compared to the next best model, FEDformer. Our work offers a novel approach for reconstructing long-term water level changes and managing lake resources under climate change.
Plain Language Summary: Lake water levels, as key indicators of hydrologic dynamics and catchment balance, are vital for understanding climate impacts and managing water resources. However, the lack of continuous measurements for most global lakes, combined with the inability of traditional data-driven models to effectively decipher complex interactions with catchment hydrological processes, leads to significant gaps in generalizability, accuracy, and reconstruction length. Given these limitations, accurate monthly reconstruction of lake levels remains a persistent challenge. To address this, we develop Hydroformer, an innovative frequency domain enhanced multi-attention Transformer model that uses reanalysis data for monthly lake level reconstruction. It employs two innovative attention mechanisms: Frequency-Enhanced Attention for capturing long-term temporal dependencies and Causality-based Cross-dimensional Attention for capturing cross-dimensional causal dependencies between catchment meteorological factors and lake level. Through a decomposition block, the model efficiently recognizes and refines inherent seasonal and trend patterns, leading to a comprehensive understanding of lake behavior. In tests on 50 global lakes, the Hydroformer exhibited excellent performance in reconstructing water levels for lakes ranging from 10.11 to 18,135 km², adeptly handling short-term and long-term gaps and varying proportions of missing data. It notably outperforms supervised data-driven models. This positions it as a valuable instrument for monthly lake level reconstruction, showcasing the power of integrating advanced artificial intelligence techniques into hydrological modeling.
Key Points:
- A novel frequency domain enhanced multi-attention Transformer model, Hydroformer, is built for reconstructing monthly lake levels from reanalysis data.
- The model accurately extends reconstructions to 2–3 times the training data length, excelling even with less than 50% of the training data missing.
- Hydroformer surpasses advanced AI-based models, improving MSE and MAE by over 20% and demonstrating strong generalization across lakes of varying sizes.
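The abstract does not spell out how its Frequency-Enhanced Attention works; the following is a minimal PyTorch sketch of a common frequency-domain mixing pattern (FFT along time, learned complex weights on a few low-frequency modes, inverse FFT), offered only as an assumed illustration rather than the Hydroformer's actual module; all names and sizes are hypothetical.

```python
import torch
import torch.nn as nn

class FrequencyEnhancedBlock(nn.Module):
    """Sketch of a frequency-domain mixing block; not the paper's FEA module."""
    def __init__(self, d_model: int, n_modes: int = 16):
        super().__init__()
        self.n_modes = n_modes
        # Learned complex weights for the retained low-frequency modes.
        self.weight = nn.Parameter(
            torch.randn(n_modes, d_model, dtype=torch.cfloat) * 0.02
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model) monthly series of catchment features
        T = x.size(1)
        xf = torch.fft.rfft(x, dim=1)                    # to frequency domain
        k = min(self.n_modes, xf.size(1))
        out = torch.zeros_like(xf)
        out[:, :k, :] = xf[:, :k, :] * self.weight[:k]   # mix retained modes
        return torch.fft.irfft(out, n=T, dim=1)          # back to time domain

# Hypothetical usage: 8 lakes, 120 months, 32 feature channels.
x = torch.randn(8, 120, 32)
print(FrequencyEnhancedBlock(d_model=32)(x).shape)       # torch.Size([8, 120, 32])
```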
Classification of Hyperspectral Image Based on Double-Branch Dual-Attention Mechanism Network
In recent years, researchers have paid increasing attention to hyperspectral image (HSI) classification using deep learning methods. To improve accuracy and reduce the number of training samples required, we propose a double-branch dual-attention mechanism network (DBDA) for HSI classification. Two branches are designed in DBDA to capture the abundant spectral and spatial features contained in HSI. Furthermore, a channel attention block and a spatial attention block are applied to these two branches, respectively, enabling DBDA to refine and optimize the extracted feature maps. A series of experiments on four hyperspectral datasets shows that the proposed framework outperforms state-of-the-art algorithms, especially when training samples are severely limited.
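As a rough illustration of the channel and spatial attention blocks mentioned above (a generic squeeze-and-excitation / CBAM-style sketch, not the exact DBDA modules), assuming standard 4-D feature maps:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention (illustrative)."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))         # global average pool -> (B, C)
        return x * w[:, :, None, None]          # reweight channels

class SpatialAttention(nn.Module):
    """CBAM-style spatial attention (illustrative)."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):                       # x: (B, C, H, W)
        s = torch.cat([x.mean(1, keepdim=True),
                       x.max(1, keepdim=True).values], dim=1)
        return x * torch.sigmoid(self.conv(s))  # reweight spatial positions

feat = torch.randn(4, 64, 9, 9)                 # e.g. an HSI patch feature map
out = SpatialAttention()(ChannelAttention(64)(feat))
```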
Target Recognition Based on Infrared and Visible Image Fusion and Improved YOLOv8 Algorithm
To address the issue that the fusion of infrared and visible images is easily affected by lighting conditions, we propose an adaptive illumination perception fusion mechanism and integrate it into an infrared and visible image fusion network. Spatial attention mechanisms are applied to both the infrared and visible images for feature extraction, and deep convolutional neural networks are used to extract further feature information. The adaptive illumination perception fusion mechanism is then integrated into the image reconstruction process to reduce the impact of lighting variations on the fused images. A Median Strengthening Channel and Spatial Attention Module (MSCS) is designed and integrated into the backbone of YOLOv8. We use the fusion network to create a dataset, named ivifdata, for training the target recognition network. The experimental results indicate that the improved YOLOv8 network achieves further improvements of 2.3%, 1.4%, and 8.2% in the Recall, mAP50, and mAP50-95 metrics, respectively. The experiments show that the improved YOLOv8 network has advantages in recognition rate and completeness while also reducing false negatives and false positives.
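The abstract does not describe the adaptive illumination perception mechanism in detail; one plausible pattern, sketched below purely as an assumption, is to predict an illumination score from the visible image and use it to weight visible versus infrared features during fusion. Module names and sizes are hypothetical, not the paper's design.

```python
import torch
import torch.nn as nn

class IlluminationAwareFusion(nn.Module):
    """Assumed pattern: weight visible vs. infrared features by a predicted
    illumination score; not the paper's exact fusion mechanism."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.illum = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(3, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, vis_img, vis_feat, ir_feat):
        # vis_img: (B, 3, H, W) raw visible image; features: (B, C, h, w)
        w = self.illum(vis_img).view(-1, 1, 1, 1)   # illumination weight in (0, 1)
        return self.fuse(w * vis_feat + (1 - w) * ir_feat)

vis_img = torch.randn(2, 3, 256, 256)
vis_feat, ir_feat = torch.randn(2, 64, 64, 64), torch.randn(2, 64, 64, 64)
fused = IlluminationAwareFusion()(vis_img, vis_feat, ir_feat)
```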
MCL-DTI: using drug multimodal information and bi-directional cross-attention learning method for predicting drug–target interaction
Background: Prediction of drug–target interaction (DTI) is an essential step in drug discovery and drug repositioning. Traditional methods are mostly time-consuming and labor-intensive; deep learning-based methods address these limitations and are increasingly applied in practice. Most current deep learning methods employ representation learning of unimodal information, such as SMILES sequences, molecular graphs, or molecular images of drugs. In addition, most methods focus on extracting features from the drug and the target separately, without fused learning of the interacting drug–target pair, which may lead to insufficient feature representation. Motivation: To capture more comprehensive drug features, we utilize both the molecular image and the chemical features of a drug. The drug image mainly carries structural information and spatial features, while the chemical information describes its functions and properties; the two complement each other, making the drug representation more effective and complete. Meanwhile, to enhance interactive feature learning between drug and target, we introduce a bi-directional multi-head attention mechanism to improve DTI performance. Results: To enhance feature learning between drugs and targets, we propose a novel deep learning model for the DTI task, called MCL-DTI, which uses multimodal drug information and learns a representation of the drug–target interaction for prediction. To further explore a more comprehensive representation of drug features, this paper first exploits two drug modalities, molecular images and chemical text, to represent the drug. We also introduce a bi-directional multi-head cross attention (MCA) method to learn the interrelationships between drugs and targets. We thus build two decoders, each consisting of a multi-head self-attention (MSA) block and an MCA block, for cross-information learning; one decoder is used for the drug and one for the target to obtain interaction feature maps. Finally, we feed the feature maps generated by the decoders into a fusion block for feature extraction and output the prediction results. Conclusions: MCL-DTI achieves the best results on all three datasets (Human, C. elegans, and Davis), which include both balanced and unbalanced datasets. Results on the drug–drug interaction (DDI) task show that MCL-DTI has strong generalization capability and can easily be applied to other tasks.
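The bi-directional cross attention described above can be approximated with standard PyTorch attention layers; the sketch below shows only the general pattern (drug tokens attend to target tokens and vice versa) under assumed shapes, not the MCL-DTI implementation.

```python
import torch
import torch.nn as nn

class BiCrossAttention(nn.Module):
    """Illustrative bi-directional multi-head cross attention."""
    def __init__(self, d_model: int = 128, n_heads: int = 4):
        super().__init__()
        self.drug_to_target = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.target_to_drug = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, drug, target):
        # drug:   (B, Ld, d_model) e.g. tokens from molecular image + chemical text
        # target: (B, Lt, d_model) e.g. protein sequence embeddings
        d_out, _ = self.drug_to_target(query=drug, key=target, value=target)
        t_out, _ = self.target_to_drug(query=target, key=drug, value=drug)
        return d_out, t_out                      # interaction-aware feature maps

drug = torch.randn(2, 60, 128)
target = torch.randn(2, 500, 128)
d_feat, t_feat = BiCrossAttention()(drug, target)
```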
ARM-Net: A Tri-Phase Integrated Network for Hyperspectral Image Compression
Most current hyperspectral image compression methods rely on well-designed modules to capture image structural information and long-range dependencies. However, these modules tend to increase computational complexity exponentially with the number of bands, which limits their performance under constrained resources. To address these challenges, this paper proposes a novel triple-phase hybrid framework for hyperspectral image compression. The first stage utilizes an adaptive band selection technique to sample the raw hyperspectral image, which mitigates the computational burden. The second stage concentrates on high-fidelity compression, efficiently encoding both spatial and spectral information within the sampled band clusters. In the final stage, a reconstruction network compensates for sampling-induced losses to precisely restore the original spectral details. The proposed framework, known as ARM-Net, is evaluated on seven mixed hyperspectral datasets. Compared to state-of-the-art methods, ARM-Net achieves an overall improvement of approximately 1–2 dB in both the peak signal-to-noise ratio and multiscale structural similarity index measure, as well as a reduction in the average spectral angle mapper of approximately 0.1.
A Multiple Attention Convolutional Neural Networks for Diesel Engine Fault Diagnosis
Fault diagnosis can improve the safety and reliability of diesel engines. An end-to-end method based on a multi-attention convolutional neural network (MACNN) is proposed for accurate and efficient diesel engine fault diagnosis. By optimizing the arrangement and kernel sizes of the channel and spatial attention modules, the feature extraction capability is improved, yielding an improved convolutional block attention module (ICBAM). Vibration signal features are acquired using a feature extraction model that alternates between the convolutional neural network (CNN) and ICBAM. The feature map is then recombined to reconstruct the sequence order information, and a self-attention mechanism (SAM) is applied to learn the recombined sequence features directly. A Swish activation function is introduced to address the "dead ReLU" problem and improve accuracy, and a dynamic learning rate curve is designed to improve the convergence of the model. A diesel engine fault simulation experiment is carried out with three fault types (abnormal valve clearance, abnormal rail pressure, and insufficient fuel supply), each with several degrees of severity. The comparison results show that the accuracy of MACNN on the eight-class fault dataset at different speeds exceeds 97%, and the testing time of the MACNN is much less than the engine running time for one work cycle. The proposed end-to-end fault diagnosis method therefore has good application prospects.
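For reference, the Swish activation mentioned above is simply x·sigmoid(x); a minimal sketch (PyTorch already provides this as nn.SiLU):

```python
import torch
import torch.nn as nn

class Swish(nn.Module):
    """Swish activation: x * sigmoid(x); unlike ReLU it keeps a small gradient
    for negative inputs, which mitigates the 'dead ReLU' problem."""
    def forward(self, x):
        return x * torch.sigmoid(x)

x = torch.linspace(-3, 3, 7)
print(Swish()(x))   # equivalent to torch.nn.functional.silu(x)
```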
A Spatial Feature-Enhanced Attention Neural Network with High-Order Pooling Representation for Application in Pest and Disease Recognition
With the development of advanced information and intelligence technologies, precision agriculture has become an effective solution for monitoring and preventing crop pests and diseases. However, pest and disease recognition in precision agriculture applications is essentially a fine-grained image classification task, which aims to learn effective discriminative features that can identify the subtle differences among visually similar samples. This remains challenging for existing standard models, which suffer from oversized parameter counts and low accuracy. Therefore, in this paper we propose a feature-enhanced attention neural network (Fe-Net) to handle fine-grained image recognition of crop pests and diseases in innovative agronomy practices. The model is built on an improved CSP-stage backbone network, which offers abundant channel-shuffled features of various dimensions and sizes. A spatial feature-enhanced attention module is then added to exploit the spatial interrelationships between different semantic regions. Finally, the proposed Fe-Net employs a higher-order pooling module that mines more highly representative features by computing the square root of the covariance matrix of feature elements. The whole architecture is trained efficiently end to end without additional manipulation. In comparative experiments on the CropDP-181 dataset, the proposed Fe-Net achieves a Top-1 accuracy of 85.29% with an average recognition time of only 71 ms, outperforming existing methods. Further experimental evidence shows that our approach strikes a balance between model performance and parameter count, making it suitable for practical deployment in precision agriculture applications.
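The higher-order pooling step above (square root of the covariance matrix of features) can be sketched generically as below; this uses an eigendecomposition-based matrix square root for illustration and is not Fe-Net's exact module.

```python
import torch

def second_order_pooling(feat: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """feat: (B, C, H, W) -> (B, C, C) square root of the channel covariance."""
    B, C, H, W = feat.shape
    x = feat.flatten(2)                           # (B, C, H*W)
    x = x - x.mean(dim=2, keepdim=True)           # center each channel
    cov = x @ x.transpose(1, 2) / (H * W - 1)     # (B, C, C) covariance
    cov = cov + eps * torch.eye(C, device=feat.device)
    evals, evecs = torch.linalg.eigh(cov)         # symmetric eigendecomposition
    sqrt_evals = evals.clamp_min(0).sqrt()
    return evecs @ torch.diag_embed(sqrt_evals) @ evecs.transpose(1, 2)

pooled = second_order_pooling(torch.randn(2, 32, 14, 14))
print(pooled.shape)                               # torch.Size([2, 32, 32])
```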
An enhanced deep learning-based framework for diagnosing apple leaf diseases
Timely and accurate identification of apple leaf diseases is important for protecting crop production and sustaining agriculture. This paper introduces E-YOLOv8, a lightweight, improved version of YOLOv8 that can run in real time on limited resources. The model makes three key contributions: (1) GhostConv and C3 fusion to reduce redundant feature extraction and computational cost, (2) CBAM attention and a specifically designed FPN to maximize multi-scale feature fusion and small-lesion detection, and (3) large-scale evaluation on apple leaf disease datasets, together with ablation experiments and operational testing on edge devices, to verify the accuracy and viability of the model. In experiments, E-YOLOv8 reaches 93.9% mAP@0.5 using 5.3 GFLOPs and 1.8 M parameters, a factor of 33.9 smaller than YOLOv8l. These results indicate that E-YOLOv8 outperforms recent state-of-the-art detectors while remaining applicable to practical real-world agricultural tasks.
Study on Detection and Recognition of Traffic Lights Based on Improved YOLOv4
YOLOv4 suffers from a deep backbone network, a large model size, slow inference on mobile terminals, and low detection accuracy for small targets, which makes it difficult to detect and recognize traffic lights accurately and in real time. To resolve these issues, a traffic light recognition method based on an improved YOLOv4 is proposed. The lightweight ShuffleNetv2 network replaces the CSPDarkNet53 backbone of YOLOv4 to satisfy the requirements of a mobile terminal. A reformed k-means clustering algorithm is applied to generate anchor boxes, avoiding sensitivity to outliers and initial values. A novel attention mechanism named CS2A is added to enhance the extraction of effective features, and multiple data augmentation methods are combined to improve the generalization ability of the model. Together, these changes enable the detection and recognition of traffic lights. The S2TLD dataset is used for training and testing, and the results show that recognition accuracy and model size are greatly improved. A self-made dataset is also used for training and testing: compared with the conventional YOLOv4, the recognition accuracy of the proposed algorithm for traffic light state information increases by 1.79%, and the model size decreases by 81.97%. Real-vehicle tests in appropriate scenes show that the detection speed of the presented algorithm increases by 16%, and recognition of small targets improves by 37% compared with the conventional YOLOv4.
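Anchor-box generation via k-means with an IoU-based distance, as commonly used for YOLO-family models, can be sketched as below; this is a generic illustration rather than the paper's reformed clustering algorithm, and the box sizes are made-up values.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between (N, 2) box width-heights and (K, 2) anchor width-heights,
    assuming all boxes share the same top-left corner."""
    inter = np.minimum(boxes[:, None, 0], anchors[None, :, 0]) * \
            np.minimum(boxes[:, None, 1], anchors[None, :, 1])
    union = boxes[:, 0:1] * boxes[:, 1:2] + anchors[None, :, 0] * anchors[None, :, 1] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)   # distance = 1 - IoU
        for j in range(k):
            if np.any(assign == j):
                anchors[j] = np.median(boxes[assign == j], axis=0)
    return anchors[np.argsort(anchors.prod(axis=1))]          # sort by area

# Hypothetical traffic-light box sizes (width, height) in pixels.
rng = np.random.default_rng(1)
boxes = np.abs(rng.normal(loc=[18, 40], scale=[6, 12], size=(500, 2)))
print(kmeans_anchors(boxes, k=9))
```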