2,579 result(s) for "convolutional attention"
Improving Malaria diagnosis through interpretable customized CNNs architectures
Malaria, which is spread via female Anopheles mosquitoes and is brought on by the Plasmodium parasite, persists as a serious illness, especially in areas with a high mosquito density. Traditional detection techniques, like examining blood samples with a microscope, tend to be labor-intensive, unreliable and necessitate specialized individuals. To address these challenges, we employed several customized convolutional neural networks (CNNs), including Parallel convolutional neural network (PCNN), Soft Attention Parallel Convolutional Neural Networks (SPCNN), and Soft Attention after Functional Block Parallel Convolutional Neural Networks (SFPCNN), to improve the effectiveness of malaria diagnosis. Among these, the SPCNN emerged as the most successful model, outperforming all other models in evaluation metrics. The SPCNN achieved a precision of 99.38 ± 0.21%, recall of 99.37 ± 0.21%, F1 score of 99.37 ± 0.21%, accuracy of 99.37 ± 0.30%, and an area under the receiver operating characteristic curve (AUC) of 99.95 ± 0.01%, demonstrating its robustness in detecting malaria parasites. Furthermore, we employed various transfer learning (TL) algorithms, including VGG16, ResNet152, MobileNetV3Small, EfficientNetB6, EfficientNetB7, DenseNet201, Vision Transformer (ViT), Data-efficient Image Transformer (DeiT), ImageIntern, and Swin Transformer (versions v1 and v2). The proposed SPCNN model surpassed all these TL methods in every evaluation measure. The SPCNN model, with 2.207 million parameters and a size of 26 MB, is more complex than PCNN but simpler than SFPCNN. Despite this, SPCNN exhibited the fastest testing times (0.00252 s), making it more computationally efficient than both PCNN and SFPCNN. We assessed model interpretability using feature activation maps, Gradient-weighted Class Activation Mapping (Grad-CAM) and SHapley Additive exPlanations (SHAP) visualizations for all three architectures, illustrating why SPCNN outperformed the others.
The findings from our experiments show a significant improvement in malaria parasite diagnosis. The proposed approach outperforms traditional manual microscopy in terms of both accuracy and speed. This study highlights the importance of utilizing cutting-edge technologies to develop robust and effective diagnostic tools for malaria prevention.
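For readers unfamiliar with the metrics quoted in this abstract, precision, recall, and F1 follow the standard definitions over true positives, false positives, and false negatives. The sketch below uses made-up counts purely for illustration; they are not taken from the paper, and the function name is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard binary-classification metrics from confusion counts."""
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Illustrative counts only (assumed, not from the study).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it sits closer to the lower of precision and recall, which is why the three reported values being nearly equal indicates balanced errors.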
Multi-scale conv-attention U-Net for medical image segmentation
U-Net-based network structures are widely used in medical image segmentation. However, effectively capturing multi-scale features and spatial context information of complex organizational structures remains a challenge. To address this, we propose a novel network structure based on the U-Net backbone. This model integrates the Adaptive Convolution (AC) module, Multi-Scale Learning (MSL) module, and Conv-Attention module to enhance feature expression ability and segmentation performance. The AC module dynamically adjusts the convolutional kernel through an adaptive convolutional layer. This enables the model to extract features of different shapes and scales adaptively, further improving its performance in complex scenarios. The MSL module is designed for multi-scale information fusion. It effectively aggregates fine-grained and high-level semantic features from different resolutions, creating rich multi-scale connections between the encoding and decoding processes. On the other hand, the Conv-Attention module incorporates an efficient attention mechanism into the skip connections. It captures global context information using a low-dimensional proxy for high-dimensional data. This approach reduces computational complexity while maintaining effective spatial and channel information extraction. Experimental validation on the CVC-ClinicDB, MICCAI 2023 Tooth, and ISIC2017 datasets demonstrates that our proposed MSCA-UNet significantly improves segmentation accuracy and model robustness. At the same time, it remains lightweight and outperforms existing segmentation methods.
Efficient Gearbox Fault Diagnosis Based on Improved Multi-Scale CNN with Lightweight Convolutional Attention
As a core transmission component of modern industrial equipment, the gearbox has an operating status that significantly affects the reliability and service life of major machinery. In this paper, we propose an intelligent diagnosis framework based on Empirical Mode Decomposition and multimodal feature co-optimization, and construct a fault diagnosis model by fusing a multi-scale convolutional neural network with a lightweight convolutional attention model. The framework extracts the multi-band features of vibration signals through the improved multi-scale convolutional neural network, which significantly enhances adaptability to complex working conditions (variable rotational speed, strong noise). At the same time, the lightweight convolutional attention mechanism replaces the multi-head attention of the traditional Transformer, which greatly reduces computational complexity while preserving accuracy and enables efficient, lightweight local–global feature modeling. The lightweight convolutional attention uses a dynamic convolutional kernel generation strategy to adaptively capture local features in the time domain, and is combined with grouped convolution to further improve computational efficiency. In addition, parametric rectified linear units (PReLU) are introduced to retain fault-sensitive negative information, which enhances the model’s ability to detect weak faults. The experimental findings demonstrate that the proposed model achieves an accuracy greater than 98.9%, highlighting its diagnostic accuracy and robustness. Moreover, compared to other fault diagnosis methods, the model exhibits superior performance under complex working conditions.
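The point about retaining negative information can be illustrated with a parametric rectified linear unit (PReLU), which passes negative inputs through a small slope instead of zeroing them as ReLU does. This is a minimal sketch: in a real network the slope is a learned parameter, and the value 0.25 used here is only an assumed illustrative setting, not one reported in the abstract.

```python
def relu(x):
    """Plain ReLU: negative inputs are discarded (mapped to 0)."""
    return x if x > 0 else 0.0


def prelu(x, slope=0.25):
    """PReLU: negative inputs are scaled by `slope` rather than discarded.

    `slope` is learned during training in practice; 0.25 is an assumption.
    """
    return x if x > 0 else slope * x


samples = [-2.0, -0.5, 0.0, 3.0]
print([relu(v) for v in samples])   # negative samples collapse to 0.0
print([prelu(v) for v in samples])  # negative samples keep sign and relative size
```

With ReLU, the two negative samples become indistinguishable; with PReLU they remain distinct, which is the property the abstract credits for detecting weak faults.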
Infrared Image Object Detection Algorithm for Substation Equipment Based on Improved YOLOv8
Substations play a crucial role in the proper operation of power systems. Online fault diagnosis of substation equipment is critical for improving the safety and intelligence of power systems. Detecting the target equipment from an infrared image of substation equipment constitutes a pivotal step in online fault diagnosis. To address the challenges of missed detection, false detection, and low detection accuracy in the infrared image object detection in substation equipment, this paper proposes an infrared image object detection algorithm for substation equipment based on an improved YOLOv8n. Firstly, the DCNC2f module is built by combining deformable convolution with the C2f module, and the C2f module in the backbone is replaced by the DCNC2f module to enhance the ability of the model to extract relevant equipment features. Subsequently, the multi-scale convolutional attention module is introduced to improve the ability of the model to capture multi-scale information and enhance detection accuracy. The experimental results on the infrared image dataset of the substation equipment demonstrate that the improved YOLOv8n model achieves mAP@0.5 and mAP@0.5:0.95 of 92.7% and 68.5%, respectively, representing a 2.6% and 3.9% improvement over the baseline model. The improved model significantly enhances object detection accuracy and exhibits superior performance in infrared image object detection in substation equipment.
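The mAP@0.5 and mAP@0.5:0.95 figures depend on intersection over union (IoU): a predicted box counts as a match only when its IoU with a ground-truth box reaches the threshold (0.5 for mAP@0.5; thresholds from 0.5 to 0.95 are averaged for mAP@0.5:0.95). The sketch below shows only the IoU step, with boxes given as hypothetical `(x1, y1, x2, y2)` tuples; it is not the paper's evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # 0 if boxes do not overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred, truth) >= threshold


print(iou((0, 0, 10, 10), (0, 0, 10, 5)))       # 0.5
print(is_match((0, 0, 2, 2), (1, 1, 3, 3)))     # False (IoU is 1/7)
```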
ParMamba: A Parallel Architecture Using CNN and Mamba for Brain Tumor Classification
Brain tumors, one of the most lethal diseases with low survival rates, require early detection and accurate diagnosis to enable effective treatment planning. While deep learning architectures, particularly Convolutional Neural Networks (CNNs), have shown significant performance improvements over traditional methods, they struggle to capture the subtle pathological variations between different brain tumor types. Recent attention-based models have attempted to address this by focusing on global features, but they come with high computational costs. To address these challenges, this paper introduces a novel parallel architecture, ParMamba, which integrates Convolutional Attention Patch Embedding (CAPE) and the ConvMamba block, which comprises a CNN, Mamba, and a channel enhancement module. The design of the ConvMamba block enhances the model's ability to capture both local features and long-range dependencies, improving the detection of subtle differences between tumor types. The channel enhancement module refines feature interactions across channels. Additionally, CAPE is employed as a downsampling layer that extracts both local and global features, further improving classification accuracy. Experimental results on two publicly available brain tumor datasets demonstrate that ParMamba achieves classification accuracies of 99.62% and 99.35%, outperforming existing methods. Notably, ParMamba surpasses vision transformers (ViT) by 1.37% in accuracy, with a throughput improvement of over 30%. These results demonstrate that ParMamba delivers superior performance while operating faster than traditional attention-based methods.
A Fault Diagnosis Method for Pump Station Units Based on CWT-MHA-CNN Model for Sustainable Operation of Inter-Basin Water Transfer Projects
Inter-basin water transfer projects are core infrastructure for achieving sustainable water resource allocation and addressing regional water scarcity, and pumping station units, as their critical energy-consuming and operation-controlling components, are vital to the projects’ sustainable performance. With the growing complexity and scale of these projects, pumping station units have become more intricate, leading to a gradual rise in failure rates. However, existing fault diagnosis methods are relatively backward, failing to promptly detect potential faults—this not only threatens operational safety but also undermines sustainable development goals: equipment failures cause excessive energy consumption (violating energy efficiency requirements for sustainability), unplanned downtime disrupts stable water supply (impairing reliable water resource access), and even leads to water waste or environmental risks. To address this sustainability-oriented challenge, this paper focuses on the fault characteristics of pumping station units and proposes a comprehensive and accurate fault diagnosis model, aiming to enhance the sustainability of water transfer projects through technical optimization. The model utilizes advanced algorithms and data processing technologies to accurately identify fault types, thereby laying a technical foundation for the low-energy, reliable, and sustainable operation of pumping stations. Firstly, continuous wavelet transform (CWT) converts one-dimensional time-domain signals into two-dimensional time-frequency graphs, visually displaying dynamic signal characteristics to capture early fault features that may cause energy waste. Next, the multi-head attention mechanism (MHA) segments the time-frequency graphs and correlates feature-location information via independent self-attention layers, accurately capturing the temporal correlation of fault evolution—this enables early fault warning to avoid prolonged inefficient operation and energy loss. 
Finally, the improved convolutional neural network (CNN) layer integrates feature information and temporal correlation, outputting predefined fault probabilities for accurate fault determination. Experimental results show the model effectively solves the difficulty of feature extraction in pumping station fault diagnosis, considers fault evolution timeliness, and significantly improves prediction accuracy and anti-noise performance. Comparative experiments with three existing methods verify its superiority. Critically, this model strengthens sustainability in three key ways: (1) early fault detection reduces unplanned downtime, ensuring stable water supply (a core sustainable water resource goal); (2) accurate fault localization cuts unnecessary maintenance energy consumption, aligning with energy-saving requirements; (3) reduced equipment failure risks minimize water waste and environmental impacts. Thus, it not only provides a new method for pumping station fault diagnosis but also offers technical support for the sustainable operation of water conservancy infrastructure, contributing to global sustainable development goals (SDGs) related to water and energy.
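The first stage described in this abstract, turning a one-dimensional signal into a two-dimensional time-frequency map with a continuous wavelet transform, can be sketched as follows. This is a minimal, unoptimized illustration that assumes a complex Morlet wavelet with centre frequency `omega0 = 5.0` and a hand-picked set of scales; the abstract does not state which wavelet or scales the authors used.

```python
import cmath
import math


def morlet_cwt(signal, scales, omega0=5.0):
    """Return a len(scales) x len(signal) grid of CWT magnitudes.

    Uses a complex Morlet wavelet (an assumption) truncated at +/- 4 scale units.
    """
    n = len(signal)
    rows = []
    for s in scales:
        half = int(4 * s)
        offsets = range(-half, half + 1)
        # Conjugated, scaled wavelet samples with 1/sqrt(s) normalization.
        kernel = [
            cmath.exp(-1j * omega0 * (k / s)) * math.exp(-0.5 * (k / s) ** 2) / math.sqrt(s)
            for k in offsets
        ]
        row = []
        for b in range(n):  # b is the time shift (column of the 2-D map)
            acc = 0j
            for w, k in zip(kernel, offsets):
                t = b + k
                if 0 <= t < n:  # samples outside the signal are treated as zero
                    acc += signal[t] * w
            row.append(abs(acc))
        rows.append(row)
    return rows


def scale_energy(tf_map):
    """Total energy per scale, useful for spotting the dominant band."""
    return [sum(v * v for v in row) for row in tf_map]


# A pure tone at 0.05 cycles per sample as a stand-in for a vibration signal.
tone = [math.sin(2 * math.pi * 0.05 * t) for t in range(256)]
scales = [4, 8, 16, 32]
energy = scale_energy(morlet_cwt(tone, scales))
print(scales[energy.index(max(energy))])
```

Each row of the result corresponds to one scale (roughly, one frequency band) and each column to one time step, which is what lets a downstream image-style model such as a CNN consume the signal. For the test tone, the scale whose centre frequency `omega0 / (2 * pi * s)` is nearest 0.05 should carry the most energy.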
ROAST-IoT: A Novel Range-Optimized Attention Convolutional Scattered Technique for Intrusion Detection in IoT Networks
The Internet of Things (IoT) has significantly benefited several businesses, but because of the volume and complexity of IoT systems, there are also new security issues. Intrusion detection systems (IDSs) support both the security posture of IoT devices and their defense against intrusions. IoT systems have recently utilized machine learning (ML) techniques widely for IDSs. The primary deficiencies in existing IoT security frameworks are their inadequate intrusion detection capabilities, significant latency, and prolonged processing time, leading to undesirable delays. To address these issues, this work proposes a novel range-optimized attention convolutional scattered technique (ROAST-IoT) to protect IoT networks from modern threats and intrusions. This system uses the scattered range feature selection (SRFS) model to choose the most crucial and trustworthy properties from the supplied intrusion data. After that, the attention-based convolutional feed-forward network (ACFN) technique is used to recognize the intrusion class. In addition, the loss function is estimated using the modified dingo optimization (MDO) algorithm to ensure the maximum accuracy of the classifier. To evaluate and compare the performance of the proposed ROAST-IoT system, we have utilized popular intrusion datasets such as ToN-IoT, IoT-23, UNSW-NB 15, and Edge-IIoT. The analysis of the results shows that the proposed ROAST technique outperformed all existing cutting-edge intrusion detection systems, with an accuracy of 99.15% on the IoT-23 dataset, 99.78% on the ToN-IoT dataset, 99.88% on the UNSW-NB 15 dataset, and 99.45% on the Edge-IIoT dataset. On average, the ROAST-IoT system achieved a high AUC-ROC of 0.998, demonstrating its capacity to distinguish between legitimate data and attack traffic. These results indicate that the ROAST-IoT algorithm provides an effective and reliable intrusion detection mechanism against cyberattacks on IoT systems.
A Remote Sensing Image Super-Resolution Reconstruction Model Combining Multiple Attention Mechanisms
Remote sensing images are characterized by high complexity, significant scale variations, and abundant details, which present challenges for existing deep learning-based super-resolution reconstruction methods. These algorithms often exhibit limited convolutional receptive fields and thus struggle to establish global contextual information, which can lead to an inadequate utilization of both global and local details and limited generalization capabilities. To address these issues, this study introduces a novel multi-branch residual hybrid attention block (MBRHAB). This block is part of a proposed super-resolution reconstruction model for remote sensing data, which incorporates various attention mechanisms to enhance performance. First, the model employs window-based multi-head self-attention to model long-range dependencies in images. A multi-branch convolution module (MBCM) is then constructed to enhance the convolutional receptive field for improved representation of global information. Convolutional attention is subsequently combined across channels and spatial dimensions to strengthen associations between different features and areas containing crucial details, thereby augmenting local semantic information. Finally, the model adopts a parallel design to enhance computational efficiency. Generalization performance was assessed using a cross-dataset approach involving two training datasets (NWPU-RESISC45 and PatternNet) and a third test dataset (UCMerced-LandUse). Experimental results confirmed that the proposed method surpassed existing super-resolution algorithms, including Bicubic interpolation, SRCNN, ESRGAN, Real-ESRGAN, IRN, and DSSR, on the PSNR and SSIM metrics across various magnification scales.
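The idea of combining attention across channels and spatial dimensions can be sketched generically: first reweight each channel, then reweight each pixel location. The version below is a deliberately simplified, parameter-free stand-in in which each gate is a sigmoid of an average. Learned modules of this kind normally compute their gates with small trainable layers such as an MLP and a convolution, and nothing here is taken from the paper's MBRHAB design. All function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gates(fmap):
    """One gate per channel: sigmoid of that channel's global average."""
    gates = []
    for channel in fmap:
        count = len(channel) * len(channel[0])
        gates.append(sigmoid(sum(sum(row) for row in channel) / count))
    return gates


def spatial_gates(fmap):
    """One gate per pixel: sigmoid of the average across channels."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [sigmoid(sum(fmap[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def channel_spatial_gating(fmap):
    """Apply channel gating, then spatial gating, to a [C][H][W] feature map."""
    cg = channel_gates(fmap)
    scaled = [[[v * cg[k] for v in row] for row in ch] for k, ch in enumerate(fmap)]
    sg = spatial_gates(scaled)
    return [
        [[v * sg[i][j] for j, v in enumerate(row)] for i, row in enumerate(ch)]
        for ch in scaled
    ]


# Toy 2-channel, 2x2 feature map (illustrative values only).
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
print(channel_gates(features))
print(channel_spatial_gating(features))
```

Because every gate lies between 0 and 1, the output never exceeds the input in magnitude; the mechanism only decides how much of each channel and location to keep.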
UNeXt: An Efficient Network for the Semantic Segmentation of High-Resolution Remote Sensing Images
The application of deep neural networks for the semantic segmentation of remote sensing images is a significant research area within the field of the intelligent interpretation of remote sensing data. The semantic segmentation of remote sensing images holds great practical value in urban planning, disaster assessment, the estimation of carbon sinks, and other related fields. With the continuous advancement of remote sensing technology, the spatial resolution of remote sensing images is gradually increasing. This increase in resolution brings about challenges such as significant changes in the scale of ground objects, redundant information, and irregular shapes within remote sensing images. Current methods leverage Transformers to capture global long-range dependencies. However, the use of Transformers introduces higher computational complexity and is prone to losing local details. In this paper, we propose UNeXt (UNet+ConvNeXt+Transformer), a real-time semantic segmentation model tailored for high-resolution remote sensing images. To achieve efficient segmentation, UNeXt uses the lightweight ConvNeXt-T as the encoder and a lightweight decoder, Transnext, which combines a Transformer and CNN (Convolutional Neural Networks) to capture global information while avoiding the loss of local details. Furthermore, in order to more effectively utilize spatial and channel information, we propose a SCFB (SC Feature Fuse Block) to reduce computational complexity while enhancing the model’s recognition of complex scenes. A series of ablation experiments and comprehensive comparative experiments demonstrate that our method not only runs faster than state-of-the-art (SOTA) lightweight models but also achieves higher accuracy. Specifically, our proposed UNeXt achieves 85.2% and 82.9% mIoUs on the Vaihingen and Gaofen5 (GID5) datasets, respectively, while maintaining 97 fps for 512 × 512 inputs on a single NVIDIA GTX 4090 GPU, outperforming other SOTA methods.
Improving dental disease diagnosis using a cross attention based hybrid model of DeiT and CoAtNet
Accurate dental diagnosis is essential for effective treatment planning and improving patient outcomes, particularly in identifying various dental diseases, such as cavities, fillings, implants, and impacted teeth. This study proposes a new hybrid model that integrates the strengths of the data-efficient image transformer (DeiT) and convolutional attention network (CoAtNet) to enhance diagnostic accuracy. Our approach’s first step involves preprocessing dental radiographic images to improve their quality and enhance feature extraction. The model employs a cross-attention fusion mechanism that aligns and merges feature representations from DeiT and CoAtNet, leveraging their unique capabilities to capture relevant patterns in the data. A stacking classifier, comprising base classifiers such as support vector machines (SVM), eXtreme gradient boosting (XGBoost), and multilayer perceptron (MLP), optimizes classification performance by combining predictions from multiple models. The proposed model demonstrates superior performance, achieving an accuracy of 96%, a precision of 96.5%, 96.1% for sensitivity, 96.4% for specificity, and 96.3% for Dice similarity coefficient, thus showcasing its effectiveness in the automatic diagnosis of dental diseases.