Search Results Heading

MBRLSearchResults

mbrl.module.common.modules.added.book.to.shelf
Title added to your shelf!
View what I already have on My Shelf.
Oops! Something went wrong.
Oops! Something went wrong.
While trying to add the title to your shelf something went wrong :( Kindly try again later!
Are you sure you want to remove the book from the shelf?
Oops! Something went wrong.
Oops! Something went wrong.
While trying to remove the title from your shelf something went wrong :( Kindly try again later!
    Done
    Filters
    Reset
  • Discipline
      Discipline
      Clear All
      Discipline
  • Is Peer Reviewed
      Is Peer Reviewed
      Clear All
      Is Peer Reviewed
  • Item Type
      Item Type
      Clear All
      Item Type
  • Subject
      Subject
      Clear All
      Subject
  • Year
      Year
      Clear All
      From:
      -
      To:
  • More Filters
44 result(s) for "depth-wise separable convolution"
Sort by:
SwinBTS: A Method for 3D Multimodal Brain Tumor Segmentation Using Swin Transformer
Brain tumor semantic segmentation is a critical medical image processing work, which aids clinicians in diagnosing patients and determining the extent of lesions. Convolutional neural networks (CNNs) have demonstrated exceptional performance in computer vision tasks in recent years. For 3D medical image tasks, deep convolutional neural networks based on an encoder–decoder structure and skip-connection have been frequently used. However, CNNs have the drawback of being unable to learn global and remote semantic information well. On the other hand, the transformer has recently found success in natural language processing and computer vision as a result of its usage of a self-attention mechanism for global information modeling. For demanding prediction tasks, such as 3D medical picture segmentation, local and global characteristics are critical. We propose SwinBTS, a new 3D medical picture segmentation approach, which combines a transformer, convolutional neural network, and encoder–decoder structure to define the 3D brain tumor semantic segmentation job as a sequence-to-sequence prediction challenge in this research. To extract contextual data, the 3D Swin Transformer is utilized as the network’s encoder and decoder, and convolutional operations are employed for upsampling and downsampling. Finally, we achieve segmentation results using an improved Transformer module that we built for increasing detail feature extraction. Extensive experimental results on the BraTS 2019, BraTS 2020, and BraTS 2021 datasets reveal that SwinBTS outperforms state-of-the-art 3D algorithms for brain tumor segmentation on 3D MRI scanned images.
A Multi-Scale U-Shaped Convolution Auto-Encoder Based on Pyramid Pooling Module for Object Recognition in Synthetic Aperture Radar Images
Although unsupervised representation learning (RL) can tackle the performance deterioration caused by limited labeled data in synthetic aperture radar (SAR) object classification, the neglected discriminative detailed information and the ignored distinctive characteristics of SAR images can lead to performance degradation. In this paper, an unsupervised multi-scale convolution auto-encoder (MSCAE) was proposed which can simultaneously obtain the global features and local characteristics of targets with its U-shaped architecture and pyramid pooling modules (PPMs). The compact depth-wise separable convolution and the deconvolution counterpart were devised to decrease the trainable parameters. The PPM and the multi-scale feature learning scheme were designed to learn multi-scale features. Prior knowledge of SAR speckle was also embedded in the model. The reconstruction loss of the MSCAE was measured by the structural similarity index metric (SSIM) of the reconstructed data and the images filtered by the improved Lee sigma filter. A speckle suppression restriction was also added in the objective function to guarantee that the speckle suppression procedure would take place in the feature learning stage. Experimental results with the MSTAR dataset under the standard operating condition and several extended operating conditions demonstrated the effectiveness of the proposed model in SAR object classification tasks.
Explainable deep learning model for automatic mulberry leaf disease classification
Mulberry leaves feed Bombyx mori silkworms to generate silk thread. Diseases that affect mulberry leaves have reduced crop and silk yields in sericulture, which produces 90% of the world’s raw silk. Manual leaf disease identification is tedious and error-prone. Computer vision can categorize leaf diseases early and overcome the challenges of manual identification. No mulberry leaf deep learning (DL) models have been reported. Therefore, in this study, two types of leaf diseases: leaf rust and leaf spot, with disease-free leaves, were collected from two regions of Bangladesh. Sericulture experts annotated the leaf images. The images were pre-processed, and 6,000 synthetic images were generated using typical image augmentation methods from the original 764 training images. Additional 218 and 109 images were employed for testing and validation respectively. In addition, a unique lightweight parallel depth-wise separable CNN model, PDS-CNN was developed by applying depth-wise separable convolutional layers to reduce parameters, layers, and size while boosting classification performance. Finally, the explainable capability of PDS-CNN is obtained through the use of SHapley Additive exPlanations (SHAP) evaluated by a sericulture specialist. The proposed PDS-CNN outperforms well-known deep transfer learning models, achieving an optimistic accuracy of 95.05 ± 2.86% for three-class classifications and 96.06 ± 3.01% for binary classifications with only 0.53 million parameters, 8 layers, and a size of 6.3 megabytes. Furthermore, when compared with other well-known transfer models, the proposed model identified mulberry leaf diseases with higher accuracy, fewer factors, fewer layers, and lower overall size. The visually expressive SHAP explanation images validate the models’ findings aligning with the predictions made the sericulture specialist. Based on these findings, it is possible to conclude that the explainable AI (XAI)-based PDS-CNN can provide sericulture specialists with an effective tool for accurately categorizing mulberry leaves.
Depth-Wise Separable Convolution Neural Network with Residual Connection for Hyperspectral Image Classification
The neural network-based hyperspectral images (HSI) classification model has a deep structure, which leads to the increase of training parameters, long training time, and excessive computational cost. The deepened network models are likely to cause the problem of gradient disappearance, which limits further improvement for its classification accuracy. To this end, a residual unit with fewer training parameters were constructed by combining the residual connection with the depth-wise separable convolution. With the increased depth of the network, the number of output channels of each residual unit increases linearly with a small amplitude. The deepened network can continuously extract the spectral and spatial features while building a cone network structure by stacking the residual units. At the end of executing the model, a 1 × 1 convolution layer combined with a global average pooling layer can be used to replace the traditional fully connected layer to complete the classification with reduced parameters needed in the network. Experiments were conducted on three benchmark HSI datasets: Indian Pines, Pavia University, and Kennedy Space Center. The overall classification accuracy was 98.85%, 99.58%, and 99.96% respectively. Compared with other classification methods, the proposed network model guarantees a higher classification accuracy while spending less time on training and testing sample sites.
Lightweight and Efficient Tiny-Object Detection Based on Improved YOLOv8n for UAV Aerial Images
The task of multiple-tiny-object detection from diverse perspectives in unmanned aerial vehicles (UAVs) using onboard edge devices is a significant and complex challenge within computer vision. In order to address this challenge, we propose a lightweight and efficient tiny-object-detection algorithm named LE-YOLO, based on the YOLOv8n architecture. To improve the detection performance and optimize the model efficiency, we present the LHGNet backbone, a more extensive feature extraction network, integrating depth-wise separable convolution and channel shuffle modules. This integration facilitates a thorough exploration of the inherent features within the network at deeper layers, promoting the fusion of local detail information and channel characteristics. Furthermore, we introduce the LGS bottleneck and LGSCSP fusion module incorporated into the neck, aiming to decrease the computational complexity while preserving the detector’s accuracy. Additionally, we enhance the detection accuracy by modifying its structure and the size of the feature maps. These improvements significantly enhance the model’s capability to capture tiny objects. The proposed LE-YOLO detector is examined in ablation and comparative experiments on the VisDrone2019 dataset. In contrast to YOLOv8n, the proposed LE-YOLO model achieved a 30.0% reduction in the parameter count, accompanied by a 15.9% increase in the mAP(0.5). These comprehensive experiments indicate that our approach can significantly enhance the detection accuracy and optimize the model efficiency through the organic combination of our suggested enhancements.
Depth-Wise Separable Convolution Attention Module for Garbage Image Classification
Currently, how to deal with the massive garbage produced by various human activities is a hot topic all around the world. In this paper, a preliminary and essential step is to classify the garbage into different categories. However, the mainstream waste classification mode relies heavily on manual work, which consumes a lot of labor and is very inefficient. With the rapid development of deep learning, convolutional neural networks (CNN) have been successfully applied to various application fields. Therefore, some researchers have directly adopted CNNs to classify garbage through their images. However, compared with other images, the garbage images have their own characteristics (such as inter-class similarity, intra-class variance and complex background). Thus, neglecting these characteristics would impair the classification accuracy of CNN. To overcome the limitations of existing garbage image classification methods, a Depth-wise Separable Convolution Attention Module (DSCAM) is proposed in this paper. In DSCAM, the inherent relationships of channels and spatial positions in garbage image features are captured by two attention modules with depth-wise separable convolutions, so that our method could only focus on important information and ignore the interference. Moreover, we also adopt a residual network as the backbone of DSCAM to enhance its discriminative ability. We conduct the experiments on five garbage datasets. The experimental results demonstrate that the proposed method could effectively classify the garbage images and that it outperforms some classical methods.
Image segmentation and coverage estimation of deep-sea polymetallic nodules based on lightweight deep learning model
Deep-sea polymetallic nodules, abundant in critical metal elements, are a vital strategic mineral resource. Accordingly, the prompt, accurate, and high-speed acquisition of parameters and distribution data for these nodules is crucial for the effective exploration, evaluation, and identification of valuable deposits. Studies show that one of the primary parameters for assessing polymetallic nodules is the Coverage Rate. For real-time, accurate, and efficient computation of this parameter, this article proposes a streamlined segmentation model named YOLOv7-PMN. This model is particularly designed for analyzing seafloor video data. The model substitutes the YOLOv7 backbone with the lightweight feature extraction framework of MobileNetV3-Small and integrates multi-level Squeeze-and-Excitation attention mechanisms. These changes enhance detection accuracy, speed up inference, and reduce the model’s overall size. The head network utilizes depth-wise separable convolution modules, significantly decreasing the number of model parameters. Compared to the original YOLOv7, the YOLOv7-PMN shows improved detection and segmentation performance for nodules of varying sizes. On the same dataset, the recall rate for nodules increases by 3% over the YOLOv7 model. Model parameters are cut by 61.78%, memory usage by the best weights is reduced by 61.15%, and inference speed for detection and segmentation rises to 65.79 FPS, surpassing the 25 FPS video capture rate. The model demonstrates strong generalization capabilities, lowering the requirements for video data quality and reducing dependency on extensive dataset annotations. In summary, YOLOv7-PMN is highly effective in processing seabed images of polymetallic nodules, which are characterized by varying target scales, complex environments, and diverse features. This model holds significant promise for practical application and broad adoption.
A Lightweight and Accurate UAV Detection Method Based on YOLOv4
At present, the UAV (Unmanned Aerial Vehicle) has been widely used both in civilian and military fields. Most of the current object detection algorithms used to detect UAVs require more parameters, and it is difficult to achieve real-time performance. In order to solve this problem while ensuring a high accuracy rate, we further lighten the model and reduce the number of parameters of the model. This paper proposes an accurate and lightweight UAV detection model based on YOLOv4. To verify the effectiveness of this model, we made a UAV dataset, which contains four types of UAVs and 20,365 images. Through comparative experiments and optimization of existing deep learning and object detection algorithms, we found a lightweight model to achieve an efficient and accurate rapid detection of UAVs. First, from the comparison of the one-stage method and the two-stage method, it is concluded that the one-stage method has better real-time performance and considerable accuracy in detecting UAVs. Then, we further compared the one-stage methods. In particular, for YOLOv4, we replaced MobileNet with its backbone network, modified the feature extraction network, and replaced standard convolution with depth-wise separable convolution, which greatly reduced the parameters and realized 82 FPS and 93.52% mAP while ensuring high accuracy and taking into account the real-time performance.
EARDS: EfficientNet and attention-based residual depth-wise separable convolution for joint OD and OC segmentation
Glaucoma is the leading cause of irreversible vision loss. Accurate Optic Disc (OD) and Optic Cup (OC) segmentation is beneficial for glaucoma diagnosis. In recent years, deep learning has achieved remarkable performance in OD and OC segmentation. However, OC segmentation is more challenging than OD segmentation due to its large shape variability and cryptic boundaries that leads to performance degradation when applying the deep learning models to segment OC. Moreover, the OD and OC are segmented independently, or pre-requirement is necessary to extract the OD centered region with pre-processing procedures. In this paper, we suggest a one-stage network named EfficientNet and Attention-based Residual Depth-wise Separable Convolution (EARDS) for joint OD and OC segmentation. In EARDS, EfficientNet-b0 is regarded as an encoder to capture more effective boundary representations. To suppress irrelevant regions and highlight features of fine OD and OC regions, Attention Gate (AG) is incorporated into the skip connection. Also, Residual Depth-wise Separable Convolution (RDSC) block is developed to improve the segmentation performance and computational efficiency. Further, a novel decoder network is proposed by combining AG, RDSC block and Batch Normalization (BN) layer, which is utilized to eliminate the vanishing gradient problem and accelerate the convergence speed. Finally, the focal loss and dice loss as a weighted combination is designed to guide the network for accurate OD and OC segmentation. Extensive experimental results on the Drishti-GS and REFUGE datasets indicate that the proposed EARDS outperforms the state-of-the-art approaches. The code is available at https://github.com/M4cheal/EARDS.
LCGSC-YOLO: a lightweight apple leaf diseases detection method based on LCNet and GSConv module under YOLO framework
In response to the current mainstream deep learning detection methods with a large number of learned parameters and the complexity of apple leaf disease scenarios, the paper proposes a lightweight method and names it LCGSC-YOLO. This method is based on the LCNet(A Lightweight CPU Convolutional Neural Network) and GSConv(Group Shuffle Convolution) module modified YOLO(You Only Look Once) framework. Firstly, the lightweight LCNet is utilized to reconstruct the backbone network, with the purpose of reducing the number of parameters and computations of the model. Secondly, the GSConv module and the VOVGSCSP (Slim-neck by GSConv) module are introduced in the neck network, which makes it possible to minimize the number of model parameters and computations while guaranteeing the fusion capability among the different feature layers. Finally, coordinate attention is embedded in the tail of the backbone and after each VOVGSCSP module to improve the problem of detection accuracy degradation issue caused by model lightweighting. The experimental results show the LCGSC-YOLO can achieve an excellent detection performance with mean average precision of 95.5% and detection speed of 53 frames per second (FPS) on the mixed datasets of Plant Pathology 2021 (FGVC8) and AppleLeaf9. The number of parameters and Floating Point Operations (FLOPs) of the LCGSC-YOLO are much less thanother related comparative experimental algorithms.