8 results for "Mamba U-Net"
A Mamba U-Net Model for Reconstruction of Extremely Dark RGGB Images
Currently, most images captured by high-pixel devices such as mobile phones, camcorders, and drones are in RGGB format. However, image quality in extremely dark scenes often needs improvement. Traditional methods for processing these dark RGGB images typically rely on end-to-end U-Net networks and their enhancement techniques, which require substantial resources and processing time. To tackle this issue, we first converted RGGB images into RGB three-channel images by subtracting the black level and applying linear interpolation. During the training stage, we leveraged the computational efficiency of the state-space model (SSM) and developed a Mamba U-Net end-to-end model to enhance the restoration of extremely dark RGGB images. We utilized the See-in-the-Dark (SID) dataset for training and assessed the effectiveness of our approach. Experimental results indicate that our method significantly reduces resource consumption compared to existing single-step training and prior multi-step training techniques, while achieving improved peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) outcomes.
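The preprocessing step this abstract describes (black-level subtraction, then combining the Bayer sites into three channels) can be sketched in a few lines of NumPy. This is a hedged illustration only, not the paper's code: the black/white levels are hypothetical, and the two green sites are simply averaged as a crude stand-in for the linear interpolation the authors use.

```python
import numpy as np

def rggb_to_rgb(raw, black_level=64, white_level=1023):
    """Convert a single-plane RGGB Bayer mosaic to a half-resolution RGB image.

    Minimal sketch: subtract the black level, normalize to [0, 1], and merge
    the two green sites by averaging (stand-in for linear interpolation).
    black_level/white_level are illustrative values, not from the paper.
    """
    raw = raw.astype(np.float32)
    raw = np.clip(raw - black_level, 0, None) / (white_level - black_level)
    r  = raw[0::2, 0::2]   # top-left of each 2x2 block: red
    g1 = raw[0::2, 1::2]   # top-right: first green site
    g2 = raw[1::2, 0::2]   # bottom-left: second green site
    b  = raw[1::2, 1::2]   # bottom-right: blue
    g = 0.5 * (g1 + g2)    # average the two green samples
    return np.stack([r, g, b], axis=-1)   # (H/2, W/2, 3)
```

A 4x4 mosaic thus becomes a 2x2x3 RGB array, which can then be fed to the end-to-end network.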
RM-UNet: UNet-like Mamba with rotational SSM module for medical image segmentation
Accurate segmentation of tissues and lesions is crucial for disease diagnosis, treatment planning, and surgical navigation. Yet, the complexity of medical images presents significant challenges for traditional Convolutional Neural Networks and Transformer models due to their limited receptive fields or high computational complexity. State Space Models (SSMs) have recently shown notable vision performance, particularly Mamba and its variants. However, their feature extraction methods may not be sufficiently effective and retain some redundant structures, leaving room for parameter reduction. In response to these challenges, we introduce a methodology called Rotational Mamba-UNet, characterized by Residual Visual State Space (ResVSS) block and Rotational SSM Module. The ResVSS block is devised to mitigate network degradation caused by the diminishing efficacy of information transfer from shallower to deeper layers. Meanwhile, the Rotational SSM Module is devised to tackle the challenges associated with channel feature extraction within State Space Models. Finally, we propose a weighted multi-level loss function, which fully leverages the outputs of the decoder’s three stages for supervision. We conducted experiments on ISIC17, ISIC18, CVC-300, Kvasir-SEG, CVC-ColonDB, Kvasir-Instrument datasets, and Low-grade Squamous Intraepithelial Lesion datasets provided by The Third Affiliated Hospital of Sun Yat-sen University, demonstrating the superior segmentation performance of our proposed RM-UNet. Additionally, compared to the previous VM-UNet, our model achieves a one-third reduction in parameters. Our code is available at https://github.com/Halo2Tang/RM-UNet.
Symmetric Boundary-Enhanced U-Net with Mamba Architecture for Glomerular Segmentation in Renal Pathological Images
Accurate glomerular segmentation in renal pathological images is a key challenge for chronic kidney disease diagnosis and assessment. Due to the high visual similarity between pathological glomeruli and surrounding tissues in color, texture, and morphology, significant “camouflage phenomena” exist, leading to boundary identification difficulties. To address this problem, we propose BM-UNet, a novel segmentation framework that embeds boundary guidance mechanisms into a Mamba architecture with a symmetric encoder–decoder design. The framework enhances feature transmission through explicit boundary detection, incorporating four core modules designed for key challenges in pathological image segmentation. The Multi-scale Adaptive Fusion (MAF) module processes irregular tissue morphology, the Hybrid Boundary Detection (HBD) module handles boundary feature extraction, the Boundary-guided Attention (BGA) module achieves boundary-aware feature refinement, and the Mamba-based Fused Decoder Block (MFDB) completes boundary-preserving reconstruction. By introducing explicit boundary supervision mechanisms, the framework achieves significant segmentation accuracy improvements while maintaining linear computational complexity. Validation on the KPIs2024 glomerular dataset and HuBMAP renal tissue samples demonstrates that BM-UNet achieves a 92.4–95.3% mean Intersection over Union across different CKD pathological conditions, with a 4.57% improvement over the Mamba baseline and a processing speed of 113.7 FPS.
FFM-Net: Fusing Frequency Selection Information with Mamba for Skin Lesion Segmentation
Accurate segmentation of lesion regions is essential for skin cancer diagnosis. Because dermoscopic images of skin lesions vary in size and shape and often have fuzzy boundaries, accurate segmentation remains highly challenging. To address these issues, we propose a new dermatologic image segmentation network, FFM-Net. In FFM-Net, we design a new FM block encoder based on state space models (SSMs), which integrates a low-frequency information extraction module (LEM) and an edge detail extraction module (EEM) to extract broader overall structural information and more accurate edge detail information, respectively. At the same time, we dynamically adjust the input channel ratios of the two module branches at different stages of the network, so that the model can learn the correlation between overall structure and edge detail features more effectively. Furthermore, we design a cross-channel spatial attention (CCSA) module to improve the model's sensitivity to the channel and spatial dimensions, and we deploy a multi-level feature fusion module (MFFM) at the bottleneck layer to aggregate rich multi-scale contextual representations. Finally, extensive experiments on three publicly available skin lesion segmentation datasets, ISIC2017, ISIC2018, and PH2, show that FFM-Net outperforms most existing skin lesion segmentation methods.
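One common way to separate "broader overall structural information" from edge detail, as the LEM/EEM split above describes, is frequency-domain masking. The sketch below is a generic illustration of low-frequency extraction via an FFT mask; FFM-Net's actual LEM design is not specified in the abstract, so the function and its `keep_ratio` parameter are assumptions.

```python
import numpy as np

def low_frequency_component(img, keep_ratio=0.1):
    """Keep only the lowest spatial frequencies of a grayscale image.

    Generic illustration of low-frequency information extraction via an
    FFT mask; not FFM-Net's actual LEM module. keep_ratio controls the
    half-width of the retained band around the DC component.
    """
    F = np.fft.fftshift(np.fft.fft2(img))        # DC moved to the center
    h, w = img.shape
    mask = np.zeros_like(F, dtype=bool)
    ch, cw = h // 2, w // 2
    rh = max(1, int(h * keep_ratio))
    rw = max(1, int(w * keep_ratio))
    mask[ch - rh:ch + rh, cw - rw:cw + rw] = True  # central low-freq window
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))
```

The complementary high-frequency residual, `img - low_frequency_component(img)`, would then carry the edge detail an EEM-style branch operates on.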
RegMamba: An Improved Mamba for Medical Image Registration
Deformable medical image registration aims to minimize the differences between fixed and moving images to provide comprehensive physiological or structural information for further medical analysis. Traditional learning-based convolutional network approaches usually suffer from perceptual limitations, and in recent years the Transformer architecture has gained popularity for its superior long-range relational modeling capabilities, but it still faces severe computational challenges when handling high-resolution medical images. Recently, selective state-space models have shown great potential in the vision domain due to their fast inference and efficient modeling. Inspired by this, we propose RegMamba, a novel medical image registration architecture that combines convolutional and state-space models (SSMs), designed to efficiently capture complex correspondences in registration while maintaining computational efficiency. First, our model introduces Mamba to efficiently model long-range dependencies in the data and capture large deformations. At the same time, we use a scaled convolutional layer in Mamba to alleviate the spatial information loss caused by flattening 3D data. Then, a deformable convolutional residual module (DCRM) is proposed to adaptively adjust sampling positions and process deformations, capturing more flexible spatial features while learning fine-grained features of different anatomical structures to construct local correspondences and improve model perception. We demonstrate the advanced registration performance of our method on the LPBA40 and IXI public datasets.
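Several of the entries above build on the same discretized state-space recurrence that underlies Mamba-family models. As a generic illustration (not code from any of the listed papers), a minimal diagonal SSM scan over a scalar sequence can be written as follows; the zero-order-hold transition and the simple Euler input term are standard simplifications, not the selective (input-dependent) parameterization Mamba adds on top.

```python
import numpy as np

def diagonal_ssm_scan(x, A, B, C, dt):
    """Minimal discretized diagonal SSM.

    Recurrence: h_t = exp(dt*A) * h_{t-1} + dt*B * x_t,  y_t = C . h_t
    x: (T,) scalar input sequence; A, B, C: (N,) diagonal parameters.
    Illustrative only -- Mamba makes B, C, dt functions of the input.
    """
    T, N = len(x), len(A)
    Abar = np.exp(dt * A)              # zero-order-hold state transition
    h = np.zeros(N)
    y = np.empty(T)
    for t in range(T):
        h = Abar * h + dt * B * x[t]   # simplified input discretization
        y[t] = C @ h                   # linear readout of the hidden state
    return y
```

The linear-time sequential scan (versus the quadratic cost of attention) is the efficiency property these papers exploit for high-resolution images and volumes.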
TUMamba: A novel tongue segment methods based on Mamba and U-Net
Background and Objective: Current tongue segmentation methods often struggle with extracting global features and performing selective filtering, particularly in complex environments where background objects resemble the tongue. These challenges significantly reduce segmentation efficiency. To address these issues, this article proposes a novel model for tongue segmentation in complex environments, combining Mamba and U-Net. By leveraging Mamba's global feature selection capabilities, this model assists U-Net in accurately excluding tongue-like objects from the background, thereby enhancing segmentation accuracy and efficiency. Methods: To improve the segmentation accuracy of the U-Net backbone model, we incorporated a Mamba attention module along with a multi-stage feature fusion module. The Mamba attention module serially connects spatial and channel attention mechanisms at the U-Net's skip connections, selectively filtering the feature maps passed into the deep network. Additionally, the multi-stage feature fusion module integrates feature maps from different stages, further improving segmentation performance. Results: Compared with state-of-the-art semantic segmentation and tongue segmentation models, our model improved the mean intersection over union by 1.17%. Ablation experiments further demonstrated that each module proposed in this study contributes to the model's segmentation efficiency. Conclusion: This study constructs a tongue segmentation model based on U-Net and Mamba (TUMamba). The model effectively extracts global spatial and channel features using the Mamba attention module, captures local detail features through U-Net, and enhances image features via multi-stage feature fusion. The results demonstrate that the model performs exceptionally well in tongue segmentation tasks, proving its value in handling complex environments.
Speech Separation Using Advanced Deep Neural Network Methods: A Recent Survey
Speech separation, as an important research direction in audio signal processing, has been widely studied by the academic community since its emergence in the mid-1990s. In recent years, with the rapid development of deep neural network technology, speech processing based on deep neural networks has shown outstanding performance in speech separation. While existing studies have surveyed the application of deep neural networks in speech separation from multiple dimensions including learning paradigms, model architectures, loss functions, and training strategies, the literature still lacks a systematic account of the field's developmental trajectory. To address this, this paper focuses on single-channel supervised speech separation tasks, proposing the technological evolution path "U-Net–TasNet–Transformer–Mamba" as a main thread to systematically analyze how core architectural designs affected separation performance at each stage. By reviewing the transition from traditional methods to deep learning paradigms and delving into the improvements and integration of deep learning architectures at each stage, this paper summarizes milestone achievements, mainstream evaluation frameworks, and typical datasets in the field, and offers prospects for future research directions. Through this focused review perspective, we aim to provide researchers in the speech separation field with a clearly articulated map of the technical evolution and a practical reference.
VML-UNet: Fusing Vision Mamba and Lightweight Attention Mechanism for Skin Lesion Segmentation
Deep learning has advanced medical image segmentation, yet existing methods struggle with complex anatomical structures. Mainstream models, such as CNN, Transformer, and hybrid architectures, face challenges including insufficient information representation and redundant complexity, which limit their clinical deployment. Developing efficient, lightweight networks is crucial for accurate lesion localization and optimized clinical workflows. We propose VML-UNet, a lightweight segmentation network whose core innovations are the CPMamba module and the multi-scale local supervision module (MLSM). The CPMamba module integrates the visual state space (VSS) block with a channel prior attention mechanism to model spatial relationships efficiently, with linear computational complexity, through dynamic channel-space weight allocation while preserving channel feature integrity. The MLSM enhances local feature perception and reduces the inference burden. Comparative experiments were conducted on three public datasets, ISIC2017, ISIC2018, and PH2, with ablation experiments performed on ISIC2017. VML-UNet requires only 0.53 M parameters, 2.18 MB of memory, and 1.24 GFLOPs of computation, and outperforms the comparison networks on these datasets, validating its effectiveness. This study provides a valuable reference for developing lightweight, high-performance skin lesion segmentation networks.