Catalogue Search | MBRL

Using deep learning to diagnose retinal diseases through medical image analysis

by Kaibassova, Dinara , Yussupova, Gulbakhar , Muratbekova, Svetlana

2024

The scientific article focuses on the application of deep learning through simple U-Net, attention U-Net, residual U-Net, and residual attention U-Net models for diagnosing retinal diseases based on medical image analysis. The work includes a thorough analysis of each model's ability to detect retinal pathologies, taking into account their unique characteristics such as attention mechanisms and residual connections. The obtained experimental results confirm the high accuracy and reliability of the proposed models, emphasizing their potential as effective tools for automated diagnosis of retinal diseases based on medical images. This approach opens up new prospects for improving diagnostic procedures and increasing the efficiency of medical practice. The authors of the article propose an innovative method that can significantly facilitate the process of identifying retinal diseases, which is critical for early diagnosis and timely treatment. The results of the study support the prospect of using these models in clinical practice, highlighting their ability to accurately analyze medical images and improve the quality of eye health care.

Journal Article

Landslide detection in the Himalayas using machine learning algorithms and U-Net

by Singh, Ramesh P , Soares, Lucas Pedrosa , Grohmann, Carlos H in Accuracy , Algorithms , Automation

2022

Event-based landslide inventories are essential sources to broaden our understanding of the causal relationship between triggering events and the occurring landslides. Moreover, detailed inventories are crucial for the succeeding phases of landslide risk studies like susceptibility and hazard assessment. The openly available inventories differ in the quality and completeness levels. Event-based landslide inventories are created based on manual interpretation, and there can be significant differences in the mapping preferences among interpreters. To address this issue, we used two different datasets to analyze the potential of U-Net and machine learning approaches for automated landslide detection in the Himalayas. Dataset-1 is composed of five optical bands from the RapidEye satellite imagery. Dataset-2 is composed of the RapidEye optical data and ALOS-PALSAR derived topographical data. We used a small dataset consisting of 239 samples acquired from several training zones and one testing zone to evaluate our models’ performance using the fully convolutional U-Net model, Support Vector Machines (SVM), K-Nearest Neighbor, and the Random Forest (RF). We created thirty-two different maps to evaluate and understand the implications of different sample patch sizes and their effect on the accuracy of landslide detection in the study area. The results were then compared against the manually interpreted inventory compiled using fieldwork and visual interpretation of the RapidEye satellite image. We used accuracy assessment metrics such as F1-score, Precision, Recall, and Matthews Correlation Coefficient (MCC). In the context of the Nepali Himalayas, employing RapidEye images and machine learning models, a viable patch size was investigated. The U-Net model trained with 128 × 128 pixel patch size yields the best MCC results (76.59%) with Dataset-1. The added information from the digital elevation model benefited the overall detection of landslides. However, it does not improve the model’s overall accuracy but helps differentiate human settlement areas and river sand bars. In this study, the U-Net achieved slightly better results than other machine learning approaches. Although it can depend on the architecture of the U-Net model and the complexity of the geographical features in the imagery, the U-Net model is still preliminary in the domain of landslide detection. There is very little literature available related to the use of U-Net for landslide detection. This study is one of the first efforts of using U-Net for landslide detection in the Himalayas. Nevertheless, U-Net has the potential to further improve automated landslide detection in the future for varied topographical and geomorphological scenes.
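The four accuracy metrics named in this abstract can all be derived from a pixel-level confusion matrix. The following is a minimal sketch under that assumption; the function name `detection_metrics` and the counts are hypothetical and are not taken from the study.

```python
import math

def detection_metrics(tp, fp, fn, tn):
    """Precision, recall, F1-score and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells, so it stays informative when the
    # landslide class is much rarer than the background class.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "mcc": mcc}

# Hypothetical counts for one predicted landslide map (illustration only).
scores = detection_metrics(tp=80, fp=20, fn=10, tn=890)
```

MCC ranges from -1 to 1; a figure such as the 76.59% quoted above is presumably that value expressed as a percentage.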

Journal Article

Explainable AI in Scene Understanding for Autonomous Vehicles in Unstructured Traffic Environments on Indian Roads Using the Inception U-Net Model with Grad-CAM Visualization

by Gite, Shilpa , Pradhan, Biswajeet , Alamri, Abdullah in Algorithms , Architecture , Artificial Intelligence

2022

The intelligent transportation system, especially autonomous vehicles, has seen a lot of interest among researchers owing to the tremendous work in modern artificial intelligence (AI) techniques, especially deep neural learning. As a result of increased road accidents over the last few decades, significant industries are moving to design and develop autonomous vehicles. Understanding the surrounding environment is essential for understanding the behavior of nearby vehicles to enable the safe navigation of autonomous vehicles in crowded traffic environments. Several datasets are available for autonomous vehicles focusing only on structured driving environments. To develop an intelligent vehicle that drives in real-world traffic environments, which are unstructured by nature, a dataset for autonomous vehicles that focuses on unstructured traffic environments should be available. Indian Driving Lite dataset (IDD-Lite), focused on an unstructured driving environment, was released as an online competition in NCVPRIPG 2019. This study proposed an explainable inception-based U-Net model with Grad-CAM visualization for semantic segmentation that combines an inception-based module as an encoder for automatic extraction of features and passes them to a decoder for the reconstruction of the segmentation feature map. The black-box nature of deep neural networks fails to build trust among consumers. Grad-CAM is used to interpret the deep-learning-based inception U-Net model to increase consumer trust. The proposed inception U-Net with Grad-CAM model achieves 0.622 intersection over union (IoU) on the Indian Driving Dataset (IDD-Lite), outperforming the state-of-the-art (SOTA) deep neural-network-based segmentation models.

Journal Article

On the Exploration of Automatic Building Extraction from RGB Satellite Images Using Deep Learning Architectures Based on U-Net

by Doulamis, Anastasios , Temenos, Anastasios , Temenos, Nikos in Artificial neural networks , attention residual U-Net , attention U-Net

2022

Detecting and localizing buildings is of primary importance in urban planning tasks. Automating the building extraction process, however, has become attractive given the dominance of Convolutional Neural Networks (CNNs) in image classification tasks. In this work, we explore the effectiveness of the CNN-based architecture U-Net and its variations, namely, the Residual U-Net, the Attention U-Net, and the Attention Residual U-Net, in automatic building extraction. We showcase their robustness in feature extraction and information processing using exclusively RGB images, as they are a low-cost alternative to multi-spectral and LiDAR ones, selected from the SpaceNet 1 dataset. The experimental results show that U-Net achieves a 91.9% accuracy, whereas introducing residual blocks, attention gates, or a combination of both improves the accuracy of the vanilla U-Net to 93.6%, 94.0%, and 93.7%, respectively. Finally, the comparison between U-Net architectures and typical deep learning approaches from the literature highlights their increased performance in accurate building localization around corners and edges.

Journal Article

Convolutional neural network for automated mass segmentation in mammography

by Bi, Jinbo , Abdelhafiz, Dina , Nabavi, Sheida in Algorithms , Artificial neural networks , Automation

2020

Background: Automatic segmentation and localization of lesions in mammogram (MG) images are challenging even when employing advanced methods such as deep learning (DL). We developed a new model based on the architecture of the semantic segmentation U-Net model to precisely segment mass lesions in MG images. The proposed end-to-end convolutional neural network (CNN) based model extracts contextual information by combining low-level and high-level features. We trained the proposed model using large publicly available databases (CBIS-DDSM, BCDR-01, and INbreast) and a private database from the University of Connecticut Health Center (UCHC). Results: We compared the performance of the proposed model with those of the state-of-the-art DL models including the fully convolutional network (FCN), SegNet, Dilated-Net, original U-Net, and Faster R-CNN models and the conventional region growing (RG) method. The proposed Vanilla U-Net model outperforms the Faster R-CNN model significantly in terms of the runtime and the Intersection over Union metric (IOU). Training with digitized film-based and fully digitized MG images, the proposed Vanilla U-Net model achieves a mean test accuracy of 92.6%. The proposed model achieves a mean Dice coefficient index (DI) of 0.951 and a mean IOU of 0.909, which show how close the output segments are to the corresponding lesions in the ground truth maps. Data augmentation has been very effective in our experiments, resulting in an increase in the mean DI and the mean IOU from 0.922 to 0.951 and 0.856 to 0.909, respectively. Conclusions: The proposed Vanilla U-Net based model can be used for precise segmentation of masses in MG images. This is because the segmentation process incorporates more multi-scale spatial context, and captures more local and global context to predict a precise pixel-wise segmentation map of an input full MG image. These detected maps can help radiologists in differentiating benign and malignant lesions depending on the lesion shapes. We show that using transfer learning, introducing augmentation, and modifying the architecture of the original model results in better performance in terms of the mean accuracy, the mean DI, and the mean IOU in detecting mass lesions compared to the other DL and the conventional models.
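The Dice coefficient and IOU reported in this abstract both measure the overlap between a predicted mask and a ground-truth mask. As a rough illustration, the sketch below computes both for one pair of binary masks; the function name `dice_and_iou` and the toy masks are assumptions made for the example, not data from the study.

```python
def dice_and_iou(pred, truth):
    """Dice coefficient and IoU for two binary masks (nested lists of 0/1)."""
    inter = sum(p & t
                for pred_row, truth_row in zip(pred, truth)
                for p, t in zip(pred_row, truth_row))
    pred_area = sum(map(sum, pred))
    truth_area = sum(map(sum, truth))
    union = pred_area + truth_area - inter
    if union == 0:
        # Both masks are empty: treat this as perfect agreement.
        return 1.0, 1.0
    dice = 2 * inter / (pred_area + truth_area)
    iou = inter / union
    return dice, iou

# Toy 3 x 4 masks (illustration only).
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0]]
truth = [[0, 1, 1, 0],
         [0, 1, 1, 1],
         [0, 0, 0, 0]]
dice, iou = dice_and_iou(pred, truth)
```

For a single pair of masks the two scores are tied by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. The identity need not hold exactly for means taken over a test set.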

Journal Article

Defect Detection of Subway Tunnels Using Advanced U-Net Network

by Togo, Ren , Wang, An , Ogawa, Takahiro in Accuracy , Algorithms , Cracks

2022

In this paper, we present a novel defect detection model based on an improved U-Net architecture. As a semantic segmentation task, the defect detection task has the problems of background–foreground imbalance, multi-scale targets, and feature similarity between the background and defects in the real-world data. Conventionally, general convolutional neural network (CNN)-based networks mainly focus on natural image tasks and are insensitive to the problems in our task. The proposed method has a network design for multi-scale segmentation based on the U-Net architecture including an atrous spatial pyramid pooling (ASPP) module and an inception module, and can detect various types of defects compared to conventional simple CNN-based methods. Through experiments using a real-world subway tunnel image dataset, the proposed method showed better performance than general semantic segmentation methods, including state-of-the-art ones. Additionally, we showed that our method can achieve excellent detection balance among multi-scale defects.

Journal Article

EAAU-Net: Enhanced Asymmetric Attention U-Net for Infrared Small Target Detection

by Su, Shaojing , Zuo, Zhen , Sun, Bei in Ablation , artificial intelligence , Artificial neural networks

2021

Detecting infrared small targets lacking texture and shape information in cluttered environments is extremely challenging. With the development of deep learning, convolutional neural network (CNN)-based methods have achieved promising results in generic object detection. However, existing CNN-based methods with pooling layers may lose the targets in the deep layers and, thus, cannot be directly applied for infrared small target detection. To overcome this problem, we propose an enhanced asymmetric attention (EAA) U-Net. Specifically, we present an efficient and powerful EAA module that uses both same-layer feature information exchange and cross-layer feature fusion to improve feature representation. In the proposed approach, spatial and channel information exchanges occur between the same layers to reinforce the primitive features of small targets, and a bottom-up global attention module focuses on cross-layer feature fusion to enable the dynamic weighted modulation of high-level features under the guidance of low-level features. The results of detailed ablation studies empirically validate the effectiveness of each component in the network architecture. Compared to state-of-the-art methods, the proposed method achieved superior performance, with an intersection-over-union (IoU) of 0.771, normalised IoU (nIoU) of 0.746, and F-area of 0.681 on the publicly available SIRST dataset.

Journal Article

Sharp U-Net: Depthwise convolutional network for biomedical image segmentation

by Ben Hamza, A. , Zunair, Hasib in Coders , Computer architecture , Coronaviruses

2021

The U-Net architecture, built upon the fully convolutional network, has proven to be effective in biomedical image segmentation. However, U-Net applies skip connections to merge semantically different low- and high-level convolutional features, resulting in not only blurred feature maps, but also over- and under-segmented target regions. To address these limitations, we propose a simple, yet effective end-to-end depthwise encoder-decoder fully convolutional network architecture, called Sharp U-Net, for binary and multi-class biomedical image segmentation. The key rationale of Sharp U-Net is that instead of applying a plain skip connection, a depthwise convolution of the encoder feature map with a sharpening kernel filter is employed prior to merging the encoder and decoder features, thereby producing a sharpened intermediate feature map of the same size as the encoder map. Using this sharpening filter layer, we are able to not only fuse semantically less dissimilar features, but also to smooth out artifacts throughout the network layers during the early stages of training. Our extensive experiments on six datasets show that the proposed Sharp U-Net model consistently outperforms or matches the recent state-of-the-art baselines in both binary and multi-class segmentation tasks, while adding no extra learnable parameters. Furthermore, Sharp U-Net outperforms baselines that have more than three times the number of learnable parameters. 
• We introduce a novel Sharp U-Net architecture by designing new connections between the encoder and decoder subnetworks using a depthwise convolution of the encoder feature maps with a sharpening spatial filter to address the semantic gap issue between the encoder and decoder features.
• We show that the Sharp U-Net architecture can be scaled for improved performance, outperforming baselines that have three times the number of learnable parameters.
• We demonstrate through extensive experiments the ability of the proposed model to learn efficient representations for both binary and multi-class segmentation tasks on a variety of medical images from different modalities.
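The sharpened skip connection described in this abstract can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not give the kernel values, so a common 3 × 3 sharpening kernel is assumed. Merging is assumed to be channel concatenation, as in a standard U-Net. The names `SHARPEN`, `depthwise_sharpen` and `sharp_skip` are hypothetical.

```python
# Assumed 3x3 sharpening kernel (a common image-processing choice; the
# abstract does not state the values used by Sharp U-Net).
SHARPEN = [[0, -1, 0],
           [-1, 5, -1],
           [0, -1, 0]]

def depthwise_sharpen(feature_maps, kernel=SHARPEN):
    """Filter every channel independently with the same fixed kernel.

    feature_maps: list of channels, each an H x W nested list of numbers.
    Zero padding keeps each output channel the same size as its input.
    The kernel is fixed, so no learnable parameters are introduced.
    """
    k = len(kernel) // 2
    out = []
    for channel in feature_maps:
        h, w = len(channel), len(channel[0])
        sharpened = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in range(-k, k + 1):
                    for dj in range(-k, k + 1):
                        y, x = i + di, j + dj
                        if 0 <= y < h and 0 <= x < w:
                            acc += kernel[di + k][dj + k] * channel[y][x]
                sharpened[i][j] = acc
        out.append(sharpened)
    return out

def sharp_skip(encoder_maps, decoder_maps):
    """Sharpen the encoder channels, then concatenate with decoder channels."""
    return depthwise_sharpen(encoder_maps) + decoder_maps

# Toy feature maps: one encoder channel with a single active pixel.
encoder = [[[0, 0, 0], [0, 1, 0], [0, 0, 0]]]
decoder = [[[1, 1, 1], [1, 1, 1], [1, 1, 1]]]
merged = sharp_skip(encoder, decoder)
```

Because the assumed kernel sums to 1, flat interior regions pass through unchanged while local contrast is amplified.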

Journal Article

Urban Land Use and Land Cover Classification Using Novel Deep Learning Models Based on High Spatial Resolution Satellite Imagery

by Wang, Mingli , Zhang, Shuangyue , Ke, Yinghai in deep learning , high spatial resolution satellite imagery , U-Net

2018

Urban land cover and land use mapping plays an important role in urban planning and management. In this paper, novel multi-scale deep learning models, namely ASPP-Unet and ResASPP-Unet, are proposed for urban land cover classification based on very high resolution (VHR) satellite imagery. The proposed ASPP-Unet model consists of a contracting path, which extracts the high-level features, and an expansive path, which up-samples the features to create a high-resolution output. The atrous spatial pyramid pooling (ASPP) technique is utilized in the bottom layer in order to incorporate multi-scale deep features into a discriminative feature. The ResASPP-Unet model further improves the architecture by replacing each layer with a residual unit. The models were trained and tested based on WorldView-2 (WV2) and WorldView-3 (WV3) imagery over the city of Beijing. Model parameters including layer depth and the number of initial feature maps (IFMs) as well as the input image bands were evaluated in terms of their impact on the model performances. It is shown that the ResASPP-Unet model with 11 layers and 64 IFMs based on 8-band WV2 imagery produced the highest classification accuracy (87.1% for WV2 imagery and 84.0% for WV3 imagery). The ASPP-Unet model with the same parameter setting produced slightly lower accuracy, with overall accuracy of 85.2% for WV2 imagery and 83.2% for WV3 imagery. Overall, the proposed models outperformed the state-of-the-art models, e.g., U-Net, convolutional neural network (CNN) and Support Vector Machine (SVM) model over both WV2 and WV3 images, and yielded robust and efficient urban land cover classification results.
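ASPP builds multi-scale features by running several atrous (dilated) convolutions in parallel on the same input. The one-dimensional sketch below shows that idea only. The dilation rates (1, 2, 4), the kernel, and the function names are assumptions chosen for illustration and do not come from the paper.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D dilated (atrous) convolution with zero padding, same output length."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = i + (k - half) * rate  # taps sit `rate` samples apart
            if 0 <= idx < n:
                acc += weight * signal[idx]
        out.append(acc)
    return out

def effective_kernel_size(kernel_size, rate):
    """Span covered by a dilated kernel: k + (k - 1) * (rate - 1)."""
    return kernel_size + (kernel_size - 1) * (rate - 1)

def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Run parallel atrous branches and stack their outputs per position."""
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]

# Toy input: a single impulse in the middle of the signal.
signal = [0, 0, 0, 0, 1, 0, 0, 0, 0]
features = aspp_1d(signal, kernel=[1, 1, 1])
```

Each position ends up with one value per rate, so larger rates widen the context without adding kernel weights. A full ASPP module would typically fuse the branches with a further convolution, which this sketch leaves out.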

Journal Article

Analysis of AI-Based Single-View 3D Reconstruction Methods for an Industrial Application

by Dold, Patricia M. , Heizmann, Michael , Hartung, Julia in Algorithms , Artificial Intelligence , Cameras

2022

Machine learning (ML) is a key technology in smart manufacturing as it provides insights into complex processes without requiring deep domain expertise. This work deals with deep learning algorithms to determine a 3D reconstruction from a single 2D grayscale image. The potential of 3D reconstruction can be used for quality control because the height values contain relevant information that is not visible in 2D data. Instead of 3D scans, estimated depth maps based on a 2D input image can be used with the advantage of a simple setup and a short recording time. Determining a 3D reconstruction from a single input image is a difficult task for which many algorithms and methods have been proposed in the past decades. In this work, three deep learning methods, namely stacked autoencoder (SAE), generative adversarial networks (GANs) and U-Nets are investigated, evaluated and compared for 3D reconstruction from a 2D grayscale image of laser-welded components. In this work, different variants of GANs are tested, with the conclusion that Wasserstein GANs (WGANs) are the most robust approach among them. To the best of our knowledge, the present paper considers for the first time the U-Net, which achieves outstanding results in semantic segmentation, in the context of 3D reconstruction tasks. Unlike the U-Net, which uses standard convolutions, the stacked dilated U-Net (SDU-Net) applies stacked dilated convolutions. Of all the 3D reconstruction approaches considered in this work, the SDU-Net shows the best performance, not only in terms of evaluation metrics but also in terms of computation time. Due to the comparably small number of trainable parameters and the suitability of the architecture for strong data augmentation, a robust model can be generated with only a few training data.

Journal Article
