Catalogue Search | MBRL
24,663 result(s) for "convolutional networks"
Pathological-Gait Recognition Using Spatiotemporal Graph Convolutional Networks and Attention Model
by Seo, Haneol; Lee, Chan-Su; Naseem, Muhammad Tahir
in Classification; Gait; gait classification
2022
Walking is an exercise that uses muscles and joints of the human body and is essential for understanding body condition. Analyzing body movements through gait has been studied and applied in human identification, sports science, and medicine. This study investigated a spatiotemporal graph convolutional network (ST-GCN) model using attention techniques, applied to pathological-gait classification from collected skeletal information. The focus of this study was twofold. The first objective was extracting spatiotemporal features from skeletal information presented by joint connections and applying these features to graph convolutional neural networks. The second objective was developing an attention mechanism for spatiotemporal graph convolutional neural networks to focus on important joints in the current gait. This model establishes a pathological-gait-classification system for diagnosing sarcopenia. Experiments on three datasets, namely NTU RGB+D, pathological gait of GIST, and multimodal-gait symmetry (MMGS), validate that the proposed model outperforms existing models in gait classification.
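The core operation the abstract describes, a spatial graph convolution over skeleton joints with per-joint attention, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the joint graph, feature values, and attention weights are made up, and the learned weight matrix is omitted.

```python
# Sketch of one spatial graph-convolution step with per-joint attention,
# in the spirit of an ST-GCN layer (all names and numbers illustrative).

def normalize_adjacency(adj):
    """Row-normalize (A + I) so each joint averages over itself and its neighbours."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    for i in range(n):
        deg = sum(a_hat[i])
        a_hat[i] = [v / deg for v in a_hat[i]]
    return a_hat

def graph_conv_with_attention(features, adj, attention):
    """features: n_joints x d, adj: n x n (0/1), attention: n weights in [0, 1].
    Returns attention-scaled neighbourhood averages (learned weights omitted)."""
    a_hat = normalize_adjacency(adj)
    n, d = len(features), len(features[0])
    out = [[sum(a_hat[i][j] * features[j][k] for j in range(n)) for k in range(d)]
           for i in range(n)]
    return [[attention[i] * v for v in row] for i, row in enumerate(out)]

# Toy 3-joint chain (hip - knee - ankle), 2-D features per joint.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
att = [1.0, 0.5, 1.0]   # an attention model would produce these weights
out = graph_conv_with_attention(feats, adj, att)
```

Each joint's output mixes its own features with its neighbours' through the normalized adjacency, and the attention weight then emphasizes or suppresses that joint.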
Journal Article
Machine Learning Based Prediction of Squamous Cell Carcinoma in Ex Vivo Confocal Laser Scanning Microscopy
by
Daniela Hartmann
,
Cristel Ruini
,
Benjamin Kendziora
in
Algorithms
,
Artificial intelligence
,
Big Data
2021
Image classification with convolutional neural networks (CNN) offers an unprecedented opportunity for medical imaging. Regulatory agencies in the USA and Europe have already cleared numerous deep learning/machine learning based medical devices and algorithms. While the field of radiology is at the forefront of the artificial intelligence (AI) revolution, conventional pathology, which commonly relies on examination of tissue samples on a glass slide, is falling behind in leveraging this technology. On the other hand, ex vivo confocal laser scanning microscopy (ex vivo CLSM), owing to its digital workflow features, has a high potential to benefit from integrating AI tools into the assessment and decision-making process. The aim of this work was to explore a preliminary application of CNNs to digitally stained ex vivo CLSM images of cutaneous squamous cell carcinoma (cSCC) for automated detection of tumor tissue. Thirty-four freshly excised tissue samples were prospectively collected and examined immediately after resection. After the histologically confirmed ex vivo CLSM diagnosis, the tumor tissue was annotated for segmentation by experts in order to train the MobileNet CNN. The model was then trained and evaluated using cross-validation. The overall sensitivity and specificity of the deep neural network for detecting cSCC and tumor-free areas on ex vivo CLSM slides compared to expert evaluation were 0.76 and 0.91, respectively. The area under the ROC curve was equal to 0.90 and the area under the precision-recall curve was 0.85. The results demonstrate a high potential of deep learning models to detect cSCC regions on digitally stained ex vivo CLSM slides and to distinguish them from tumor-free skin.
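The sensitivity and specificity reported above are standard confusion-matrix ratios. A short sketch of the computation; the counts below are hypothetical, chosen only to illustrate how such figures arise, and are not the study's data.

```python
# Sensitivity = TP/(TP+FN): fraction of true tumour regions flagged.
# Specificity = TN/(TN+FP): fraction of tumour-free regions correctly cleared.

def sensitivity_specificity(tp, fn, tn, fp):
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical counts: 76 of 100 tumour regions flagged,
# 91 of 100 tumour-free regions correctly cleared.
sens, spec = sensitivity_specificity(tp=76, fn=24, tn=91, fp=9)
```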
Journal Article
Automatic Extraction of Water and Shadow from SAR Images Based on a Multi-Resolution Dense Encoder and Decoder Network
2019
The water and shadow areas in SAR images contain rich information for various applications, but at present they cannot be extracted automatically and precisely. To handle this problem, a new framework called the Multi-Resolution Dense Encoder and Decoder (MRDED) network is proposed, which integrates Convolutional Neural Network (CNN), Residual Network (ResNet), Dense Convolutional Network (DenseNet), Global Convolutional Network (GCN), and Convolutional Long Short-Term Memory (ConvLSTM). MRDED contains three parts: the Gray Level Gradient Co-occurrence Matrix (GLGCM), the Encoder network, and the Decoder network. GLGCM is used to extract low-level features, which are further processed by the Encoder. The Encoder network employs ResNet to extract features at different resolutions. The Decoder network has two components, namely Multi-level Features Extraction and Fusion (MFEF) and Score maps Fusion (SF). We implement two versions of MFEF, named MFEF1 and MFEF2, which generate separate score maps. They differ in that MFEF2 utilizes the Chained Residual Pooling (CRP) module, while MFEF1 replaces it with an Improved Chained Residual Pooling (ICRP) module formed by adopting ConvLSTM. The two separate score maps generated by MFEF1 and MFEF2 are fused with different weights to produce the fused score map, which is further handled by the Softmax function to generate the final extraction results for water and shadow areas. To evaluate the proposed framework, MRDED is trained and tested with large SAR images. To further assess the classification performance, a total of eight different classification frameworks are compared with our proposed framework. MRDED outperformed them all, reaching 80.12% in Pixel Accuracy (PA) and 73.88% in Intersection over Union (IoU) for water, 88% in PA and 77.11% in IoU for shadow, and 95.16% in PA and 90.49% in IoU for background classification.
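The final fusion stage described above, combining two score maps with weights and applying a Softmax, can be sketched in a few lines. The fusion weights and score values here are made up for illustration; the paper's learned weights and map sizes are not reproduced.

```python
import math

# Weighted fusion of two per-pixel score maps, then a per-pixel softmax
# over classes (here: water, shadow, background), as in the SF stage.

def fuse_and_softmax(scores1, scores2, w1, w2):
    """scores*: per-pixel lists of class scores; returns per-pixel probabilities."""
    fused = [[w1 * a + w2 * b for a, b in zip(p1, p2)]
             for p1, p2 in zip(scores1, scores2)]
    out = []
    for pixel in fused:
        m = max(pixel)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in pixel]
        z = sum(exps)
        out.append([e / z for e in exps])
    return out

# Two pixels, three classes; illustrative scores from MFEF1 and MFEF2.
probs = fuse_and_softmax([[2.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
                         [[1.0, 0.0, 1.0], [0.0, 2.0, 0.0]],
                         w1=0.6, w2=0.4)
```

The class with the highest fused probability at each pixel becomes the extraction result for that pixel.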
Journal Article
End-to-end video background subtraction with 3D convolutional neural networks
by Han, Jungong; Shao, Ling; Liu, Heng
in Artificial neural networks; Change detection; Computer vision
2018
Background subtraction in videos is a highly challenging task by definition, as it operates at the pixel-wise classification level. Therefore, great attention to detail is essential. In this paper, we follow the success of Deep Learning in Computer Vision and present an end-to-end system for background subtraction in videos. Our model is able to track temporal changes in a video sequence by applying 3D convolutions to the most recent frames of the video. Thus, no background model needs to be retained and updated. In addition, it can handle multiple scenes without further fine-tuning on each scene individually. We evaluate our system on the largest dataset for change detection, CDnet, with over 50 videos spanning 11 categories. Further evaluation is performed on the ESI dataset, which features extreme and sudden illumination changes. Our model surpasses the state-of-the-art on both datasets according to the average ranking of the models over a wide range of metrics.
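The key idea, a 3D convolution sliding over the most recent frames so the kernel sees time as well as space, can be shown with a naive valid-mode convolution on a tiny grayscale clip. Everything here (clip size, kernel values) is illustrative, not the paper's architecture.

```python
# Naive 'valid' 3-D cross-correlation over a T x H x W clip with a
# t x h x w kernel; illustrates how a temporal kernel responds to motion.

def conv3d_valid(clip, kernel):
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    t, h, w = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(T - t + 1):
        plane = []
        for j in range(H - h + 1):
            row = []
            for k in range(W - w + 1):
                row.append(sum(clip[i+a][j+b][k+c] * kernel[a][b][c]
                               for a in range(t) for b in range(h) for c in range(w)))
            plane.append(row)
        out.append(plane)
    return out

# Temporal-difference kernel: responds only to change between frames.
kernel = [[[-0.25, -0.25], [-0.25, -0.25]],
          [[ 0.25,  0.25], [ 0.25,  0.25]]]
static = [[[1, 1], [1, 1]]] * 3                                   # no motion
moving = [[[0, 0], [0, 0]], [[1, 1], [1, 1]], [[1, 1], [1, 1]]]   # change at t=1
no_motion = conv3d_valid(static, kernel)
motion = conv3d_valid(moving, kernel)
```

On the static clip every response is zero; on the moving clip the kernel fires exactly where the frames differ, which is why stacked 3D convolutions can replace an explicit background model.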
Journal Article
Rapid surrogate modelling of water-hammer in water distribution networks using spatiotemporal graph convolutional networks
by Kim, Hyunjun; Her, Younggu; Jeung, Minhyuk
in Graph convolutional networks; pipe rupture; spatiotemporal graph convolutional networks
2026
Safe drinking water is essential to public health and social stability. Despite advances in modern materials and design, water distribution systems (WDSs) remain vulnerable to failure. Pipe ruptures are particularly consequential because they generate pressure surges (water hammer) that propagate through pipelines, jeopardising upstream and downstream assets. Although theory-driven or physics-based transient models can capture these dynamics, they can be computationally intensive and require specialised expertise, thereby limiting their routine use. Data-driven approaches, such as graph convolutional networks (GCNs), leverage spatial and temporal correlations and have performed well in networked domains such as traffic, finance, and social networks. In this study, we applied a spatiotemporal GCN (STGCN) to simulate water hammer responses under multiple pipe rupture scenarios. We chose STGCN because it explicitly embeds network connectivity and temporal dynamics, enabling topology-aware propagation of rupture signals at significantly lower computational costs than physics-based simulators. Across tests, the STGCN predictions closely matched the observed pressure histories, including peak timing, while substantially reducing the computational cost compared with theory-driven simulations. These results indicate that STGCNs are efficient, accurate surrogates for simulating pressure wave dynamics in WDSs under rupture conditions, enabling rapid what-if analyses for utility applications.
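The "topology-aware propagation" the abstract refers to rests on a normalized adjacency matrix built from the pipe network graph. A minimal sketch of the standard symmetric GCN normalization, with a rupture signal repeatedly propagated through it; the 4-junction layout is hypothetical and the learned temporal layers are omitted.

```python
# Symmetric normalization A~ = D^-1/2 (A + I) D^-1/2 of a pipe-network
# graph, then repeated propagation of a pressure spike through A~.

def sym_normalized_adjacency(adj):
    n = len(adj)
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    d_inv_sqrt = [1.0 / (sum(row) ** 0.5) for row in a]
    return [[d_inv_sqrt[i] * a[i][j] * d_inv_sqrt[j] for j in range(n)]
            for i in range(n)]

def propagate(signal, a_hat, steps):
    for _ in range(steps):
        signal = [sum(a_hat[i][j] * signal[j] for j in range(len(signal)))
                  for i in range(len(signal))]
    return signal

# Hypothetical network: junctions 0-1-2-3 in a line plus a loop pipe 0-3.
adj = [[0, 1, 0, 1],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [1, 0, 1, 0]]
a_hat = sym_normalized_adjacency(adj)
spike = [1.0, 0.0, 0.0, 0.0]     # rupture signal injected at junction 0
spread = propagate(spike, a_hat, steps=2)
```

After two propagation steps the signal has reached junction 2 via both paths around the loop, which is the connectivity information a physics-free model would otherwise have to relearn from data.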
Journal Article
Improving Malaria diagnosis through interpretable customized CNNs architectures
by Ahamed, Md. Faysal; Ayari, Mohamed Arselene; Khandakar, Amith
in 631/114/1305; 631/114/2398; Algorithms
2025
Malaria, which is spread via female Anopheles mosquitoes and is caused by the Plasmodium parasite, persists as a serious illness, especially in areas with a high mosquito density. Traditional detection techniques, like examining blood samples with a microscope, tend to be labor-intensive and unreliable, and necessitate specialized personnel. To address these challenges, we employed several customized convolutional neural networks (CNNs), including a Parallel Convolutional Neural Network (PCNN), a Soft Attention Parallel Convolutional Neural Network (SPCNN), and a Soft Attention after Functional Block Parallel Convolutional Neural Network (SFPCNN), to improve the effectiveness of malaria diagnosis. Among these, the SPCNN emerged as the most successful model, outperforming all other models in evaluation metrics. The SPCNN achieved a precision of 99.38 ± 0.21%, recall of 99.37 ± 0.21%, F1 score of 99.37 ± 0.21%, accuracy of 99.37 ± 0.30%, and an area under the receiver operating characteristic curve (AUC) of 99.95 ± 0.01%, demonstrating its robustness in detecting malaria parasites. Furthermore, we employed various transfer learning (TL) algorithms, including VGG16, ResNet152, MobileNetV3Small, EfficientNetB6, EfficientNetB7, DenseNet201, Vision Transformer (ViT), Data-efficient Image Transformer (DeiT), ImageIntern, and Swin Transformer (versions v1 and v2). The proposed SPCNN model surpassed all these TL methods in every evaluation measure. The SPCNN model, with 2.207 million parameters and a size of 26 MB, is more complex than PCNN but simpler than SFPCNN. Despite this, SPCNN exhibited the fastest testing time (0.00252 s), making it more computationally efficient than both PCNN and SFPCNN. We assessed model interpretability using feature activation maps, Gradient-weighted Class Activation Mapping (Grad-CAM), and SHapley Additive exPlanations (SHAP) visualizations for all three architectures, illustrating why SPCNN outperformed the others. The findings from our experiments show a significant improvement in malaria parasite diagnosis. The proposed approach outperforms traditional manual microscopy in terms of both accuracy and speed. This study highlights the importance of utilizing cutting-edge technologies to develop robust and effective diagnostic tools for malaria prevention.
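The soft attention that distinguishes the SPCNN can be sketched in its generic form: a softmax over learned scores produces weights that rescale feature positions, emphasizing parasite-like regions. The scores and features below are made up for illustration; the paper's exact attention placement is not reproduced.

```python
import math

# Generic soft attention: softmax(scores) gives weights over feature
# positions; features are rescaled by those weights.

def soft_attention(features, scores):
    """features: n x d, scores: n; returns (weighted features, attention map)."""
    m = max(scores)                          # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    weighted = [[w * v for v in row] for w, row in zip(weights, features)]
    return weighted, weights

# Three feature positions; the third gets twice the score weight.
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
weighted, attn = soft_attention(feats, scores=[0.0, 0.0, math.log(2)])
```

Because the weights sum to one, the attention map itself is directly interpretable, which is what the Grad-CAM and SHAP analyses above exploit when explaining why SPCNN outperforms the unattended PCNN.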
Journal Article
Deep Convolutional Neural Network for Flood Extent Mapping Using Unmanned Aerial Vehicles Data
by Hashemi-Beni, Leila; Thompson, Gary; Langan, Thomas E.
in convolutional neural networks; floodplain mapping; fully convolutional network
2019
Flooding is among the natural disasters that most threaten human life and property, especially in densely populated urban areas. Rapid and precise extraction of flooded areas is key to supporting emergency-response planning and providing damage assessment in both spatial and temporal measurements. Unmanned Aerial Vehicle (UAV) technology has recently been recognized as an efficient photogrammetry data acquisition platform to quickly deliver high-resolution imagery, because of its cost-effectiveness, ability to fly at lower altitudes, and ability to enter hazardous areas. Different image classification methods, including SVM (Support Vector Machine), have been used for flood extent mapping. In recent years, there has been a significant improvement in remote sensing image classification using Convolutional Neural Networks (CNNs). CNNs have demonstrated excellent performance on various tasks including image classification, feature extraction, and segmentation. CNNs can learn features automatically from large datasets through the organization of multiple layers of neurons and have the ability to implement nonlinear decision functions. This study investigates the potential of CNN approaches to extract flooded areas from UAV imagery. A VGG-based fully convolutional network (FCN-16s) was used in this research. The model was fine-tuned and k-fold cross-validation was applied to estimate the performance of the model on the new UAV imagery dataset. This approach allowed FCN-16s to be trained on datasets that contained only one hundred training samples, and resulted in a highly accurate classification. A confusion matrix was calculated to estimate the accuracy of the proposed method. The image segmentation results obtained from FCN-16s were compared with the results obtained from FCN-8s, FCN-32s, and SVMs. Experimental results showed that the FCNs could extract flooded areas from UAV images more precisely than traditional classifiers such as SVMs. The classification accuracy achieved by FCN-16s, FCN-8s, FCN-32s, and SVM for the water class was 97.52%, 97.8%, 94.20%, and 89%, respectively.
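Per-class accuracies like those above come from comparing predicted masks with ground truth pixel by pixel. A sketch of the two most common segmentation scores, pixel accuracy and per-class IoU, on a tiny binary water mask; the masks are illustrative, not the study's data.

```python
# Pixel accuracy and Intersection over Union (IoU) for a binary mask.

def pixel_accuracy_and_iou(pred, truth):
    """pred, truth: flat 0/1 lists; returns (pixel accuracy, IoU for class 1)."""
    correct = sum(p == t for p, t in zip(pred, truth))
    inter = sum(p == 1 and t == 1 for p, t in zip(pred, truth))
    union = sum(p == 1 or t == 1 for p, t in zip(pred, truth))
    return correct / len(pred), inter / union

# Tiny 8-pixel example: 1 = water, 0 = dry land.
pred  = [1, 1, 0, 0, 1, 0, 1, 1]
truth = [1, 1, 0, 0, 1, 1, 0, 1]
pa, iou = pixel_accuracy_and_iou(pred, truth)
```

IoU penalizes false positives and false negatives symmetrically, which is why it is often reported alongside plain accuracy for imbalanced classes like flood water.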
Journal Article
Skin Lesion Analysis towards Melanoma Detection Using Deep Learning Network
by Li, Yuexiang; Shen, Linlin
in deep convolutional network; Dermoscopy; fully-convolutional residual network
2018
Skin lesions constitute a severe disease burden globally. Early detection of melanoma in dermoscopy images significantly increases the survival rate. However, accurate recognition of melanoma is extremely challenging due to low contrast between lesions and skin, visual similarity between melanoma and non-melanoma lesions, and other factors. Hence, reliable automatic detection of skin tumors is very useful for increasing the accuracy and efficiency of pathologists. In this paper, we propose two deep learning methods to address three main tasks emerging in the area of skin lesion image processing, i.e., lesion segmentation (task 1), lesion dermoscopic feature extraction (task 2), and lesion classification (task 3). A deep learning framework consisting of two fully convolutional residual networks (FCRN) is proposed to simultaneously produce the segmentation result and the coarse classification result. A lesion index calculation unit (LICU) is developed to refine the coarse classification results by calculating the distance heat-map. A straightforward CNN is proposed for the dermoscopic feature extraction task. The proposed deep learning frameworks were evaluated on the ISIC 2017 dataset. Experimental results show the promising accuracies of our frameworks: 0.753 for task 1, 0.848 for task 2, and 0.912 for task 3.
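The LICU described above refines classification scores using a distance heat-map derived from the segmentation mask. The paper's exact weighting is not reproduced here; this only sketches the underlying ingredient, a brute-force distance map giving each pixel's distance to the nearest lesion pixel, on a tiny hypothetical mask.

```python
# Brute-force Manhattan distance map from a binary segmentation mask:
# each pixel's distance to the nearest lesion (value-1) pixel.

def distance_heatmap(mask):
    """mask: H x W of 0/1; returns per-pixel distance to nearest 1-pixel."""
    lesion = [(i, j) for i, row in enumerate(mask)
              for j, v in enumerate(row) if v == 1]
    return [[min(abs(i - a) + abs(j - b) for a, b in lesion)
             for j in range(len(mask[0]))] for i in range(len(mask))]

# Tiny mask with a single lesion pixel at the centre.
mask = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
heat = distance_heatmap(mask)
```

Pixels far from the segmented lesion get large distances, so a refinement unit can downweight their contribution to the final classification score.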
Journal Article
Foreground Detection with Deeply Learned Multi-Scale Spatial-Temporal Features
by Wang, Yao; Yu, Zujun; Zhu, Liqiang
in 3D convolutional networks; background modeling; deep learning
2018
Foreground detection, which extracts moving objects from videos, is an important and fundamental problem of video analysis. Classic methods often build background models based on hand-crafted features. Recent deep neural network (DNN) based methods can learn more effective image features by training, but most of them do not use temporal features, or use only simple hand-crafted ones. In this paper, we propose a new dual multi-scale 3D fully-convolutional neural network for foreground detection problems. It uses an encoder–decoder structure to establish a mapping from image sequences to pixel-wise classification results. We also propose a two-stage training procedure, which trains the encoder and decoder separately to improve the training results. With its multi-scale architecture, the network can learn deep and hierarchical multi-scale features in both the spatial and temporal domains, which is shown to have good invariance for both spatial and temporal scales. We used the CDnet dataset, which is currently the largest foreground detection dataset, to evaluate our method. The experimental results show that the proposed method achieves state-of-the-art results in most test scenes, compared to current DNN-based methods.
Journal Article
Skeleton-Based Spatio-Temporal U-Network for 3D Human Pose Estimation in Video
by Li, Weiwei; Chen, Shudong; Du, Rong
in 3D pose estimation; Data Compression; graph convolutional networks
2022
Despite the great progress in 3D pose estimation from videos, there is still a lack of effective means to extract spatio-temporal features of different granularity from complex dynamic skeleton sequences. To tackle this problem, we propose a novel skeleton-based spatio-temporal U-Net (STUNet) scheme to deal with spatio-temporal features at multiple scales for 3D human pose estimation in video. The proposed STUNet architecture consists of a cascade structure of semantic graph convolution layers and structural temporal dilated convolution layers, progressively extracting and fusing the spatio-temporal semantic features from fine-grained to coarse-grained. This U-shaped network achieves scale compression and feature squeezing by downscaling and upscaling, while abstracting multi-resolution spatio-temporal dependencies through skip connections. Experiments demonstrate that our model effectively captures comprehensive spatio-temporal features at multiple scales and achieves substantial improvements over mainstream methods on real-world datasets.
Journal Article