Catalogue Search | MBRL

Knee osteoarthritis severity prediction using an attentive multi-scale deep convolutional neural network

by Jain, Rohit Kumar , Sharma, Prasen Kumar , Gaj, Sibaji in Arthritis , Artificial neural networks , Computed tomography

2024

Knee Osteoarthritis (OA) is a destructive joint disease characterized by joint stiffness, pain, and functional disability, affecting millions of people across the globe. It is generally assessed by evaluating physical symptoms, medical history, and joint screening tests such as radiographs, Magnetic Resonance Imaging (MRI), and Computed Tomography (CT) scans. Unfortunately, these conventional methods are highly subjective, which is a barrier to detecting disease progression at an early stage. This paper presents a deep learning-based framework, namely OsteoHRNet, that automatically assesses Knee OA severity in terms of Kellgren and Lawrence (KL) grade classification from X-rays. As a primary novelty, the proposed approach is built upon one of the most recent deep models, the High-Resolution Network (HRNet), to capture the multi-scale features of knee X-rays. In addition, an attention mechanism has been incorporated to filter out counterproductive features and boost performance further. Our proposed model achieved a best multi-class accuracy of 71.74% and an MAE of 0.311 on the baseline cohort of the OAI dataset, which is a remarkable gain over the best existing published works. Additionally, Gradient-based Class Activation Maps (Grad-CAMs) have been employed to justify the proposed network learning.
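
The two figures reported above, multi-class accuracy and MAE over KL grades, can be illustrated with a minimal sketch. The grade lists below are invented for illustration and are not taken from the OAI dataset; the function names are likewise hypothetical.

```python
# Sketch of the two evaluation metrics named in the abstract.
# KL grades are ordinal integers from 0 (none) to 4 (severe).
# All data below is made up for illustration only.

def accuracy(predicted, actual):
    """Fraction of samples whose predicted grade matches exactly."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

def mean_absolute_error(predicted, actual):
    """Average distance in grades between prediction and ground truth."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

actual_grades = [0, 1, 2, 3, 4, 2]     # hypothetical ground truth
predicted_grades = [0, 1, 2, 2, 4, 4]  # hypothetical model output

print(accuracy(predicted_grades, actual_grades))
print(mean_absolute_error(predicted_grades, actual_grades))
```

Because KL grades are ordered, MAE complements accuracy: a prediction that is off by one grade is penalized less than one that is off by two, whereas accuracy treats both as equally wrong.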

Journal Article

Research on automatic identification and evaluation method of piano playing skills based on convolutional neural network

by Wu, Xiaoliang in 05C82 , Dynamic temporal regularization (DTW) algorithm , G-HRNet model

2025

This paper proposes a deep learning-based piano fingering recognition system that uses the YOLOv3 object detection algorithm to train the model. The images in the dataset are processed to obtain the network output, and the trained model is then used for target prediction. On this basis, a High-Resolution Network (HRNet) method is used to recognize piano playing techniques. At the same time, a Dynamic Time Warping (DTW) algorithm is introduced to calculate the similarity between the played techniques and standard techniques, completing the automatic evaluation. Finally, a performance dataset is used to verify the effectiveness of the proposed recognition method. The results show that the G-HRNet model can effectively extract the angular features of the upper and lower joints of the player’s fingers, with recognition accuracy above 95% on both the training and test sets. In addition, the model clearly shows how the pitch angle of the upper finger joints fluctuates over time. The scoring results for each finger of the subjects are realistic and reflect the flexibility and span of each finger well. In the case evaluation, the model’s recognition accuracy for video fingering and audio is above 85% and 90%, respectively, which meets the evaluation standard for this fingering.
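
The DTW similarity step can be sketched as follows. This is the textbook dynamic-programming form of DTW with an absolute-difference local cost, not necessarily the variant used in the paper; the joint-angle sequences and the function name are assumptions made for illustration.

```python
# Minimal Dynamic Time Warping (DTW) sketch.
# Assumption: each performance is a 1-D sequence of joint angles (degrees).

def dtw_distance(a, b):
    """Return the minimal accumulated cost of aligning sequence a with b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

standard = [10.0, 20.0, 30.0, 20.0, 10.0]      # hypothetical reference angles
played = [10.0, 10.0, 20.0, 30.0, 20.0, 10.0]  # same shape, slower start

print(dtw_distance(standard, played))
```

A smaller distance means the played sequence follows the reference more closely, even when the two differ in tempo, which is why DTW suits comparing performances of different lengths.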

Journal Article

Human Pose Estimation Based on Efficient and Lightweight High-Resolution Network (EL-HRNet)

by He, Duo , Yan, An , Yang, Shiqiang in Accuracy , Analysis , CBAM

2024

As an important direction in computer vision, human pose estimation has received extensive attention in recent years. As a classical human pose estimation method, the High-Resolution Network (HRNet) achieves effective estimation results. However, the complex structure of the model is not conducive to deployment under limited computing resources. Therefore, an improved Efficient and Lightweight HRNet (EL-HRNet) model is proposed. In detail, point-wise and grouped convolutions were used to construct a lightweight residual module, replacing the original 3 × 3 module to reduce the parameters. To compensate for the information loss caused by the network’s lightweight nature, the Convolutional Block Attention Module (CBAM) is introduced after the new lightweight residual module to construct the Lightweight Attention Basicblock (LA-Basicblock) module to achieve high-precision human pose estimation. To verify the effectiveness of the proposed EL-HRNet, experiments were carried out using the COCO2017 and MPII datasets. The experimental results show that the EL-HRNet model requires only 5 million parameters and 2.0 GFLOPs of computation and achieves an AP score of 67.1% on the COCO2017 validation set. In addition, the mean PCKh@0.5 is 87.7% on the MPII validation set, and EL-HRNet shows a good balance between model complexity and human pose estimation accuracy.

Journal Article

Accurate UAV Small Object Detection Based on HRFPN and EfficentVMamba

by Guo, Chengcheng , Wu, Shixiao , Guo, Hong in Accuracy , Algorithms , Analysis

2024

(1) Background: Small objects in Unmanned Aerial Vehicle (UAV) images are often scattered throughout various regions of the image, such as the corners, and may be blocked by larger objects, as well as susceptible to image noise. Moreover, due to their small size, these objects occupy a limited area in the image, resulting in a scarcity of effective features for detection. (2) Methods: To address the detection of small objects in UAV imagery, we introduce a novel algorithm called High-Resolution Feature Pyramid Network Mamba-Based YOLO (HRMamba-YOLO). This algorithm leverages the strengths of a High-Resolution Network (HRNet), EfficientVMamba, and YOLOv8, integrating a Double Spatial Pyramid Pooling (Double SPP) module, an Efficient Mamba Module (EMM), and a Fusion Mamba Module (FMM) to enhance feature extraction and capture contextual information. Additionally, a new Multi-Scale Feature Fusion Network, High-Resolution Feature Pyramid Network (HRFPN), and FMM improved feature interactions and enhanced the performance of small object detection. (3) Results: For the VisDroneDET dataset, the proposed algorithm achieved a 4.4% higher Mean Average Precision (mAP) compared to YOLOv8-m. The experimental results showed that HRMamba achieved a mAP of 37.1%, surpassing YOLOv8-m by 3.8% (Dota1.5 dataset). For the UCAS_AOD dataset and the DIOR dataset, our model had a mAP 1.5% and 0.3% higher than the YOLOv8-m model, respectively. To be fair, all the models were trained without a pre-trained model. (4) Conclusions: This study not only highlights the exceptional performance and efficiency of HRMamba-YOLO in small object detection tasks but also provides innovative solutions and valuable insights for future research.

Journal Article

EMR-HRNet: A Multi-Scale Feature Fusion Network for Landslide Segmentation from Remote Sensing Images

by Liu, Xiaosheng , Jin, Yuanhang , Huang, Xiaobin in Accuracy , Artificial intelligence , attention mechanism

2024

Landslides constitute a significant hazard to human life, safety and natural resources. Traditional landslide investigation methods demand considerable human effort and expertise. To address this issue, this study introduces an innovative landslide segmentation framework, EMR-HRNet, aimed at enhancing accuracy. Initially, a novel data augmentation technique, CenterRep, is proposed, not only augmenting the training dataset but also enabling the model to more effectively capture the intricate features of landslides. Furthermore, this paper integrates a RefConv and Multi-Dconv Head Transposed Attention (RMA) feature pyramid structure into the HRNet model, augmenting the model’s capacity for semantic recognition and expression at various levels. Last, the incorporation of the Dilated Efficient Multi-Scale Attention (DEMA) block substantially widens the model’s receptive field, bolstering its capability to discern local features. Rigorous evaluations on the Bijie dataset and the Sichuan and surrounding area dataset demonstrate that EMR-HRNet outperforms other advanced semantic segmentation models, achieving mIoU scores of 81.70% and 71.68%, respectively. Additionally, ablation studies conducted across the comprehensive dataset further corroborate the enhancements’ efficacy. The results indicate that EMR-HRNet excels in processing satellite and UAV remote sensing imagery, showcasing its significant potential in multi-source optical remote sensing for landslide segmentation.
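
The mIoU scores quoted above are a standard segmentation metric. A minimal sketch of how such a score is computed from flattened label maps follows; the tiny label lists and the function name are hypothetical, and skipping classes absent from both maps is an assumption of this sketch rather than something the abstract states.

```python
# Sketch of mean Intersection over Union (mIoU) for semantic segmentation.
# Inputs are flattened per-pixel class labels (e.g. 0 = background, 1 = landslide).

def mean_iou(predicted, actual, num_classes):
    """Average per-class IoU over the classes that occur in either map."""
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for p, a in zip(predicted, actual) if p == c and a == c)
        union = sum(1 for p, a in zip(predicted, actual) if p == c or a == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0

predicted_mask = [0, 0, 1, 1]  # hypothetical prediction
actual_mask = [0, 1, 1, 1]     # hypothetical ground truth

print(mean_iou(predicted_mask, actual_mask, num_classes=2))
```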

Journal Article

Instance Segmentation Method for Insulators in Complex Backgrounds Based on Improved SOLOv2

by Du, Xiaodong , Zhao, Shaokang , Ji, Yangpeng in Accuracy , Algorithms , Datasets

2025

To precisely delineate the contours of insulators in complex transmission line images obtained from Unmanned Aerial Vehicle (UAV) inspections and thereby facilitate subsequent defect analysis, this study proposes an instance segmentation framework predicated upon an enhanced SOLOv2 model. The proposed framework integrates a preprocessed edge channel, generated through the Non-Subsampled Contourlet Transform (NSCT), which augments the model’s capability to accurately capture the edges of insulators. Moreover, the input image resolution to the network is heightened to 1200 × 1600, permitting more detailed extraction of edges. Rather than the original ResNet + FPN architecture, the improved HRNet is utilized as the backbone to effectively harness multi-scale feature information, thereby enhancing the model’s overall efficacy. In response to the increased input size, there is a reduction in the network’s channel count, concurrent with an increase in the number of layers, ensuring an adequate receptive field without substantially escalating network parameters. Additionally, a Convolutional Block Attention Module (CBAM) is incorporated to refine mask quality and augment object detection precision. Furthermore, to bolster the model’s robustness and minimize annotation demands, a virtual dataset is crafted utilizing the fourth-generation Unreal Engine (UE4). Empirical results reveal that the proposed framework exhibits superior performance, with AP0.50 (90.21%), AP0.75 (83.34%), and AP[0.50:0.95] (67.26%) on a test set consisting of images supplied by the power grid. This framework surpasses existing methodologies and contributes significantly to the advancement of intelligent transmission line inspection.

Journal Article

MPE-HRNetL: A Lightweight High-Resolution Network for Multispecies Animal Pose Estimation

by Jiquan Shen , Yaning Jiang , Wei Wang in Analysis , Animal behavior , Artificial intelligence

2024

Animal pose estimation is crucial for animal health assessment, species protection, and behavior analysis. It is an inevitable and unstoppable trend to apply deep learning to animal pose estimation. In many practical application scenarios, pose estimation models must be deployed on edge devices with limited resources. Therefore, it is essential to strike a balance between model complexity and accuracy. To address this issue, we propose a lightweight network model, i.e., MPE-HRNet.L, by improving Lite-HRNet. The improvements are threefold. Firstly, we improve Spatial Pyramid Pooling-Fast and apply it and the improved version to different branches. Secondly, we construct a feature extraction module based on a mixed pooling module and a dual spatial and channel attention mechanism, and take the feature extraction module as the basic module of MPE-HRNet.L. Thirdly, we introduce a feature enhancement stage to enhance important features. The experimental results on the AP-10K dataset and the Animal Pose dataset verify the effectiveness and efficiency of MPE-HRNet.L.

Journal Article

MPE-HRNetL: A Lightweight High-Resolution Network for Multispecies Animal Pose Estimation

by Luo, Junwei , Shen, Jiquan , Jiang, Yaning in Accuracy , animal pose estimation , Animals

2024

Animal pose estimation is crucial for animal health assessment, species protection, and behavior analysis. It is an inevitable and unstoppable trend to apply deep learning to animal pose estimation. In many practical application scenarios, pose estimation models must be deployed on edge devices with limited resources. Therefore, it is essential to strike a balance between model complexity and accuracy. To address this issue, we propose a lightweight network model, i.e., MPE-HRNet.L, by improving Lite-HRNet. The improvements are threefold. Firstly, we improve Spatial Pyramid Pooling-Fast and apply it and the improved version to different branches. Secondly, we construct a feature extraction module based on a mixed pooling module and a dual spatial and channel attention mechanism, and take the feature extraction module as the basic module of MPE-HRNet.L. Thirdly, we introduce a feature enhancement stage to enhance important features. The experimental results on the AP-10K dataset and the Animal Pose dataset verify the effectiveness and efficiency of MPE-HRNet.L.

Journal Article

A Novel Intelligent Classification Method for Urban Green Space Based on High-Resolution Remote Sensing Images

by Xu, Zhiyu , Wang, Litao , Wang, Shixin in Accuracy , artificial intelligence , Automation

2020

The real-time, accurate, and refined monitoring of urban green space status is of great significance for the construction of the urban ecological environment and the improvement of urban ecological benefits. High-resolution imaging provides abundant information on ground objects, which makes urban green surface information more complicated. Existing classification methods struggle to meet the accuracy and automation requirements of high-resolution images. This paper proposes a deep learning classification method for urban green space based on phenological feature constraints, in order to make full use of the spectral and spatial information of green space provided by high-resolution remote sensing images (GaoFen-2) from different periods. The vegetation phenological features were added as auxiliary bands to the deep learning network for training and classification. We used the HRNet (High-Resolution Network) as our model and introduced the Focal Tversky Loss function to solve the sample imbalance problem. The experimental results show that introducing phenological features into HRNet model training effectively improves urban green space classification accuracy by resolving the misclassification of evergreen and deciduous trees. The F1-score improvements for deciduous trees, evergreen trees, and grassland were 0.48%, 4.77%, and 3.93%, respectively, which shows that combining vegetation phenology with high-resolution remote sensing images can improve deep learning-based urban green space classification.
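
The Focal Tversky Loss mentioned above can be sketched for a single binary class as follows. This uses one common formulation, (1 - TI)^gamma with TI = TP / (TP + alpha·FN + beta·FP); the abstract does not give the paper's exact form or hyperparameters, so the values of alpha, beta, and gamma here are assumptions, as are the sample inputs.

```python
# Sketch of a Focal Tversky loss for one binary class.
# probs: predicted foreground probabilities; labels: 0/1 ground truth.
# alpha, beta, gamma are illustrative defaults, not values from the paper.

def focal_tversky_loss(probs, labels, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    tp = sum(p * y for p, y in zip(probs, labels))        # soft true positives
    fn = sum((1 - p) * y for p, y in zip(probs, labels))  # soft false negatives
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))  # soft false positives
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    # Clamp at zero so rounding error cannot produce a negative base.
    return max(0.0, 1.0 - tversky) ** gamma

labels = [1, 1, 0, 0]
one_miss = [1.0, 0.0, 0.0, 0.0]         # one false negative
one_false_alarm = [1.0, 1.0, 1.0, 0.0]  # one false positive

print(focal_tversky_loss(one_miss, labels))
print(focal_tversky_loss(one_false_alarm, labels))
```

With alpha greater than beta, missed foreground pixels cost more than false alarms, which is the property that makes this family of losses useful when the class of interest is under-represented.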

Journal Article

Segmentation and counting of wheat spike grains based on deep learning and textural feature

by Ma, Xinming , Geng, Qing , Xu, Xin in Accuracy , Agricultural production , Algorithms

2023

Background: Grain count is crucial to wheat yield composition and to estimating yield parameters. However, traditional manual counting methods are time-consuming and labor-intensive. This study developed an advanced deep learning model for segmenting and counting wheat grains. The model was rigorously tested on three distinct wheat varieties, ‘Bainong 307’, ‘Xinmai 26’, and ‘Jimai 336’, and achieved unprecedented predictive counting accuracy. Method: The images of wheat ears were taken with a smartphone at the late stage of wheat grain filling. We used image processing technology to preprocess and normalize the images to 480 × 480 pixels. A CBAM-HRNet wheat grain segmentation and counting model based on the Convolutional Block Attention Module (CBAM) was constructed by combining deep learning, transfer learning, and an attention mechanism. Image processing algorithms and wheat grain texture features were used to build a grain counting and predictive counting model for wheat grains. Results: The CBAM-HRNet model using the CBAM was the best for wheat grain segmentation. Its segmentation accuracy of 92.04%, mean Intersection over Union (mIoU) of 85.21%, category mean pixel accuracy (mPA) of 91.16%, and recall rate of 91.16% demonstrate superior robustness compared to other models such as HRNet, PSPNet, DeepLabV3+, and U-Net. Method I for spike count, which doubles the number of grains on one side of the spike to determine the total number of grains, yields a coefficient of determination R² of 0.85, a mean absolute error (MAE) of 1.53, and a mean relative error (MRE) of 2.91. In contrast, Method II for spike count sums the number of grains on both sides to determine the total number of grains, yielding an R² of 0.92, an MAE of 1.15, and an MRE of 2.09%. Conclusions: The CBAM-HRNet image segmentation algorithm for wheat spike grains is a powerful solution that uses the CBAM to segment wheat spike grains and obtain richer semantic information. This model can effectively address the challenges of small target image segmentation and under-fitting problems in training. Additionally, the spike grain counting model can quickly and accurately predict the grain count of wheat, providing algorithmic support for efficient and intelligent wheat yield estimation.
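
The two counting rules described above, and the MAE and MRE used to compare them, can be sketched as follows. The per-side grain counts and true totals are invented for illustration, the function names are hypothetical, and the paper's actual pipeline (segmentation followed by texture-based prediction) is not reproduced here.

```python
# Sketch of the two spike-count rules and their error metrics.
# All counts below are hypothetical.

def count_method_one(side_a):
    """Method I: double the grains visible on one side of the spike."""
    return 2 * side_a

def count_method_two(side_a, side_b):
    """Method II: sum the grains counted on both sides of the spike."""
    return side_a + side_b

def mean_absolute_error(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def mean_relative_error_percent(predicted, actual):
    return 100 * sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)

sides = [(20, 22), (18, 18), (25, 23)]  # (side A, side B) counts per spike
true_totals = [42, 36, 48]              # manually counted grains per spike

method_one = [count_method_one(a) for a, _ in sides]
method_two = [count_method_two(a, b) for a, b in sides]

print(mean_absolute_error(method_one, true_totals))
print(mean_absolute_error(method_two, true_totals))
print(mean_relative_error_percent(method_one, true_totals))
```

Method I needs only one image per spike but assumes the two sides carry the same number of grains; Method II drops that assumption at the cost of a second image, which is consistent with the lower errors the abstract reports for it.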

Journal Article
