Catalogue Search | MBRL

Improved YOLO-Based Pulmonary Nodule Detection with Spatial-SE Attention and an Aspect Ratio Penalty

by Gao, Tianding , Gou, Jianping , Cheng, Nuo in Accuracy , Algorithms , Artificial intelligence

2025

The accurate identification of pulmonary nodules is critical for the early diagnosis of lung diseases; however, this task remains challenging due to inadequate feature representation and limited localization sensitivity. Current methodologies often utilize channel attention mechanisms and intersection over union (IoU)-based loss functions. Yet, they frequently overlook spatial context and struggle to capture subtle variations in aspect ratios, which hinders their ability to detect small objects. In this study, we introduce an improved YOLOV11 framework that addresses these limitations through two primary components: a spatial squeeze-and-excitation (SSE) module that concurrently models channel-wise and spatial attention to enhance the discriminative features pertinent to nodules, and an explicit aspect ratio penalty IoU (EAPIoU) loss that imposes a direct penalty on the squared differences in aspect ratios to refine the bounding box regression process. Comprehensive experiments conducted on the LUNA16, LungCT, and Node21 datasets reveal that our approach achieves superior precision, recall, and mean average precision (mAP) across various IoU thresholds, surpassing previous state-of-the-art methods while maintaining computational efficiency. Specifically, the proposed SSE module achieves a precision of 0.781 on LUNA16, while the EAPIoU loss boosts mAP@50 to 92.4% on LungCT, outperforming mainstream attention mechanisms and IoU-based loss functions. These findings underscore the effectiveness of integrating spatially aware attention mechanisms with aspect ratio-sensitive loss functions for robust nodule detection.

Journal Article

Exploring the endangerment mechanisms of Hipposideros pomona based on molecular phylogeographic methods

by Ma, Liqun , Bu, Yanzhen , Niu, Hongxing in Anthropogenic factors , Bats , Bayesian analysis

2023

The endangerment mechanisms of various species are a focus of studies on biodiversity and conservation biology. Hipposideros pomona is an endangered species, but the reasons behind its endangerment remain unclear. We investigated the endangerment mechanisms of H. pomona using mitochondrial DNA, nuclear DNA, and microsatellite loci markers. The results showed that the nucleotide diversity of mitochondrial DNA and heterozygosity of microsatellite markers were high (π = 0.04615, H_O = 0.7115), whereas the nucleotide diversity of the nuclear genes was low (THY: π = 0.00508, SORBS2: π = 0.00677, ACOX2: π = 0.00462, COPS7A: π = 0.00679). The phylogenetic tree and median‐joining network based on mitochondrial DNA sequences clustered the species into three clades, namely North Vietnam‐Fujian, Myanmar‐West Yunnan, and Laos‐Hainan clades. However, joint analysis of nuclear genes did not exhibit clustering. Analysis of molecular variance revealed a strong population genetic structure; IMa2 analysis did not reveal significant gene flow between all groups (p > .05), and isolation‐by‐distance analysis revealed a significant positive correlation between genetic and geographic distances (p < .05). The mismatch distribution analysis, neutral test, and Bayesian skyline plots revealed that the H. pomona population was relatively stable and exhibited a contraction trend. The results implied that H. pomona exhibits female philopatry and male‐biased dispersal. The Hengduan Mountains could have acted as a geographical barrier for gene flow between the North Vietnam‐Fujian clade and the Myanmar‐West Yunnan clade, whereas the Qiongzhou Strait may have limited interaction between the Hainan populations and other clades. The warm climate during the second interglacial Quaternary period (c. 0.33 Mya) could have been responsible for species differentiation, whereas the cold climate during the late Quaternary last glacial maximum (c. 10 ka BP) might have caused the overall contraction of species. The lack of significant gene flow in nuclear microsatellite loci markers among the different populations investigated reflects recent habitat fragmentation due to anthropogenic activities; thus, on‐site conservation of the species and restoration of gene flow corridors among populations need immediate implementation.

Journal Article

Deep learning model for the automated detection and classification of central canal and neural foraminal stenosis upon cervical spine magnetic resonance imaging

by Zhou, Yan , Jiang, Liang , Song, Xinhang in Adult , Aged , Analysis

2024

Background: A deep learning (DL) model that can automatically detect and classify cervical canal and neural foraminal stenosis using cervical spine magnetic resonance imaging (MRI) can improve diagnostic accuracy and efficiency. Methods: A method comprising region-of-interest (ROI) detection and cascade prediction was formulated for diagnosing cervical spinal stenosis based on a DL model. First, three part-specific convolutional neural networks were employed to detect the ROIs in different parts of the cervical MR images. Cascade prediction of the stenosis categories was subsequently performed to record the stenosis level and position on each patient slice. Finally, the results were combined to obtain a patient-level diagnostic report. Performance was evaluated based on the accuracy (ACC), area under the curve (AUC), sensitivity, specificity, F1 score, diagnosis time of the DL model, and recall rate for ROI localization. Results: The average recall rate of the ROI localization was 89.3% (neural foramen) and 99.7% (central canal) under the five-fold cross-validation of the DL model. In the dichotomous classification (normal or mild vs. moderate or severe), the ACC and AUC of the DL model were comparable to those of the radiologists, and the F1 score (84.8%) of the DL model was slightly higher than that of the radiologists (83.8%) for the central canal. Diagnosing whether the central canal or neural foramen of a slice is narrowed in the cervical MRI scan required an average of 15 and 0.098 s for the radiologists and DL model, respectively. Conclusions: The DL model demonstrated performance comparable to that of subspecialist radiologists for the detection and classification of central canal and neural foraminal stenosis on cervical spine MRI. Moreover, the DL model demonstrated a significant time-saving ability.

Journal Article

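Pathological staging systems for primary biliary cholangitis

2022

Primary biliary cholangitis (PBC) is a chronic autoimmune liver disease accompanied by cholestasis, and its histological hallmark is nonsuppurative cholangitis. This article briefly describes the respective advantages and limitations of the traditional PBC pathological staging systems, such as the Rubin, Scheuer, and Ludwig staging systems, and of the latest Nakanuma staging system. The Nakanuma system refines the histological grading and staging criteria and reduces the chance of missed diagnosis due to sampling error, thereby providing more adequate diagnostic and prognostic information for clinical practice. Using the new and old staging systems in combination is more helpful for guiding the diagnosis and treatment of PBC and related research.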

Journal Article

Automated diagnostic of cervical spondylosis on multimodal medical images with a multi-task deep learning model

by Zhou, Yan , Ni, Ming , Zhao, Fangbo

2026

Cervical spondylosis is one of the most common degenerative diseases, seriously affecting quality of life. Unlike diseases with explicit lesions like cancer, hydroncus, or fracture, the degeneration of the cervical spine cannot be explicitly detected from the appearance of medical images, and interpreting the subtle clues requires doctors with extensive experience. However, the extremely high incidence of cervical spondylosis coincides with a serious shortage of experienced doctors and uneven distribution of medical resources, hindering early diagnosis. We propose a cascade-ensemble deep learning framework for cervical spondylosis diagnosis. The framework integrates vertebral body detection and degenerative diagnosis through a cascading architecture, and jointly trains an ensemble of degenerative indicators in a multi-task learning manner. We demonstrate that deep learning models are more sensitive to distance- and position-based indicators than to angle-based ones. In intervertebral stenosis analysis, our method achieves performance comparable to that of senior radiologists and clinicians, with much faster diagnostic speed.

Journal Article

Research on the Influence of New Generation of Information Technology on Contemporary Enterprise Logistics Management Information System

by Zhang, Jiong , Zhang, Zongguo , Song, Xinhang in Electronic engineering , Image processing , Information management

2020

At present, information technology is widely used in various fields, which has also led to the modernization of logistics information management in China. The new generation of information technology has penetrated different professional fields, such as mechanical engineering, electronic engineering, communication engineering, and image processing. Through information technology, enterprise logistics management has been moving towards automation and intelligence, which is also the development trend of modern enterprises. At the same time, the new generation of information technology promotes the application of modern logistics management, which will help enterprises move towards logistics integration. First, this paper analyzes the important characteristics of modern logistics management. Then, it analyzes the information technology used in logistics management. Finally, some suggestions are put forward.

Journal Article

Scene Recognition with Prototype-agnostic Scene Layout

by Chen, Gongwei , Jiang, Shuqiang , Zeng, Haitao in Convolution , Datasets , Graphical representations

2019

Exploiting the spatial structure in scene images is a key research direction for scene recognition. Due to the large intra-class structural diversity, building and modeling a flexible structural layout that adapts to various image characteristics is a challenge. Existing structural modeling methods in scene recognition either focus on predefined grids or rely on learned prototypes, both of which have limited representational ability. In this paper, we propose the Prototype-agnostic Scene Layout (PaSL) construction method to build the spatial structure for each image without conforming to any prototype. Our PaSL can flexibly capture the diverse spatial characteristics of scene images and has considerable generalization capability. Given a PaSL, we build a Layout Graph Network (LGN) where regions in the PaSL are defined as nodes and two kinds of independent relations between regions are encoded as edges. The LGN aims to incorporate two topological structures (formed in spatial and semantic similarity dimensions) into image representations through graph convolution. Extensive experiments show that our approach achieves state-of-the-art results on the widely recognized MIT67 and SUN397 datasets without multi-model or multi-scale fusion. Moreover, we also conduct experiments on one of the largest-scale datasets, Places365. The results demonstrate that the proposed method generalizes well and obtains competitive performance.

Paper

Learning Effective RGB-D Representations for Scene Recognition

by Chen, Chengpeng , Jiang, Shuqiang , Song, Xinhang in Architecture , Artificial neural networks , Color imagery

2018

Deep convolutional networks (CNNs) can achieve impressive results on RGB scene recognition thanks to large datasets such as Places. In contrast, RGB-D scene recognition is still underdeveloped, due to two limitations of RGB-D data that we address in this paper. The first limitation is the lack of depth data for training deep learning models. Rather than fine-tuning or transferring RGB-specific features, we address this limitation by proposing an architecture and a two-step training approach that directly learns effective depth-specific features using weak supervision via patches. The resulting RGB-D model also benefits from more complementary multimodal features. Another limitation is the short range of depth sensors (typically 0.5 m to 5.5 m), resulting in depth images not capturing distant objects in the scenes that RGB images can. We show that this limitation can be addressed by using RGB-D videos, where more comprehensive depth information is accumulated as the camera travels across the scene. Focusing on this scenario, we introduce the ISIA RGB-D video dataset to evaluate RGB-D scene recognition with videos. Our video recognition architecture combines convolutional and recurrent neural networks (RNNs) that are trained in three steps with increasingly complex data (i.e., patches, frames and sequences) to learn effective features. Our approach obtains state-of-the-art performance on RGB-D image (NYUD2 and SUN RGB-D) and video (ISIA RGB-D) scene recognition.

Paper

Depth CNNs for RGB-D scene recognition: learning from scratch better than transferring from RGB-CNNs

by Herranz, Luis , Jiang, Shuqiang , Song, Xinhang in Artificial neural networks , Color imagery , Datasets

2018

Scene recognition with RGB images has been extensively studied and has reached very remarkable recognition levels, thanks to convolutional neural networks (CNN) and large scene datasets. In contrast, current RGB-D scene data is much more limited, so RGB-D methods often leverage large RGB datasets by transferring pretrained RGB CNN models and fine-tuning with the target RGB-D dataset. However, we show that this approach has the limitation of hardly reaching bottom layers, which is key to learning modality-specific features. In contrast, we focus on the bottom layers, and propose an alternative strategy to learn depth features combining local weakly supervised training from patches followed by global fine-tuning with images. This strategy is capable of learning very discriminative depth-specific features with limited depth images, without resorting to Places-CNN. In addition, we propose a modified CNN architecture to further match the complexity of the model and the amount of data available. For RGB-D scene recognition, depth and RGB features are combined by projecting them into a common space and further learning a multilayer classifier, which is jointly optimized in an end-to-end network. Our framework achieves state-of-the-art accuracy on NYU2 and SUN RGB-D on both depth-only and combined RGB-D data.

Paper

Hierarchical Object-to-Zone Graph for Object Navigation

by Zhang, Sixian , Chu, Yakui , Jiang, Shuqiang in Deep learning , Navigation , Nodes

2021

The goal of object navigation is to reach the expected objects according to visual information in unseen environments. Previous works usually implement deep models to train an agent to predict actions in real-time. However, in an unseen environment, when the target object is not in the egocentric view, the agent may not be able to make wise decisions due to the lack of guidance. In this paper, we propose a hierarchical object-to-zone (HOZ) graph to guide the agent in a coarse-to-fine manner, and an online-learning mechanism is also proposed to update HOZ according to the real-time observation in new environments. In particular, the HOZ graph is composed of scene nodes, zone nodes and object nodes. With the pre-learned HOZ graph, the real-time observation and the target goal, the agent can constantly plan an optimal path from zone to zone. In the estimated path, the next potential zone is regarded as a sub-goal, which is also fed into the deep reinforcement learning model for action prediction. Our methods are evaluated on the AI2-Thor simulator. In addition to the widely used evaluation metrics SR and SPL, we also propose a new evaluation metric, SAE, that focuses on the effective action rate. Experimental results demonstrate the effectiveness and efficiency of our proposed method.

Paper
