Catalogue Search | MBRL

CWC-transformer: a visual transformer approach for compressed whole slide image classification

by Yang, Yun , Duan, Yongchun , Wang, Yaowei in Annotations , Artificial Intelligence , Artificial neural networks

2025

The rapid development of Artificial Intelligence (AI) technology accelerates the application of computational pathology in clinical decision-making. Due to the restriction of computing resources and annotation information, it is challenging for AI-based computational pathology methods to effectively process and analyze the gigapixel whole slide image (WSI). Conventional methods utilize multiple instance learning (MIL) to convert WSI into patches for classification. However, without the patch-level annotation, it is difficult to extract discriminative features, even with pre-trained networks. Furthermore, forcibly applying the patch-level conversion will break the pathological characteristics of WSI from the spatial structure. In this study, we present a two-stage framework named Compressed WSI Classification (CWC-Transformer) to effectively solve the problems of feature extraction and spatial information loss in WSI classification. In the compression stage, we adopt contrastive learning to present a feature compression method, which not only extracts the discriminative features but also decreases the data deviation caused by staining and scanning inconsistency. In the learning stage, we extend the advantages of the convolutional neural network and transformer mechanism to enhance the correlations between local and global information to provide the final results jointly. Experiments on three large-scale public datasets of different tasks show that our proposed framework outperforms other advanced methods in terms of robustness and interpretation.

Journal Article

Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation

by Togo, Ren , Ogawa, Takahiro , Zhu, He in Automation , Computational linguistics , computer vision

2023

Auxiliary clinical diagnosis has been researched to solve unevenly and insufficiently distributed clinical resources. However, auxiliary diagnosis is still dominated by human physicians, and how to make intelligent systems more involved in the diagnosis process is gradually becoming a concern. An interactive automated clinical diagnosis with a question-answering system and a question generation system can capture a patient’s conditions from multiple perspectives with less physician involvement by asking different questions to drive and guide the diagnosis. This clinical diagnosis process requires diverse information to evaluate a patient from different perspectives to obtain an accurate diagnosis. Recently proposed medical question generation systems have not considered diversity. Thus, we propose a diversity learning-based visual question generation model using a multi-latent space to generate informative question sets from medical images. The proposed method generates various questions by embedding visual and language information in different latent spaces, whose diversity is trained by our newly proposed loss. We have also added control over the categories of generated questions, making the generated questions directional. Furthermore, we use a new metric named similarity to accurately evaluate the proposed model’s performance. The experimental results on the Slake and VQA-RAD datasets demonstrate that the proposed method can generate questions with diverse information. Our model works with an answering model for interactive automated clinical diagnosis and generates datasets to replace the process of annotation that incurs huge labor costs.

Journal Article

Towards a guideline for evaluation metrics in medical image segmentation

by Kramer, Frank , Müller, Dominik , Soto-Rey, Iñaki in Accuracy , Algorithms , Artificial Intelligence

2022

In the last decade, research on artificial intelligence has seen rapid growth with deep learning models, especially in the field of medical image segmentation. Various studies demonstrated that these models have powerful prediction capabilities and achieved similar results as clinicians. However, recent studies revealed that the evaluation in image segmentation studies lacks reliable model performance assessment and showed statistical bias by incorrect metric implementation or usage. Thus, this work provides an overview and interpretation guide on the following metrics for medical image segmentation evaluation in binary as well as multi-class problems: Dice similarity coefficient, Jaccard, Sensitivity, Specificity, Rand index, ROC curves, Cohen’s Kappa, and Hausdorff distance. Furthermore, common issues like class imbalance and statistical as well as interpretation biases in evaluation are discussed. As a summary, we propose a guideline for standardized medical image segmentation evaluation to improve evaluation quality, reproducibility, and comparability in the research field.
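Four of the overlap metrics named in this abstract can be computed from the confusion counts of a binary mask. The following is a minimal sketch using the standard textbook definitions, not the authors' implementation; the function names and the flattened-list mask format are assumptions made for illustration.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two equal-length binary masks (flattened)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return NaN instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else float("nan")


def segmentation_metrics(pred, truth):
    """Standard definitions of Dice, Jaccard, sensitivity and specificity."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    return {
        "dice": safe_div(2 * tp, 2 * tp + fp + fn),
        "jaccard": safe_div(tp, tp + fp + fn),
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
    }


# Hypothetical 8-pixel example: 2 true positives, 1 false positive,
# 1 false negative, 4 true negatives.
example = segmentation_metrics([1, 1, 0, 0, 1, 0, 0, 0],
                               [1, 0, 0, 0, 1, 1, 0, 0])
```

Because true negatives enter only the specificity term, a mask dominated by background can score a high specificity while Dice and Jaccard stay low, which is one form of the class-imbalance issue the abstract mentions.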

Journal Article

Multi-instance discriminative contrastive learning for brain image representation

by Liu, Shuhui , Shang, Xuequn , Qu, Xiran in Artificial Intelligence , Brain , Computational Biology/Bioinformatics

2025

This paper focuses on the problem of learning discriminative representations for brain images, which is a critical task toward understanding brain development. Related studies usually extract manual and statistical features from functional magnetic resonance images (MRIs) to differentiate brain patterns. However, these features fail to consider the implicit and high-order variances, and the existing representation methods often suffer from weak manual features and small sample sizes. This paper introduces a weakly-supervised representation learning model, dubbed multi-instance discriminative contrastive learning (MIDCL), to identify the different MRI patterns. MIDCL yields two versions for each instance of a subject by introducing noise patterns and then obtains latent representations for them by training an encoder network and a projection network. Due to the multi-instance problem, MIDCL simultaneously minimizes an unsupervised contrastive loss (UCL) between the two representations at the level of instances and a supervised contrastive loss (SCL) between the two concatenated feature vectors at the level of subjects. We finally conducted experiments on two publicly available brain image datasets. The experimental results show that MIDCL benefits from both UCL and SCL, thereby improving brain image classification performance in comparison with the state-of-the-art models.
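The abstract does not state which contrastive formulation MIDCL uses. As an illustration of the general idea of an instance-level contrastive loss between two noisy versions of each instance, here is a minimal sketch of one common form (InfoNCE with cosine similarity). The function names, the temperature value, and the list-of-vectors input format are assumptions, not details from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(views_a, views_b, temperature=0.5):
    """Mean InfoNCE loss: row i of views_a should match row i of views_b.

    Every other row of views_b acts as a negative for instance i.
    """
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n
```

When all representations are identical the loss equals log(n), since the positive pair cannot be told apart from the negatives; it approaches zero as matched pairs become much more similar than mismatched ones.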

Journal Article

Diabetic retinopathy classification based on dense connectivity and asymmetric convolutional neural network

by Chen, Jiaran , Zhang, Xinying , Peng, Yang in Accuracy , Algorithms , Artificial Intelligence

2025

Diabetic retinopathy (DR) is the leading cause of blindness in diabetics. The low contrast and microscopic nature of the lesions lead to a high false positive rate for automated DR screening. To address this issue, we propose a neural network named AC-DenseNet for the five-stage DR classification. In order to exploit the shallow features and enhance the feature extraction performance, dense connectivity is added to the convolution layer of AC-DenseNet. For the convolution layer to be more robust for DR detection in rotated or flipped pictures, asymmetric convolution branches are also introduced. In addition, attention mechanisms and auxiliary classifiers are incorporated into the network for the improvement of the performance of DR classification. We validate AC-DenseNet on the enhanced Kaggle dataset. The results show that AC-DenseNet can achieve 88.8% accuracy, 97.1% specificity, and 88.7% sensitivity, demonstrating that our model outperforms several state-of-the-art algorithms.

Journal Article

DH-GAC: deep hierarchical context fusion network with modified geodesic active contour for multiple neurofibromatosis segmentation

by Pu, Bin , Wu, Xiangqiong , Tan, Guanghua in Artificial Intelligence , Boundaries , Computational Biology/Bioinformatics

2025

Accurately and simultaneously delineating all lesions is vital yet challenging for computer-aided diagnosis of multiple neurofibromatosis (NF). However, existing CNN-based segmentation methods pay little attention to weak boundaries. Moreover, due to the intensity-inhomogeneous distribution of medical images, the ambiguous boundaries, and the highly variable locations, sizes and shapes of the lesions, delineating multiple lesions simultaneously remains quite challenging. To address these challenges, we introduce a novel end-to-end segmentation framework for multiple NF, deep hierarchical geodesic active contour (DH-GAC). It leverages the elaborately designed deep hierarchical context fusion network (DH-CFN) to improve the generalization and robustness of DH-GAC, and the modified geodesic active contour (MGAC) to delineate all lesions as precisely as possible. Specifically, it employs DH-CFN to predict specific parameter maps of each image for MGAC and feeds them into the energy function of MGAC to delineate NF lesions, which makes DH-GAC end-to-end trainable. Moreover, to improve the generalization of DH-GAC, we adopt two different settings to initialize the surface for DH-GAC. Experimental results demonstrate that DH-GAC not only improves the segmentation precision, but also overcomes the intrinsic drawback of classical geodesic active contour in boundary delineation.

Journal Article

mixDA: mixup domain adaptation for glaucoma detection on fundus images

by Yan, Ming , Peng, Xi , Zeng, Zeng in Adaptation , Artificial Intelligence , Artificial neural networks

2025

Deep neural networks have achieved promising results for automatic glaucoma detection on fundus images. Nevertheless, the intrinsic discrepancy across glaucoma datasets is challenging for data-driven neural network approaches. This discrepancy leads to a domain gap that affects model performance and degrades model generalization capability. Existing domain adaptation-based transfer learning methods mostly fine-tune pretrained models on target domains to reduce the domain gap. However, this feature learning-based adaptation method is implicit, and it is not an optimal solution for transfer learning on the diverse glaucoma datasets. In this paper, we propose a mixup domain adaptation (mixDA) method that bridges domain adaptation with domain mixup to improve model performance across divergent glaucoma datasets. Specifically, the domain adaptation reduces the domain gap of glaucoma datasets in transfer learning in an explicit manner. Meanwhile, the domain mixup further minimizes the risk of outliers after domain adaptation and improves the model generalization capability. Extensive experiments show the superiority of mixDA on several public glaucoma datasets. Moreover, our method outperforms state-of-the-art methods by a large margin on four glaucoma datasets: REFUGE, LAG, ORIGA, and RIM-ONE.
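The abstract does not describe how the domain mixup is computed. As background, generic mixup forms a convex combination of two samples and of their labels, with the mixing weight drawn from a Beta distribution. The sketch below shows that generic operation applied to one source-domain and one target-domain sample; pairing samples across domains this way, the function names, and the default `alpha` are assumptions for illustration, not the mixDA procedure.

```python
import random


def sample_lambda(alpha, rng):
    """Draw the mixing weight from Beta(alpha, alpha), as in generic mixup."""
    return rng.betavariate(alpha, alpha)


def mixup_pair(x_a, y_a, x_b, y_b, lam):
    """Blend two samples and their one-hot labels with weight lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


# Hypothetical usage: mix one source-domain and one target-domain sample.
rng = random.Random(0)
lam = sample_lambda(alpha=0.4, rng=rng)
mixed_x, mixed_y = mixup_pair([0.0, 1.0], [1.0, 0.0],   # source image, label
                              [1.0, 1.0], [0.0, 1.0],   # target image, label
                              lam)
```

The mixed label stays a valid probability vector because both inputs are one-hot and the weights sum to one.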

Journal Article

SureUnet: sparse autorepresentation encoder U-Net for noise artifact suppression in low-dose CT

by Qiang, Jun , Liu, Jin , Zhang, Yikun in Artificial Intelligence , Coders , Computational Biology/Bioinformatics

2025

Low-dose computed tomography (LDCT) is desirable because of the risks of ionizing radiation, but the resulting images suffer from serious streak artifacts and spot noise. Recently, deep learning (DL)-based methods have emerged as promising alternatives for medical image processing. However, most DL-based methods are built intuitively and lack interpretability, and it is difficult to effectively separate the artifacts and noise in LDCT images. Obtaining diagnostically useful images, especially when using a low-dose scanner protocol, remains an open challenge. To improve the quality of LDCT images, we developed a novel processing network called the sparse autorepresentation U-Net (SureUnet). First, inspired by multilayer convolutional sparse coding (CSC), we constructed a sparse autorepresentation encoder to sufficiently capture and represent hierarchical image features. Then, we chose the widely used U-Net model for sparse autorepresentation block applications and designed SureUnet by adding a feature decoding block. Therefore, every module has well-defined interpretability in our network. Additionally, hybrid loss functions were specifically designed, including the mean absolute error, edge loss and perceptual loss. Through the cooperation of multiple loss functions, the noise artifact suppression effect of the network was improved. The visual results obtained on the MAYO and UIH datasets show that the proposed method's noise artifact suppression effect was more significant. The quantitative results showed promising improvement levels compared to those of the other state-of-the-art methods. The SureUnet model significantly outperformed the compared methods on two datasets, with margins of 0.4 dB for the PSNR, 0.007 for the SSIM, and 1.6 for the FID on the MAYO dataset and margins of 0.5 dB for the PSNR, 0.004 for the SSIM and 2.9 for the FID on the UIH dataset. This work paves the way for sparse autorepresentation in DL for processing LDCT images. Experimental results have demonstrated the competitive performance of SureUnet in terms of noise suppression, structural fidelity and visual impression improvement.
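To put the reported PSNR margins in context, the sketch below computes PSNR from its standard definition, 10·log10(MAX² / MSE), and converts a gain in dB into the implied ratio of mean squared errors. The function names and the flattened-list image format are assumptions; the paper's evaluation code is not shown in this listing.

```python
import math


def mse(reference, estimate):
    """Mean squared error between two equal-length pixel sequences."""
    if len(reference) != len(estimate):
        raise ValueError("images must have the same number of pixels")
    return sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for pixels in [0, max_value]."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / err)


def mse_ratio_for_gain(gain_db):
    """MSE ratio (better / worse) implied by a PSNR gain at equal peak value."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition a 0.4 dB gain corresponds to an MSE ratio of about 0.91, that is, roughly 9% lower mean squared error at the same peak value.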

Journal Article

Most relevant point query on road networks

by Yang, Shenghong , Qin, Yunchuan , Yang, Zhibang in Algorithms , Apexes , Artificial Intelligence

2025

Graphs are widespread in many real-life practical applications. A fundamental and popular research problem on graphs is investigating the relation between two given vertices. The relationship between nodes in the graph can be measured by the shortest distance. Moreover, the number of paths is also a popular metric to assess the relationship of different nodes. In many location-based services, users make decisions on the basis of both metrics. To address this problem, we propose a new hybrid metric based on the number of paths with a distance constraint for road networks, which are special graphs. Based on it, a most relevant node query on road networks is identified. To handle this problem, we first propose a Shortest-Distance Constrained DFS, which uses the shortest distance to prune unqualified nodes. To further improve query efficiency, we present the Batch Query DFS algorithm, which needs only one DFS search. Our experiments on four real-life road networks demonstrate the performance of the proposed algorithms.
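The abstract does not define the hybrid metric precisely. One plausible reading is to count the simple paths between two nodes whose total length stays within a distance bound, and to prune the depth-first search whenever the distance travelled plus the shortest remaining distance already exceeds that bound. The sketch below implements that reading on an undirected weighted graph; the metric definition, the function names, and the adjacency-dict format are all assumptions, not the authors' algorithm.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from source in a graph given as {node: {nbr: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def count_bounded_paths(graph, s, t, bound):
    """Count simple s-t paths of length <= bound, pruning by shortest distance."""
    inf = float("inf")
    to_target = dijkstra(graph, t)  # undirected graph: distance to t equals distance from t
    if to_target.get(s, inf) > bound:
        return 0
    visited = {s}

    def dfs(u, travelled):
        if u == t:
            return 1
        total = 0
        for v, w in graph[u].items():
            if v in visited:
                continue
            # Prune: even the shortest continuation from v would exceed the bound.
            if travelled + w + to_target.get(v, inf) > bound:
                continue
            visited.add(v)
            total += dfs(v, travelled + w)
            visited.discard(v)
        return total

    return dfs(s, 0.0)


# Hypothetical four-node road network.
roads = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "C": 1, "D": 1},
    "C": {"A": 2, "B": 1, "D": 2},
    "D": {"B": 1, "C": 2},
}
```

In this toy network there is one A-to-D path of length 2 and three of length 4, so tightening the bound from 4 to 3 cuts the count from four paths to one. The recursion depth grows with path length, so an explicit stack would be needed on large networks.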

Journal Article

CSPP-IQA: a multi-scale spatial pyramid pooling-based approach for blind image quality assessment

by Zhou, Xiaokang , Chen, Jingjing , Yan, Ke in Algorithms , Artificial Intelligence , Artificial neural networks

2022

The traditional image quality assessment (IQA) methods are usually based on convolutional neural networks (CNNs). For these IQA methods using CNNs, limited by the feature size of the fully connected layer, the input image needs to be tailored to a pre-defined size, which usually destroys the original structure and content of the input image and thus reduces the accuracy of the quality assessment. In this paper, a blind image quality assessment method (named CSPP-IQA), which is based on multi-scale spatial pyramid pooling, is proposed. CSPP-IQA allows inputting the original image when assessing the image quality without any image adjustment. Moreover, by using the convolutional block attention module and image understanding module, CSPP-IQA achieved better accuracy, generalization and efficiency than traditional IQA methods. The results of experiments run on real-scene IQA datasets in this study verified the effectiveness and efficiency of CSPP-IQA.
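Spatial pyramid pooling removes the fixed-input-size constraint because it pools a feature map over a fixed number of bins per pyramid level, whatever the map's height and width. The sketch below shows generic max-based spatial pyramid pooling on a single-channel feature map stored as nested lists; the pyramid levels, the function name, and the input format are assumptions, and this is not the CSPP-IQA architecture.

```python
def spp_max(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2-D feature map into n*n bins per level and concatenate.

    The output length is sum(n * n for n in levels), independent of input size.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end keep every bin non-empty.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Two hypothetical feature maps of different sizes yield vectors of equal length.
small = [[float(r * 3 + c) for c in range(3)] for r in range(3)]
large = [[float(r * 7 + c) for c in range(7)] for r in range(5)]
```

With levels (1, 2, 4) both maps above produce 21 values, which is what lets a fully connected layer follow the pooling without resizing or cropping the input image.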

Journal Article
