Catalogue Search | MBRL

A False-Positive-Centric Framework for Object Detection Disambiguation

by Nitsche, Frank O. , Baur, Jasper in Classification , Data acquisition , Evacuations & rescues

2025

Existing frameworks for classifying the fidelity of object detection tasks do not consider false positive likelihood and object uniqueness. Inspired by the Detection, Recognition, Identification (DRI) framework proposed by Johnson (1958), we propose a modified framework that defines three categories: visible anomaly, identifiable anomaly, and unique identifiable anomaly (AIU), as determined by human interpretation of imagery or geophysical data. Compared to the DRI Index, these categories are designed to better capture false positive rates and to emphasize the importance of distinguishing unique from non-unique targets. We then analyze visual, thermal, and multispectral UAV imagery collected over a seeded minefield and apply the AIU Index to the landmine detection use case. We find that, for the detection and identification of surface landmines and UXO, RGB imagery provided the most value per pixel, achieving a 100% identifiable anomaly rate at 125 pixels on target and the highest unique target classification compared to thermal and multispectral imaging. We also investigate how the AIU Index can be applied to machine learning, both for selecting training data and for informing the action required after object detection bounding boxes are predicted. Overall, the AIU Index provides essential context for false-positive-sensitive or resolution-poor object detection tasks, with applications in modality comparison, machine learning, and remote sensing data acquisition.

Journal Article

Image Interpretability of nSight-1 Nanosatellite Imagery for Remote Sensing Applications

by Mhangara, Paidamwoyo , Mapurisa, Willard , Mudau, Naledzani in image interpretability , image quality , land use and land cover applications

2020

Nanosatellites are increasingly being used in space-related applications to demonstrate and test the scientific capability and engineering ingenuity of space-borne instruments, and for educational purposes, owing to their low manufacturing costs, cheaper launch costs, and short development time. The use of CubeSats to demonstrate earth imaging capability has also grown in the last two decades. In 2017, a South African company known as Space Commercial Services launched a low-orbit nanosatellite named nSight-1. The demonstration nanosatellite has three payloads: a modular SCS Gecko imaging payload, the FIPEX atmospheric science instrument developed by the University of Dresden, and a radiation mitigation VHDL coding experiment supplied by Nelson Mandela University. The Gecko imager has a swath width of 64 km and captures 30 m spatial resolution images using the red, green, and blue (RGB) spectral bands. The objective of this study was to assess the interpretability of nSight-1 in the spatial dimension using Landsat 8 as a reference and to recommend potential earth observation applications for the mission. The Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) was used to compute the image quality of nSight-1 and Landsat 8 imagery in the spatial domain, and the National Imagery Interpretability Rating Scale (NIIRS) method was used to quantify the interpretability of the images. A visual interpretation was used to propose some potential applications for the nSight-1 images. The results indicate that Landsat 8 OLI images had significantly better image quality scores and NIIRS results than nSight-1. Landsat 8 had a mean image quality score of 19.299 while nSight-1 achieved a mean of 25.873 (lower BRISQUE scores indicate better quality). Landsat 8 had a NIIRS mean of 2.345 while nSight-1 had a mean of 1.622. The superior image quality and image interpretability of Landsat could be attributed to the mature optical design of the Landsat 8 satellite, which is intended for operational purposes. Landsat 8 has a GSD of 30 m compared to 32 m on nSight-1. The image degradation resulting from the lossy compression from 12-bit to 8-bit implemented on nSight-1 also has a negative impact on visual quality and interpretability. Although Landsat 8 clearly has the better visual quality and NIIRS scores, the results also show that nSight-1 images are still very good: the categorical ratings place them at good to excellent quality, and a NIIRS mean of 1.6 indicates that the images are interpretable. Our interpretation of the imagery shows that the data has considerable potential for use in geo-visualization and in cartographic land use and land cover mapping applications. The image analysis also showed that the nSight-1 sensor captures features related to structural geology, geomorphology, and topography quite prominently.

Journal Article

A Novel Mixed Stimulation Pattern for Balanced Pulmonary EIT Imaging Performance

by Zhao, Zhanqi , Liu, Zilong , Zhu, Heyao in Accuracy , anti-noise performance , Comparative analysis

2026

Pulmonary electrical impedance tomography (EIT) offers non-invasive and real-time imaging in a compact device size, making it valuable for pulmonary ventilation monitoring. However, conventional EIT stimulation patterns face a trade-off dilemma between anti-noise performance and image interpretability. To address this challenge, we propose a novel mixed stimulation pattern that integrates opposite and adjacent stimulation patterns with a tunable weight ratio. The results of simulations and human experiments (involving 30 subjects) demonstrated that the mixed stimulation pattern uses 200 stimulation–measurement channels, preserves a high signal-to-noise ratio, improves lung separation, and reduces artifacts compared with the opposite and adjacent stimulation patterns. It maintained stable imaging at 600 μA of stimulation current amplitude (equivalent to 1 mA) and preserved most imaging and clinical indicators’ stability at 200 μA (except GI/RVDSD). The adjustable weight ratio enables imaging performance to be flexibly adjusted according to different noise levels in acquisition environments. In conclusion, the pattern we proposed offers a superior alternative to traditional patterns, achieving a favorable balance of real-time capability, anti-noise performance, and image interpretability for pulmonary EIT imaging.

Journal Article

COMPARATIVE STUDY OF THE DIFFERENT VERSIONS OF THE GENERAL IMAGE QUALITY EQUATION

by Valenzuela, A. Q. , Reyes, J. C. G. in Comparative studies , Confusion , Image processing

2019

The General Image Quality Equation (GIQE) is an analytical tool derived by regression modelling that is routinely employed to gauge the interpretability of raw and processed images. It computes the most popular quantitative metric for evaluating image quality: the National Imagery Interpretability Rating Scale (NIIRS). There are three known versions of this equation: GIQE 3, GIQE 4 and GIQE 5, although the last one is scarcely known. The variety of versions, with their subtleties, discontinuities and inconsistencies, generates confusion and problems among users. The first objective of this paper is to identify typical sources of confusion in the use of the GIQE, suggesting novel solutions to the main problems found in its application and presenting the derivation of a continuous form of GIQE 4, called GIQE 4C, that provides better correlation with GIQE 3 and GIQE 5. The second objective of this paper is to compare the predictions of GIQE 4C and GIQE 5 regarding the maximum image quality rating that can be achieved by image processing techniques. It is concluded that the transition from GIQE 4 to GIQE 5 is a major paradigm shift in image quality metrics, because it reduces the benefit of image processing techniques and enhances the importance of the raw image and its signal-to-noise ratio.
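
To make the "discontinuities" mentioned in the abstract concrete, the following is a minimal sketch of GIQE 4 in its commonly cited form, with the two coefficient sets that switch at RER = 0.9. The function name `giqe4` and its parameter names are hypothetical, the coefficients are quoted from the widely reproduced form of the equation rather than from this catalogue record, and GSD is assumed to be in inches; check them against the primary source before relying on the numbers.

```python
import math


def giqe4(gsd_inches: float, rer: float, overshoot: float,
          noise_gain: float, snr: float) -> float:
    """Predicted NIIRS from GIQE 4 (commonly cited form, assumed here).

    gsd_inches : geometric-mean ground sample distance, in inches
    rer        : geometric-mean relative edge response
    overshoot  : geometric-mean edge overshoot (H)
    noise_gain : noise gain from post-processing (G)
    snr        : signal-to-noise ratio
    """
    # The coefficients switch at RER = 0.9; this switch is the source of
    # the discontinuity in the predicted rating.
    if rer >= 0.9:
        a, b = 3.32, 1.559
    else:
        a, b = 3.16, 2.817
    return (10.251
            - a * math.log10(gsd_inches)
            + b * math.log10(rer)
            - 0.656 * overshoot
            - 0.344 * noise_gain / snr)


if __name__ == "__main__":
    # Evaluate on either side of the RER = 0.9 switch to expose the jump.
    above = giqe4(10.0, 0.9, 1.0, 1.0, 50.0)
    below = giqe4(10.0, 0.8999, 1.0, 1.0, 50.0)
    print(f"RER=0.9000: {above:.3f}  RER=0.8999: {below:.3f}")
```

Evaluating the function just above and just below RER = 0.9 with all other inputs fixed gives ratings that differ by roughly a tenth of a NIIRS level, which is the kind of jump a continuous variant would be meant to remove.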

Journal Article

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization

by Das, Abhishek , Cogswell, Michael , Vedantam, Ramakrishna in Artificial neural networks , Computer vision , Decisions

2020

We propose a technique for producing ‘visual explanations’ for decisions from a large class of Convolutional Neural Network (CNN)-based models, making them more transparent and explainable. Our approach, Gradient-weighted Class Activation Mapping (Grad-CAM), uses the gradients of any target concept (say ‘dog’ in a classification network or a sequence of words in a captioning network) flowing into the final convolutional layer to produce a coarse localization map highlighting the important regions in the image for predicting the concept. Unlike previous approaches, Grad-CAM is applicable to a wide variety of CNN model-families: (1) CNNs with fully-connected layers (e.g. VGG), (2) CNNs used for structured outputs (e.g. captioning), (3) CNNs used in tasks with multi-modal inputs (e.g. visual question answering) or reinforcement learning, all without architectural changes or re-training. We combine Grad-CAM with existing fine-grained visualizations to create a high-resolution class-discriminative visualization, Guided Grad-CAM, and apply it to image classification, image captioning, and visual question answering (VQA) models, including ResNet-based architectures. In the context of image classification models, our visualizations (a) lend insights into failure modes of these models (showing that seemingly unreasonable predictions have reasonable explanations), (b) outperform previous methods on the ILSVRC-15 weakly-supervised localization task, (c) are robust to adversarial perturbations, (d) are more faithful to the underlying model, and (e) help achieve model generalization by identifying dataset bias. For image captioning and VQA, our visualizations show that even non-attention based models learn to localize discriminative regions of the input image. We devise a way to identify important neurons through Grad-CAM and combine it with neuron names (Bau et al. in Computer vision and pattern recognition, 2017) to provide textual explanations for model decisions.
Finally, we design and conduct human studies to measure if Grad-CAM explanations help users establish appropriate trust in predictions from deep networks and show that Grad-CAM helps untrained users successfully discern a ‘stronger’ deep network from a ‘weaker’ one even when both make identical predictions. Our code is available at https://github.com/ramprs/grad-cam/, along with a demo on CloudCV (Agrawal et al., in: Mobile cloud visual media computing, pp 265–290. Springer, 2015) (http://gradcam.cloudcv.org) and a video at http://youtu.be/COjUB9Izk6E.
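
The core computation described in the abstract (gradients flowing into the final convolutional layer, combined into a coarse localization map) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' released code: the function name `grad_cam` is hypothetical, and it assumes the feature maps and the gradients of the target score with respect to them have already been extracted from a network as arrays of shape (K, H, W). The final rescaling to [0, 1] is a display convenience added here.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Coarse localization map from last-conv-layer activations and gradients.

    activations : (K, H, W) feature maps of the final convolutional layer
    gradients   : (K, H, W) gradients of the target score w.r.t. those maps
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))             # shape (K,)
    # Weighted sum of the feature maps across channels.
    cam = np.tensordot(weights, activations, axes=1)  # shape (H, W)
    # ReLU keeps only regions with a positive influence on the target.
    cam = np.maximum(cam, 0.0)
    # Rescale to [0, 1] for display (an assumption of this sketch).
    peak = cam.max()
    return cam / peak if peak > 0 else cam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = rng.random((8, 7, 7))
    grads = rng.standard_normal((8, 7, 7))
    print(grad_cam(acts, grads).shape)
```

In practice the resulting map is upsampled to the input image size and overlaid as a heatmap; that step is omitted here.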

Journal Article

Uncertainty modelling in deep learning for safer neuroimage enhancement: Demonstration in diffusion MRI

by Kaden, Enrico , Grussu, Francesco , Worrall, Daniel E. in Anisotropy , Bayesian analysis , Brain - diagnostic imaging

2021

• Proposes methods for modelling different types of uncertainty that arise in deep learning (DL) applications for image enhancement problems.
• Demonstrates in dMRI super-resolution tasks that modelling uncertainty enhances the safety of DL-based enhancement systems by bringing two categories of practical benefits:
  (1) “performance improvement”: e.g., the generalisation to out-of-distribution data, robustness to noise and outliers (Section 4.3)
  (2) “reliability assessment of prediction”: e.g., certification of performance based on uncertainty-thresholding (Section 4.4.1); detection of unfamiliar structures and understanding the sources of uncertainty (Section 4.4.2).
• Provides a comprehensive set of experiments on a diverse set of datasets, which vary in demographics, scanner types, acquisition protocols or pathology.
• The methods are in theory applicable to many other imaging modalities and data enhancement applications.
• Code will be available on GitHub.

Deep learning (DL) has shown great potential in medical image enhancement problems, such as super-resolution or image synthesis. However, to date, most existing approaches are based on deterministic models, neglecting the presence of different sources of uncertainty in such problems. Here we introduce methods to characterise different components of uncertainty, and demonstrate the ideas using diffusion MRI super-resolution. Specifically, we propose to account for intrinsic uncertainty through a heteroscedastic noise model and for parameter uncertainty through approximate Bayesian inference, and integrate the two to quantify predictive uncertainty over the output image. Moreover, we introduce a method to propagate the predictive uncertainty on a multi-channelled image to derived scalar parameters, and separately quantify the effects of intrinsic and parameter uncertainty therein.
The methods are evaluated for super-resolution of two different signal representations of diffusion MR images—Diffusion Tensor images and Mean Apparent Propagator MRI—and their derived quantities such as mean diffusivity and fractional anisotropy, on multiple datasets of both healthy and pathological human brains. Results highlight three key potential benefits of modelling uncertainty for improving the safety of DL-based image enhancement systems. Firstly, modelling uncertainty improves the predictive performance even when test data departs from training data (“out-of-distribution” datasets). Secondly, the predictive uncertainty highly correlates with reconstruction errors, and is therefore capable of detecting predictive “failures”. Results on both healthy subjects and patients with brain glioma or multiple sclerosis demonstrate that such an uncertainty measure enables subject-specific and voxel-wise risk assessment of the super-resolved images that can be accounted for in subsequent analysis. Thirdly, we show that the method for decomposing predictive uncertainty into its independent sources provides high-level “explanations” for the model performance by separately quantifying how much uncertainty arises from the inherent difficulty of the task or the limited training examples. The introduced concepts of uncertainty modelling extend naturally to many other imaging modalities and data enhancement applications.
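
The split of predictive uncertainty into intrinsic and parameter components can be illustrated with the law of total variance applied to repeated stochastic forward passes. The sketch below is a generic sample-based approximation under stated assumptions, not necessarily the authors' implementation: the function name `decompose_uncertainty` is hypothetical, and it assumes that each of T passes (for example, with different sampled weights) returns a per-voxel mean and a per-voxel variance from a heteroscedastic output head.

```python
import numpy as np


def decompose_uncertainty(means, variances):
    """Split predictive variance into intrinsic and parameter parts.

    means     : (T, ...) predicted means from T stochastic forward passes
    variances : (T, ...) predicted (heteroscedastic) variances from the same passes
    Returns (intrinsic, parameter, total), each with the shape of one prediction.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    if means.shape != variances.shape:
        raise ValueError("means and variances must have the same shape")
    # Intrinsic part: average of the predicted noise variances.
    intrinsic = variances.mean(axis=0)
    # Parameter part: spread of the predicted means across the passes.
    parameter = means.var(axis=0)
    # Law of total variance: the two parts add up to the predictive variance.
    return intrinsic, parameter, intrinsic + parameter


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu = rng.normal(size=(20, 4, 4))   # 20 passes over a 4x4 image
    var = rng.random((20, 4, 4))
    intrinsic, parameter, total = decompose_uncertainty(mu, var)
    print(intrinsic.shape, parameter.shape, total.shape)
```

A voxel with high parameter variance but low intrinsic variance suggests the model has seen too few similar training examples, whereas the reverse suggests the task itself is ambiguous at that location.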

Journal Article

Explainability of Deep Vision-Based Autonomous Driving Systems: Review and Challenges

by Cord, Matthieu , Ben-Younes, Hédi , Zablocki, Éloi in Accountability , Autonomous vehicles , Cloning

2022

This survey reviews explainability methods for vision-based self-driving systems trained with behavior cloning. The concept of explainability has several facets, and the need for explainability is strong in driving, a safety-critical application. Gathering contributions from several research fields, namely computer vision, deep learning, autonomous driving, and explainable AI (X-AI), this survey tackles several points. First, it discusses definitions, context, and motivation for gaining more interpretability and explainability from self-driving systems, as well as the challenges that are specific to this application. Second, methods providing explanations to a black-box self-driving system in a post-hoc fashion are comprehensively organized and detailed. Third, approaches from the literature that aim at building more interpretable self-driving systems by design are presented and discussed in detail. Finally, remaining open challenges and potential future research directions are identified and examined.

Journal Article

The importance of interpretability and visualization in machine learning for applications in medicine and health care

by Vellido, Alfredo in Artificial Intelligence , Computational Biology/Bioinformatics , Computational Science and Engineering

2020

In a short period of time, many areas of science have made a sharp transition towards data-dependent methods. In some cases, this process has been enabled by simultaneous advances in data acquisition and the development of networked system technologies. This new situation is particularly clear in the life sciences, where data overabundance has sparked a flurry of new methodologies for data management and analysis. This can be seen as a perfect scenario for the use of machine learning and computational intelligence techniques to address problems in which more traditional data analysis approaches might struggle. But this scenario also poses some serious challenges. One of them is model interpretability and explainability, especially for complex nonlinear models. In some areas such as medicine and health care, not addressing this challenge might seriously limit the chances of adoption, in real practice, of computer-based systems that rely on machine learning and computational intelligence methods for data analysis. In this paper, we reflect on recent investigations about the interpretability and explainability of machine learning methods and discuss their impact on medicine and health care. We pay specific attention to one of the ways in which interpretability and explainability in this context can be addressed, which is through data and model visualization. We argue that, beyond improving model interpretability as a goal in itself, we need to integrate medical experts in the design of data analysis interpretation strategies. Otherwise, machine learning is unlikely to become a part of routine clinical and health care practice.

Journal Article

A Survey on Explainable Artificial Intelligence (XAI) Techniques for Visualizing Deep Learning Models in Medical Imaging

by Amiruzzaman, Md , Neha, Fnu , Bhati, Deepshikha in Artificial intelligence , Decision making , Deep learning

2024

The combination of medical imaging and deep learning has significantly improved diagnostic and prognostic capabilities in the healthcare domain. Nevertheless, the inherent complexity of deep learning models poses challenges in understanding their decision-making processes. Interpretability and visualization techniques have emerged as crucial tools to unravel the black-box nature of these models, providing insights into their inner workings and enhancing trust in their predictions. This survey paper comprehensively examines various interpretation and visualization techniques applied to deep learning models in medical imaging. The paper reviews methodologies, discusses their applications, and evaluates their effectiveness in enhancing the interpretability, reliability, and clinical relevance of deep learning models in medical image analysis.

Journal Article

A Comprehensive Review of Explainable Artificial Intelligence (XAI) in Computer Vision

by Cai, Lingfeng , Li, Yule , Cheng, Zhihan in Algorithms , Artificial Intelligence , Comparative analysis

2025

Explainable Artificial Intelligence (XAI) is increasingly important in computer vision, aiming to connect complex model outputs with human understanding. This review provides a focused comparative analysis of representative XAI methods in four main categories, attribution-based, activation-based, perturbation-based, and transformer-based approaches, selected from a broader literature landscape. Attribution-based methods like Grad-CAM highlight key input regions using gradients and feature activation. Activation-based methods analyze the responses of internal neurons or feature maps to identify which parts of the input activate specific layers or units, helping to reveal hierarchical feature representations. Perturbation-based techniques, such as RISE, assess feature importance through input modifications without accessing internal model details. Transformer-based methods, which use self-attention, offer global interpretability by tracing information flow across layers. We evaluate these methods using metrics such as faithfulness, localization accuracy, efficiency, and overlap with medical annotations. We also propose a hierarchical taxonomy to classify these methods, reflecting the diversity of XAI techniques. Results show that RISE has the highest faithfulness but is computationally expensive, limiting its use in real-time scenarios. Transformer-based methods perform well in medical imaging, with high IoU scores, though interpreting attention maps requires care. These findings emphasize the need for context-aware evaluation and hybrid XAI methods balancing interpretability and efficiency. The review ends by discussing ethical and practical challenges, stressing the need for standard benchmarks and domain-specific tuning.

Journal Article
