Catalogue Search | MBRL
Explore the vast range of titles available.
99 result(s) for "feature attribution"
SHAP-Based Interpretable Object Detection Method for Satellite Imagery
2022
There is a growing need for algorithms that automatically detect objects in satellite images. Deep-learning object detectors have brought a significant improvement in detection performance; however, the features these models rely on for inference are difficult to interpret. This is a practical problem when analyzing earth-observation images, which are often used as evidence for public decision-making, and it also makes it difficult to set explicit policies or criteria for improving the models. To address these challenges, we introduce a feature attribution method that defines an approximate model and calculates the attribution of input features to the output of a deep-learning model. For object detection models applied to satellite images with complex textures, we propose a method to visualize the basis of inference using pixel-wise feature attribution. Furthermore, we propose new methods for model evaluation, regularization, and data selection based on feature attribution. Experimental results demonstrate the feasibility of the proposed methods for basis visualization and model evaluation. They also show that the model trained with the proposed regularization method avoids over-fitting and achieves higher performance, and that the proposed data selection method allows new training data to be selected efficiently.
Journal Article
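The record above describes pixel-wise feature attribution for object detectors. A minimal occlusion-style sketch of that idea follows, assuming a generic `model_score` callable that returns a scalar detection confidence for an image; the paper's approximate-model (SHAP-based) formulation is more involved than this stand-in.

```python
import numpy as np

def occlusion_attribution(model_score, image, patch=8, baseline=0.0):
    """Pixel-wise attribution by occluding patches and measuring the score drop.

    model_score: hypothetical callable mapping an HxWxC image to a scalar
    detection score (e.g. the confidence of one detected object).
    """
    h, w = image.shape[:2]
    attribution = np.zeros((h, w), dtype=np.float32)
    reference = model_score(image)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = baseline
            # Score drop caused by hiding this patch is spread over its pixels.
            attribution[y:y + patch, x:x + patch] = reference - model_score(occluded)
    return attribution
```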
Evaluating feature attribution methods in the image domain
by Gevaert, Arne; Saeys, Yvan; Valkenborg, Dirk
in Artificial Intelligence, Computer Science, Control
2024
Feature attribution maps are a popular approach to highlight the most important pixels in an image for a given prediction of a model. Despite a recent growth in popularity and available methods, the objective evaluation of such attribution maps remains an open problem. Building on previous work in this domain, we investigate existing quality metrics and propose new variants of metrics for the evaluation of attribution maps. We confirm a recent finding that different quality metrics seem to measure different underlying properties of attribution maps, and extend this finding to a larger selection of attribution methods, quality metrics, and datasets. We also find that metric results on one dataset do not necessarily generalize to other datasets, and methods with desirable theoretical properties do not necessarily outperform computationally cheaper alternatives in practice. Based on these findings, we propose a general benchmarking approach to help guide the selection of attribution methods for a given use case. Implementations of attribution metrics and our experiments are available online (https://github.com/arnegevaert/benchmark-general-imaging).
Journal Article
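One widely used family of quality metrics in this line of work can be sketched as a deletion curve: mask the most-attributed pixels first and measure how quickly the model's score falls. The sketch below is a generic variant under that assumption, with a hypothetical `predict` callable, and is not the benchmark's exact implementation.

```python
import numpy as np

def deletion_auc(predict, image, attribution, steps=20, baseline=0.0):
    """Deletion-style metric: remove pixels in decreasing attribution order and
    integrate the resulting score curve. A lower area under the curve suggests
    the map ranked genuinely important pixels highly.

    predict: hypothetical callable mapping an HxWxC image to the scalar score
    of the explained class.
    """
    h, w = attribution.shape
    order = np.argsort(attribution.ravel())[::-1]          # most important first
    scores = [predict(image)]
    masked = image.copy()
    chunk = max(1, len(order) // steps)
    for i in range(0, len(order), chunk):
        ys, xs = np.unravel_index(order[i:i + chunk], (h, w))
        masked[ys, xs] = baseline
        scores.append(predict(masked))
    return np.trapz(scores, dx=1.0 / (len(scores) - 1))
```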
PAUSE: principled feature attribution for unsupervised gene expression analysis
by Lee, Ting-I; Janizek, Joseph D.; Spiro, Anna
in Alzheimer's disease, Animal Genetics and Genomics, architecture
2023
As interest in using unsupervised deep learning models to analyze gene expression data has grown, an increasing number of methods have been developed to make these models more interpretable. These methods can be separated into two groups: post hoc analyses of black box models through feature attribution methods and approaches to build inherently interpretable models through biologically-constrained architectures. We argue that these approaches are not mutually exclusive, but can in fact be usefully combined. We propose PAUSE (https://github.com/suinleelab/PAUSE), an unsupervised pathway attribution method that identifies major sources of transcriptomic variation when combined with biologically-constrained neural network models.
Journal Article
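As a rough illustration of attributing latent pathway activity back to genes, the sketch below computes an integrated-gradients-style attribution for one latent node of an encoder. The `encoder`, `pathway_idx`, and step count are illustrative assumptions; PAUSE's principled formulation differs in detail and lives in the linked repository.

```python
import torch

def pathway_attribution(encoder, x, pathway_idx, n_steps=50):
    """Integrated-gradients-style attribution of input genes to one latent
    pathway node of an encoder. x: (batch, genes) float tensor.
    """
    baseline = torch.zeros_like(x)
    total = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, n_steps):
        # Interpolate between an all-zero baseline and the observed expression.
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)
        z = encoder(point)[:, pathway_idx].sum()
        total += torch.autograd.grad(z, point)[0]
    return (x - baseline) * total / n_steps   # gene-level attributions
```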
Pseudo datasets estimate feature attribution in artificial neural networks
2025
Neural networks demonstrate exceptional predictive performance across diverse classification tasks, but their lack of interpretability restricts their widespread application. Consequently, in recent years many researchers have focused on model explanation techniques to elucidate the internal mechanisms of these ‘black box’ models. Prevailing explanation methods, however, predominantly focus on individual features, overlooking synergistic effects and interactions among multiple features, which can hinder a comprehensive understanding of the model’s predictive behavior. This study therefore proposes a two-stage explanation method, the Pseudo Datasets Perturbation Effect (PDPE). The fundamental idea is to discern feature importance by perturbing the data and observing the influence on prediction outcomes. For structured data, the method identifies potential feature interactions while evaluating the relative significance of individual features and their interaction terms. In computer simulation studies where neural networks approximate the linear association of logistic regression, PDPE provides faster and more accurate explanations than the widely recognized SHAP value method. PDPE helps users understand the significance of individual features and their interactions for model predictions. An analysis of real-life data from the National Institute of Diabetes and Digestive and Kidney Diseases likewise shows the superior performance of the new approach.
Journal Article
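A perturb-and-observe scheme like the one described above can be sketched as permutation effects for single features and feature pairs. The code below is a generic stand-in under that assumption (hypothetical `predict` function, permutation-based perturbation), not the authors' PDPE estimator.

```python
import numpy as np
from itertools import combinations

def perturbation_effects(predict, X, rng=None):
    """Single-feature and pairwise-interaction importance via column
    permutation. predict: hypothetical callable mapping X to predictions.
    """
    rng = rng or np.random.default_rng(0)
    base = predict(X)
    n_features = X.shape[1]

    def permuted(cols):
        Xp = X.copy()
        for c in cols:
            Xp[:, c] = rng.permutation(Xp[:, c])
        return np.mean(np.abs(predict(Xp) - base))

    single = {j: permuted([j]) for j in range(n_features)}
    # Interaction effect: joint perturbation beyond the sum of single effects.
    pairs = {(i, j): permuted([i, j]) - single[i] - single[j]
             for i, j in combinations(range(n_features), 2)}
    return single, pairs
```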
TF-LIME : Interpretation Method for Time-Series Models Based on Time–Frequency Features
2025
With the widespread application of machine learning techniques in time series analysis, the interpretability of models trained on time series data has attracted increasing attention. Most existing explanation methods are based on time-domain features, making it difficult to reveal how complex models use time–frequency information. To address this, the paper proposes a time–frequency domain interpretation method for time series aimed at enhancing model interpretability in the time–frequency domain. The method extends the traditional LIME algorithm by combining the short-time Fourier transform (STFT), the inverse STFT, and local interpretable model-agnostic explanations (LIME), and introduces a self-designed time–frequency homogeneous segmentation (TFHS) algorithm. TFHS achieves precise homogeneous segmentation of the time–frequency matrix through peak detection and clustering analysis, incorporating the distribution characteristics of the signal in both the frequency and time dimensions. Experiments verified the effectiveness of the TFHS algorithm on Synthetic Dataset 1 and of the TF-LIME algorithm on Synthetic Dataset 2, and further evaluated interpretability on the MIT-BIH dataset. The results demonstrate that the proposed method significantly improves the interpretability of time-series models in the time–frequency domain, exhibiting strong generalization capability and promising application prospects.
Journal Article
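The TF-LIME idea of perturbing time–frequency segments and fitting a local linear surrogate can be sketched with SciPy's STFT/iSTFT and a ridge surrogate. The sketch below uses a fixed rectangular grid in place of the paper's TFHS segmentation and a hypothetical `predict` callable, so it is a simplification of the method, not a reproduction.

```python
import numpy as np
from scipy.signal import stft, istft
from sklearn.linear_model import Ridge

def tf_lime(predict, signal, fs=250, nperseg=64, n_samples=200, grid=(4, 4)):
    """LIME over time-frequency segments: zero out random subsets of STFT
    blocks, reconstruct with the inverse STFT, query the model, and fit a
    linear surrogate whose coefficients score each segment.
    """
    _, _, Z = stft(signal, fs=fs, nperseg=nperseg)
    fbins = np.array_split(np.arange(Z.shape[0]), grid[0])
    tbins = np.array_split(np.arange(Z.shape[1]), grid[1])
    segments = [(f, t) for f in fbins for t in tbins]

    rng = np.random.default_rng(0)
    masks = rng.integers(0, 2, size=(n_samples, len(segments)))
    outputs = []
    for m in masks:
        Zp = Z.copy()
        for keep, (f, t) in zip(m, segments):
            if not keep:
                Zp[np.ix_(f, t)] = 0.0            # silence this block
        _, rec = istft(Zp, fs=fs, nperseg=nperseg)
        outputs.append(predict(rec[:len(signal)]))
    surrogate = Ridge(alpha=1.0).fit(masks, outputs)
    return surrogate.coef_   # importance of each time-frequency segment
```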
TSInsight: A Local-Global Attribution Framework for Interpretability in Time Series Data
by Dengel, Andreas; Siddiqui, Shoaib Ahmed; Mercier, Dominique
in auto-encoder, Datasets, Deep learning
2021
With the rise in the employment of deep learning methods in safety-critical scenarios, interpretability is more essential than ever before. Although many different directions regarding interpretability have been explored for visual modalities, time series data has been neglected, with only a handful of methods tested due to their poor intelligibility. We approach the problem of interpretability in a novel way by proposing TSInsight, where we attach an auto-encoder to the classifier with a sparsity-inducing norm on its output and fine-tune it based on the gradients from the classifier and a reconstruction penalty. TSInsight learns to preserve features that are important for prediction by the classifier and suppresses those that are irrelevant, i.e., serves as a feature attribution method to boost the interpretability. In contrast to most other attribution frameworks, TSInsight is capable of generating both instance-based and model-based explanations. We evaluated TSInsight along with nine other commonly used attribution methods on eight different time series datasets to validate its efficacy. The evaluation results show that TSInsight naturally achieves output space contraction; therefore, it is an effective tool for the interpretability of deep time series models.
Journal Article
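The attach-an-auto-encoder recipe above can be sketched as a single fine-tuning step: feed the auto-encoder's output to a frozen classifier and penalize its output norm and reconstruction error. The loss weights and module names below are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def tsinsight_step(autoencoder, classifier, x, y, optimizer,
                   sparsity_weight=1e-3, recon_weight=1e-2):
    """One fine-tuning step in the spirit of TSInsight. The optimizer should
    hold only the auto-encoder's parameters so the classifier stays frozen.
    """
    optimizer.zero_grad()
    x_hat = autoencoder(x)
    loss = (F.cross_entropy(classifier(x_hat), y)          # classifier gradients
            + sparsity_weight * x_hat.abs().mean()          # sparsity-inducing norm
            + recon_weight * F.mse_loss(x_hat, x))          # reconstruction penalty
    loss.backward()
    optimizer.step()
    return x_hat.detach()   # preserved vs. suppressed features act as attributions
```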
An Ensemble of Global and Local-Attention Based Convolutional Neural Networks for COVID-19 Diagnosis on Chest X-ray Images
by Afifi, Ahmed; Hafsa, Noor E; Alhumam, Abdulaziz
in Algorithms, Artificial intelligence, Artificial neural networks
2021
The recent Coronavirus Disease 2019 (COVID-19) pandemic has put a tremendous burden on global health systems. Medical practitioners are under great pressure to reliably screen suspected cases using adjunct diagnostic tools alongside standard point-of-care testing. Chest X-rays (CXRs) are emerging as a prospective diagnostic tool that is easy to acquire, low cost, and carries less cross-contamination risk. Artificial intelligence (AI)-based CXR evaluation has shown great potential for distinguishing COVID-19-induced pneumonia from other clinical conditions. However, one challenge with diagnostic imaging-based modeling is incorrect feature attribution, which leads the model to learn misleading disease patterns and make wrong predictions. Here, we demonstrate an effective deep learning-based methodology that mitigates this problem and allows the classification algorithm to learn from relevant features. The proposed framework consists of an ensemble of convolutional neural network (CNN) models capturing both global and local pathological features from CXR lung images, with the local features extracted using a multi-instance learning scheme and a local attention mechanism. A series of backbone CNN models using global features, local features, and an ensemble of both was trained on high-quality CXR images from 1311 patients, augmented to balance the class distribution, to localize lung pathological features and classify COVID-19 and other related pneumonia; a DenseNet161 architecture outperformed all other models on an independent test set of 159 patients with confirmed cases. Specifically, an ensemble of DenseNet161 models with global and local attention-based features achieved an average balanced accuracy of 91.2%, average precision of 92.4%, and F1-score of 91.9% in a multi-label classification framework comprising COVID-19, pneumonia, and control classes. In a comprehensive statistical analysis, the DenseNet161 ensembles were also significantly different from all other models. The study demonstrates that the proposed algorithm can accurately identify COVID-19-related pneumonia in CXR images and differentiate non-COVID-19-associated pneumonia with high specificity by effectively alleviating the incorrect feature attribution problem and exploiting an enhanced feature descriptor.
Journal Article
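The local branch described above combines patch-level features through multi-instance learning with local attention. The module below is a generic attention-pooling sketch of that component, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LocalAttentionPool(nn.Module):
    """Attention-based pooling over patch embeddings, a common way to combine
    local features in multi-instance learning."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(),
                                   nn.Linear(hidden, 1))

    def forward(self, patch_feats):                  # (batch, n_patches, dim)
        weights = torch.softmax(self.score(patch_feats), dim=1)
        return (weights * patch_feats).sum(dim=1)    # (batch, dim) bag feature
```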
Label-Only Membership Inference Attack Based on Model Explanation
2024
It is well known that machine learning models (e.g., image recognition models) can unintentionally leak information about their training set. Conventional membership inference relies on posterior vectors, and the task becomes extremely difficult when the posterior is masked. Current label-only membership inference attacks, in turn, require a large number of queries to generate adversarial samples, so incorrect inferences waste many queries. We therefore introduce a label-only membership inference attack based on model explanations. It transforms a label-only attack into a traditional membership inference attack by observing neighborhood consistency, and performs fine-grained membership inference for vulnerable samples. We use feature attribution to simplify high-dimensional neighborhood sampling, quickly identify decision boundaries, and recover posterior vectors. By finding vulnerable samples, the method also compares the privacy risks faced by different samples. The attack is validated on the CIFAR-10, CIFAR-100, and MNIST datasets. The results show that membership can be identified even with a simple sampling method, and that vulnerable samples expose the model to greater privacy risk.
Journal Article
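The neighborhood-consistency signal described above can be sketched as follows: perturb a sample mostly along its most-attributed features and count how often the predicted label is unchanged. The `predict_label` callable, noise scale, and top-feature fraction below are illustrative assumptions, not the paper's attack pipeline.

```python
import numpy as np

def neighborhood_consistency(predict_label, x, attribution, n=50, eps=0.1,
                             rng=None):
    """Fraction of perturbed neighbors that keep the original predicted label.
    x: 1-D float feature vector; attribution: per-feature attribution scores.
    Higher consistency suggests the sample is far from a decision boundary,
    which the paper links to membership.
    """
    rng = rng or np.random.default_rng(0)
    top = np.argsort(np.abs(attribution))[::-1][:max(1, len(x) // 4)]
    base = predict_label(x)
    same = 0
    for _ in range(n):
        noise = np.zeros_like(x)
        noise[top] = rng.normal(0.0, eps, size=len(top))   # perturb key features
        same += int(predict_label(x + noise) == base)
    return same / n
```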
Game-Theoretic Explainable AI for Ensemble-Boosting Models in Early Malware Prediction for Computer Systems
by Sakhamuri, Mallikharjuna Rao; Rathnayake, Upaka; Henna, Shagufta
in AnchorTabular, Artificial Intelligence, Computational Intelligence
2025
Malware continues to pose a critical threat to computing systems, with modern techniques often bypassing traditional signature-based defenses. Ensemble-boosting classifiers, including GBC, XGBoost, AdaBoost, LightGBM, and CatBoost, have shown strong predictive performance for malware detection, yet their “black-box” nature limits transparency, interpretability, and trust, all of which are essential for deployment in high-stakes cybersecurity environments. This paper proposes a unified explainable AI (XAI) framework to address these challenges by improving the interpretability, fairness, transparency, and efficiency of ensemble-boosting models in malware and intrusion detection tasks. The framework integrates SHAP for global feature importance and complex interaction analysis; LIME for local, instance-level explanations; and DALEX for fairness auditing across sensitive attributes, ensuring that predictions remain both equitable and meaningful across diverse user populations. We rigorously evaluate the framework on a large-scale, balanced dataset derived from Microsoft Windows Defender telemetry, covering various types of malware. Experimental results demonstrate that the unified XAI approach not only achieves high malware detection accuracy but also uncovers complex feature interactions, such as the combined effects of system configuration and security states. To establish generalization, we further validate the framework on the CICIDS-2017 intrusion detection dataset, where it successfully adapts to different network threat patterns, highlighting its robustness across distinct cybersecurity domains. Comparative experiments against state-of-the-art XAI tools, including AnchorTabular (rule-based explanations) and Fairlearn (fairness-focused analysis), reveal that the proposed framework consistently delivers deeper insights into model behavior, achieves better fairness metrics, and reduces explanation overhead. By combining global and local interpretability, fairness assurance, and computational optimizations, this unified XAI framework offers a scalable, human-understandable, and trustworthy solution for deploying ensemble-boosting models in real-world malware detection and intrusion prevention systems.
Journal Article
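Pairing a global SHAP view with local LIME explanations for a boosting classifier, as the framework above does, can be sketched with the public shap and lime APIs. The synthetic features below stand in for the Windows Defender telemetry, and the DALEX fairness audit is omitted; this is a minimal sketch, not the authors' framework.

```python
import numpy as np
import shap
from lime.lime_tabular import LimeTabularExplainer
from xgboost import XGBClassifier

# Toy stand-in data: 10 numeric features and a synthetic "malware" label.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + X[:, 3] * X[:, 7] > 0).astype(int)
feature_names = [f"f{i}" for i in range(X.shape[1])]

model = XGBClassifier(n_estimators=100, max_depth=3).fit(X, y)

# Global view: mean absolute SHAP value per feature over the dataset.
shap_values = shap.TreeExplainer(model).shap_values(X)
global_importance = np.abs(shap_values).mean(axis=0)

# Local view: LIME explanation for one flagged sample.
lime_exp = LimeTabularExplainer(X, feature_names=feature_names,
                                class_names=["benign", "malware"],
                                mode="classification")
explanation = lime_exp.explain_instance(X[0], model.predict_proba, num_features=5)
print(global_importance)
print(explanation.as_list())
```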