Catalogue Search | MBRL
35 result(s) for "interpretable reasoning"
IMF: Interpretable Multi-Hop Forecasting on Temporal Knowledge Graphs
2023
Temporal knowledge graphs (KGs) have recently attracted increasing attention. The temporal KG forecasting task, which plays a crucial role in such applications as event prediction, predicts future links based on historical facts. However, current studies pay scant attention to the following two aspects. First, the interpretability of current models is manifested in providing reasoning paths, which is an essential property of path-based models. However, the comparison of reasoning paths in these models is operated in a black-box fashion. Moreover, contemporary models utilize separate networks to evaluate paths at different hops. Although the network for each hop has the same architecture, each network achieves different parameters for better performance. Different parameters cause identical semantics to have different scores, so models cannot measure identical semantics at different hops equally. Inspired by the observation that reasoning based on multi-hop paths is akin to answering questions step by step, this paper designs an Interpretable Multi-Hop Reasoning (IMR) framework based on consistent basic models for temporal KG forecasting. IMR transforms reasoning based on path searching into stepwise question answering. In addition, IMR develops three indicators according to the characteristics of temporal KGs and reasoning paths: the question matching degree, answer completion level, and path confidence. IMR can uniformly integrate paths of different hops according to the same criteria; IMR can provide the reasoning paths similarly to other interpretable models and further explain the basis for path comparison. We instantiate the framework based on common embedding models such as TransE, RotatE, and ComplEx. While being more explainable, these instantiated models achieve state-of-the-art performance against previous models on four baseline datasets.
Journal Article
Temporal Knowledge Graph Reasoning: Completion with Semantic–Structural Fusion and Forecasting with an Interpretable Dual Decoder
2026
Temporal knowledge graphs (TKGs) effectively represent dynamic facts by incorporating a temporal dimension, yet they frequently encounter data incompleteness issues that constrain downstream applications. Concurrently, TKG prediction tasks, which enable reasoning about future events, have garnered significant attention. Existing TKG completion methods often neglect semantic information, underexploit event information from subsequent timestamps, and fail to leverage the structural symmetries inherent in temporal data. To address these limitations, this paper proposes a synergistic approach comprising two models: SiSe for completion and DL-CompGCN for prediction. SiSe integrates semantic and structural embeddings by employing entity text descriptions as semantic signals, utilizing symmetric cross-attention for bidirectional feature fusion and leveraging bidirectional gated recurrent units to capture symmetric temporal influences from both past and future events. On ICEWS14, ICEWS05-15, and GDELT completion datasets, the MRR improves by 1.2, 1.4, and 0.8 percentage points, respectively. DL-CompGCN addresses the accuracy–interpretability trade-off in prediction tasks through a time-aware graph convolutional encoder and a dual-decoder framework that combines bilinear scoring with first-order logical rules to generate interpretable paths while preserving the symmetric properties of temporal relations. It achieves state-of-the-art performance on ICEWS14, ICEWS05-15, and ICEWS18 prediction datasets. The proposed models explicitly incorporate symmetric principles in their architectural design; SiSe employs symmetric bidirectional temporal modeling, while DL-CompGCN maintains symmetry in its graph propagation and rule inference mechanisms. 
The experimental results demonstrate that both models significantly outperform baseline methods, offering a comprehensive solution for temporal knowledge graph reasoning that respects and exploits the symmetric structures inherent in temporal data.
Journal Article
Explainable neural computation via stack neural module networks
by Saenko, Kate; Andreas, Jacob; Darrell, Trevor
in Artificial intelligence, Information processing, International conferences
2021
In complex inferential tasks like question answering, machine learning models must confront two challenges: the need to implement a compositional reasoning process, and, in many applications, the need for this reasoning process to be interpretable to assist users in both development and prediction. Existing models designed to produce interpretable traces of their decision‐making process typically require these traces to be supervised at training time. In this paper, we present a novel neural modular approach that performs compositional reasoning by automatically inducing a desired subtask decomposition without relying on strong supervision. Our model allows linking different reasoning tasks through shared modules that handle common routines across tasks. Experiments show that the model is more interpretable to human evaluators compared to other state‐of‐the‐art models: users can better understand the model's underlying reasoning procedure and predict when it will succeed or fail based on observing its intermediate outputs. We present a novel neural modular approach that performs compositional reasoning by automatically inducing a desired subtask decomposition without relying on strong supervision, being fully differentiable and more interpretable to human evaluators.
Journal Article
Explainability with Association Rule Learning for Weather Forecast
by Kamsu-Foguem, Bernard; Coulibaly, Lassana; Tangara, Fana
in Algorithms, Associations, Atmospheric models
2021
The reliability of the weather forecast models is a complex issue since it depends on numerous parameters and the technical infrastructure which supports them. In doing so, there is a need for advanced works oriented towards a better understanding of these models and the analysis of main associated parameters. Our approach is to study the applicability of the extracted association rules to provide a clearer understanding of atmospheric exchanges. In this work, the proposed methodology is based on the discovery of the interesting interpretable relationships between measured meteorological parameters at the Atmospheric Research Center of Lannemezan (South-West of France). In the preprocessing step, the proposed method is considered to be effectively flexible to account for data uncertainties, unlike the majority of classical evaluation methods mainly directed towards the reduction of variables and data redundancy. In postprocessing, the advantage of our approach is that the extracted rules are a metamodeling of interpretable useful knowledge for the clarity and conciseness of its representation. Moreover, in the processing, the interpretability in data sciences is recent and still in its infancy. The generated association rules with their statistical and semantic interpretations have globally highlighted the possibilities of explicit analysis of meteorological parameters. This study showed that among the generated relevant rules, three parameters (temperature, humidity, wind speed) have a high frequency in the antecedents of the rules and that the only consequence is rain. This is useful for the identification of potential improvements and gaps in the existing models of atmospheric observations, in particular, to understand the related parameterizations to the productivity of the rain phenomenon.
Journal Article
Counterfactual explanations and how to find them: literature review and benchmarking
2024
Interpretable machine learning aims at unveiling the reasons behind predictions returned by uninterpretable classifiers. One of the most valuable types of explanation consists of counterfactuals. A counterfactual explanation reveals what should have been different in an instance to observe a diverse outcome. For instance, a bank customer asks for a loan that is rejected. The counterfactual explanation consists of what should have been different for the customer in order to have the loan accepted. Recently, there has been an explosion of proposals for counterfactual explainers. The aim of this work is to survey the most recent explainers returning counterfactual explanations. We categorize explainers based on the approach adopted to return the counterfactuals, and we label them according to characteristics of the method and properties of the counterfactuals returned. In addition, we visually compare the explanations, and we report quantitative benchmarking assessing minimality, actionability, stability, diversity, discriminative power, and running time. The results make evident that the current state of the art does not provide a counterfactual explainer able to guarantee all these properties simultaneously.
Journal Article
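Note on the record above: the loan scenario in the abstract can be made concrete with a small sketch. The scoring rule, feature names, threshold, and candidate changes below are all hypothetical and are not taken from the surveyed explainers; the brute-force search only illustrates the idea of looking for a minimal change that flips a rejection.

```python
from itertools import product


def approve(applicant):
    """Toy loan rule (hypothetical): approve when a linear score reaches 30."""
    score = (0.5 * applicant["income_k"]
             - 0.8 * applicant["debt_k"]
             + 2.0 * applicant["years_employed"])
    return score >= 30.0


def find_counterfactual(applicant, steps):
    """Search all combinations of candidate deltas for an approved variant.

    `steps` maps each feature to a list of allowed deltas (include 0 for
    "leave unchanged"). Candidates are ranked first by how many features
    change, then by the total absolute change (a crude cost that mixes units).
    """
    features = list(steps)
    best_key, best_candidate = None, None
    for deltas in product(*(steps[f] for f in features)):
        candidate = dict(applicant)
        for feature, delta in zip(features, deltas):
            candidate[feature] += delta
        if not approve(candidate):
            continue
        key = (sum(1 for d in deltas if d != 0), sum(abs(d) for d in deltas))
        if best_key is None or key < best_key:
            best_key, best_candidate = key, candidate
    return best_candidate


applicant = {"income_k": 40, "debt_k": 10, "years_employed": 2}
steps = {
    "income_k": [0, 10, 20, 30],   # possible income increases
    "debt_k": [0, -5, -10],        # possible debt reductions
    "years_employed": [0],         # treated as not actionable here
}
counterfactual = find_counterfactual(applicant, steps)
print(counterfactual)
```

Ranking by the number of changed features before the size of the change reflects the minimality property mentioned in the abstract; a real explainer would also weigh actionability and plausibility.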
XGBoost-SHAP-based interpretable diagnostic framework for alzheimer’s disease
by Bai, Wenlin; Yi, Fuliang; Qin, Yao
in Accuracy, Algorithms, Alzheimer Disease - diagnostic imaging
2023
Background
Due to the class imbalance issue faced when Alzheimer’s disease (AD) develops from normal cognition (NC) to mild cognitive impairment (MCI), present clinical practice is met with challenges regarding the auxiliary diagnosis of AD using machine learning (ML). This leads to low diagnosis performance. We aimed to construct an interpretable framework, extreme gradient boosting-Shapley additive explanations (XGBoost-SHAP), to handle the imbalance among different AD progression statuses at the algorithmic level. We also sought to achieve multiclassification of NC, MCI, and AD.
Methods
We obtained patient data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database, including clinical information, neuropsychological test results, neuroimaging-derived biomarkers, and APOE-ε4 gene statuses. First, three feature selection algorithms were applied, and they were then included in the XGBoost algorithm. Due to the imbalance among the three classes, we changed the sample weight distribution to achieve multiclassification of NC, MCI, and AD. Then, the SHAP method was linked to XGBoost to form an interpretable framework. This framework utilized attribution ideas that quantified the impacts of model predictions into numerical values and analysed them based on their directions and sizes. Subsequently, the top 10 features (optimal subset) were used to simplify the clinical decision-making process, and their performance was compared with that of a random forest (RF), Bagging, AdaBoost, and a naive Bayes (NB) classifier. Finally, the National Alzheimer’s Coordinating Center (NACC) dataset was employed to assess the impact path consistency of the features within the optimal subset.
Results
Compared to the RF, Bagging, AdaBoost, NB and XGBoost (unweighted), the interpretable framework had higher classification performance with accuracy improvements of 0.74%, 0.74%, 1.46%, 13.18%, and 0.83%, respectively. The framework achieved high sensitivity (81.21%/74.85%), specificity (92.18%/89.86%), accuracy (87.57%/80.52%), area under the receiver operating characteristic curve (AUC) (0.91/0.88), positive clinical utility index (0.71/0.56), and negative clinical utility index (0.75/0.68) on the ADNI and NACC datasets, respectively. In the ADNI dataset, the top 10 features were found to have varying associations with the risk of AD onset based on their SHAP values. Specifically, the higher SHAP values of CDRSB, ADAS13, ADAS11, ventricle volume, ADASQ4, and FAQ were associated with higher risks of AD onset. Conversely, the higher SHAP values of LDELTOTAL, mPACCdigit, RAVLT_immediate, and MMSE were associated with lower risks of AD onset. Similar results were found for the NACC dataset.
Conclusions
The proposed interpretable framework contributes to achieving excellent performance in imbalanced AD multiclassification tasks and provides scientific guidance (optimal subset) for clinical decision-making, thereby facilitating disease management and offering new research ideas for optimizing AD prevention and treatment programs.
Journal Article
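Note on the record above: the abstract says the sample weight distribution was changed to handle the imbalance among NC, MCI, and AD, but does not give the weighting scheme. The sketch below assumes one common choice, inverse class frequency (weight = n_samples / (n_classes * class_count)), and uses made-up labels; it is not the authors' method.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Return one weight per sample, inversely proportional to class frequency.

    Assumed scheme: weight = n_samples / (n_classes * class_count), so that
    every class contributes the same total weight during training.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return [n_samples / (n_classes * counts[label]) for label in labels]


# Hypothetical, deliberately imbalanced label set.
labels = ["NC"] * 2 + ["MCI"] * 6 + ["AD"] * 2
weights = balanced_sample_weights(labels)

# Total weight per class; each class should sum to n_samples / n_classes.
totals = {}
for label, weight in zip(labels, weights):
    totals[label] = totals.get(label, 0.0) + weight
print(totals)
```

Weights of this kind are typically passed to a classifier's training call as per-sample weights, so that minority classes are not drowned out by the majority class.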
Learning classification models of cognitive conditions from subtle behaviors in the digital Clock Drawing Test
by Penney, Dana L.; Souillard-Mandar, William; Davis, Randall
in Artificial Intelligence, Classification, Clocks
2016
The Clock Drawing Test—a simple pencil and paper test—has been used for more than 50 years as a screening tool to differentiate normal individuals from those with cognitive impairment, and has proven useful in helping to diagnose cognitive dysfunction associated with neurological disorders such as Alzheimer’s disease, Parkinson’s disease, and other dementias and conditions. We have been administering the test using a digitizing ballpoint pen that reports its position with considerable spatial and temporal precision, making available far more detailed data about the subject’s performance. Using pen stroke data from these drawings categorized by our software, we designed and computed a large collection of features, then explored the tradeoffs in performance and interpretability in classifiers built using a number of different subsets of these features and a variety of different machine learning techniques. We used traditional machine learning methods to build prediction models that achieve high accuracy. We operationalized widely used manual scoring systems so that we could use them as benchmarks for our models. We worked with clinicians to define guidelines for model interpretability, and constructed sparse linear models and rule lists designed to be as easy to use as scoring systems currently used by clinicians, but more accurate. While our models will require additional testing for validation, they offer the possibility of substantial improvement in detecting cognitive impairment earlier than currently possible, a development with considerable potential impact in practice.
Journal Article
Nomological Deductive Reasoning for Trustworthy, Human-Readable, and Actionable AI Outputs
by Hakizimana, Gedeon; Ledezma Espino, Agapito
in Artificial intelligence, Cognition & reasoning, Decision making
2025
The lack of transparency in many AI systems continues to hinder their adoption in critical domains such as healthcare, finance, and autonomous systems. While recent explainable AI (XAI) methods—particularly those leveraging large language models—have enhanced output readability, they often lack traceable and verifiable reasoning that is aligned with domain-specific logic. This paper presents Nomological Deductive Reasoning (NDR), supported by Nomological Deductive Knowledge Representation (NDKR), as a framework aimed at improving the transparency and auditability of AI decisions through the integration of formal logic and structured domain knowledge. NDR enables the generation of causal, rule-based explanations by validating statistical predictions against symbolic domain constraints. The framework is evaluated on a credit-risk classification task using the Statlog (German Credit Data) dataset, demonstrating that NDR can produce coherent and interpretable explanations consistent with expert-defined logic. While primarily focused on technical integration and deductive validation, the approach lays a foundation for more transparent and norm-compliant AI systems. This work contributes to the growing formalization of XAI by aligning statistical inference with symbolic reasoning, offering a pathway toward more interpretable and verifiable AI decision-making processes.
Journal Article
MP: motion program synthesis with machine learning interpretability and knowledge graph analogy
The advancement of physics-based engines has led to the popularity of virtual reality. To achieve a more realistic and immersive user experience, the behaviours of objects in virtual scenes are expected to conform to real-world physical laws accurately. This increases the workload and development time for developers. To facilitate development on physics-based engines, this paper proposes MP that is a motion program synthesis approach based on machine learning and analogical reasoning. MP follows the paradigm of test-driven development, where programs are generated to fit test cases of motions subject to multiple environmental factors such as gravity and airflows. To reduce the search space of code generation, regression models are used to find variables that cause significant influences to motions, while analogical reasoning on knowledge graphs is used to find operators that work for the found variables. Besides, constraint solving is used to probabilistically estimate the values of constants in motion programs. Experimental results have demonstrated that MP is efficient in various motion program generation tasks, with random forest regressors achieving low data and time requirements.
Journal Article
Interpretation drift in explainable AI under label noise
by Raikovskaia, Alisa; Pianykh, Oleg S.; Rakhimzhanov, Nurlan
in 639/705/1042, 639/705/1046, Accuracy
2026
The comprehensibility and human interpretation of classification models are crucial in many applications, such as decision support systems and knowledge discovery, where explanations drive action. However, the presence of class label noise, widespread in real-life data, can significantly impact the performance and interpretability of data models. This study addresses the problem of interpretability robustness by examining the impact of class label noise on rule-learning models – the models extensively used for discovering transparent, human-readable interpretations of hidden data patterns and decision logic. Our empirical results demonstrate that while model performance may remain stable under increasing label noise, the consistency of explainable model rules suffers significantly. As a result, we uncover a novel and critical phenomenon – interpretation drift – where model explanations change substantially under label noise, even when predictive performance remains stable. This phenomenon can directly impact AI-informed decisions, but is not detectable through conventional performance metrics and therefore poses significant risks in real-world applications reliant on AI explanations. Our findings emphasize the need for standardized, interpretability-aware robustness metrics in the development of trustworthy explainable AI.
Journal Article