Search Results

4 results for "Turuta, Oleksii"
Knowledge-Injected Transformer (KIT): A Modular Encoder–Decoder Architecture for Efficient Knowledge Integration and Reliable Question Answering
Decoder-only language models (LMs) store factual knowledge directly in their parameters, resulting in large model sizes, costly retraining when facts change, and limited controllability in knowledge-intensive information systems. These models frequently mix stored knowledge with user-provided context, which leads to hallucinations and reduces reliability. To address these limitations, we propose KIT (Knowledge-Injected Transformer), a modular encoder–decoder architecture that separates syntactic competence from factual knowledge representation. In KIT, the decoder is pre-trained on knowledge-agnostic narrative corpora to learn language structure, while the encoder is trained independently to compress structured facts into compact latent representations. During joint training, the decoder learns to decompress these representations and generate accurate, fact-grounded responses. The modular design provides three key benefits: (1) factual knowledge can be updated by retraining only the encoder, without modifying decoder weights; (2) strict domain boundaries can be enforced: the modular design provides a structural foundation for reducing knowledge-source confusion and hallucinations, though its effectiveness remains to be validated on standard hallucination benchmarks; and (3) interpretability is improved, because each generated token can be traced back to encoder activations. A real-world experimental evaluation demonstrates that KIT achieves competitive answer accuracy while offering superior controllability and substantially lower update costs compared to decoder-only baselines. These results indicate that modular encoder–decoder architectures represent a promising and reliable alternative for explainable, adaptable, and domain-specific question answering in modern information systems.
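The separation the abstract describes can be illustrated with a toy sketch (not the authors' implementation): facts live only in the output of an "encoder", while a fixed "decoder" turns an injected latent into an answer, so updating knowledge re-runs only the encoder. All names and the refusal behavior below are illustrative assumptions; a real encoder would emit dense vectors rather than strings.

```python
# Toy sketch of KIT-style modularity: factual updates touch only the
# encoder; the decoder is byte-for-byte unchanged.

def encode_facts(facts):
    """'Encoder': compress (subject, relation, object) facts into a latent table.

    Here the latent is simply the object string; a trained encoder would
    produce compact dense representations instead.
    """
    return {(s, r): obj for (s, r, obj) in facts}

def decode(latents, subject, relation):
    """'Decoder': generate an answer grounded only in the injected latents."""
    latent = latents.get((subject, relation))
    if latent is None:
        return "I don't know."  # refuse rather than hallucinate
    return f"The {relation} of {subject} is {latent}."

# Initial knowledge.
latents = encode_facts([("Paris", "country", "France")])

# Fact update: re-encode only -- decode() stays fixed.
latents = encode_facts([
    ("Paris", "country", "France"),
    ("Kyiv", "country", "Ukraine"),
])
```

The point of the sketch is the update path: changing facts never modifies `decode`, mirroring benefit (1) above.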
Interpretable Conversation Routing via the Latent Embeddings Approach
Large language models (LLMs) are rapidly being deployed in question-answering and support systems to automate customer experience across all domains, including medical use cases. Models in such environments must handle multiple problem types, such as general-knowledge questions, queries to external sources, and function calling, among others. Some cases may not require full text generation at all; they may instead need different prompts or even different models. All of this can be managed by a routing step. This paper focuses on interpretable few-shot approaches to conversation routing, such as latent embeddings retrieval. The work presents a benchmark, a thorough analysis, and a set of visualizations of how latent embeddings routing behaves for long-context conversations in a multilingual, domain-specific environment. The results show that the latent embeddings router achieves performance on par with LLM-based routers while offering additional interpretability and a higher level of control over model decision-making.
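Latent-embeddings routing of the kind the abstract studies can be sketched as nearest-exemplar classification: each route keeps a few example utterances with precomputed embeddings, and an incoming query is assigned the route of its most similar exemplar. The sketch below is illustrative only; the toy 3-dimensional vectors, route names, and threshold are assumptions standing in for real sentence-embedding outputs.

```python
# Minimal few-shot router over latent embeddings: cosine similarity to a
# small labeled exemplar index, with a fallback when nothing is close.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# (route label, exemplar embedding) pairs -- the few-shot index.
EXEMPLARS = [
    ("general_qa",      [0.9, 0.1, 0.0]),
    ("general_qa",      [0.8, 0.2, 0.1]),
    ("external_lookup", [0.1, 0.9, 0.2]),
    ("function_call",   [0.0, 0.2, 0.9]),
]

def route(query_embedding, threshold=0.5):
    """Return (route, score); fall back when no exemplar is similar enough.

    Returning the winning exemplar's score is what makes this router
    interpretable: you can inspect which example won and by what margin.
    """
    label, score = max(
        ((lbl, cosine(query_embedding, emb)) for lbl, emb in EXEMPLARS),
        key=lambda pair: pair[1],
    )
    return (label, score) if score >= threshold else ("fallback", score)
```

In practice the exemplar vectors would come from a multilingual sentence-embedding model, and the threshold would be tuned on held-out conversations.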
Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning
Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed impressive progress on both of these problems, giving rise to a new family of approaches. In particular, the advances in deep learning over the past several years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural network-based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast-growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG to their full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability, and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also covers the seminal applications of these NNLG models, such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions.
pymovements: A Python Package for Eye Movement Data Processing
We introduce pymovements: a Python package for analyzing eye-tracking data that follows best practices in software development, including rigorous testing and adherence to coding standards. The package provides functionality for key processes along the entire preprocessing pipeline. This includes parsing eye-tracker data files, transforming positional data into velocity data, detecting gaze events like saccades and fixations, computing event properties like saccade amplitude and fixational dispersion, and visualizing data and results with several types of plotting methods. Moreover, pymovements provides an easily accessible interface for downloading and processing publicly available datasets. Additionally, we emphasize how rigorous testing in scientific software packages is critical to the reproducibility and transparency of research, enabling other researchers to verify and build upon previous findings.
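Two of the pipeline steps named above, transforming positional data into velocities and detecting fixations, can be illustrated generically. The sketch below is not the pymovements API; it uses plain forward differences and a simple velocity-threshold (I-VT) rule, and the sampling rate, threshold, and minimum-duration values are illustrative assumptions.

```python
# Generic illustration of an eye-movement preprocessing pipeline:
# positions -> velocities -> fixation events (velocity-threshold / I-VT).

def positions_to_velocities(xs, ys, sampling_rate=1000.0):
    """Approximate per-sample velocity magnitude (units/s) by forward differences."""
    vels = []
    for i in range(len(xs) - 1):
        dx = (xs[i + 1] - xs[i]) * sampling_rate
        dy = (ys[i + 1] - ys[i]) * sampling_rate
        vels.append((dx * dx + dy * dy) ** 0.5)
    return vels

def detect_fixations(velocities, threshold=20.0, min_samples=3):
    """I-VT: consecutive runs of sub-threshold velocity become fixation events."""
    fixations, start = [], None
    for i, v in enumerate(velocities):
        if v < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_samples:
                fixations.append((start, i - 1))
            start = None
    if start is not None and len(velocities) - start >= min_samples:
        fixations.append((start, len(velocities) - 1))
    return fixations

# Toy trace: a fixation, a one-sample saccade, then another fixation.
xs = [0.0, 0.001, 0.002, 0.003, 1.0, 1.001, 1.002, 1.003]
ys = [0.0] * len(xs)
events = detect_fixations(positions_to_velocities(xs, ys))
```

The real package additionally handles file parsing, saccade detection, event properties, plotting, and dataset downloads, as described in the abstract.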