Catalogue Search | MBRL

Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization

by Chrysostomou, George , Williams, Miles , Zhao, Zhixue in Documents , Hallucinations , Inference

2024

Despite the remarkable performance of generative large language models (LLMs) on abstractive summarization, they face two significant challenges: their considerable size and tendency to hallucinate. Hallucinations are concerning because they erode reliability and raise safety issues. Pruning is a technique that reduces model size by removing redundant weights, enabling more efficient sparse inference. Pruned models yield downstream task performance comparable to the original, making them ideal alternatives when operating on a limited budget. However, the effect that pruning has upon hallucinations in abstractive summarization with LLMs has yet to be explored. In this paper, we provide an extensive empirical study across five summarization datasets, two state-of-the-art pruning methods, and five instruction-tuned LLMs. Surprisingly, we find that hallucinations are less prevalent from pruned LLMs than the original models. Our analysis suggests that pruned models tend to depend more on the source document for summary generation. This leads to a higher lexical overlap between the generated summary and the source document, which could be a reason for the reduction in hallucination risk.
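The abstract above attributes the lower hallucination risk to higher lexical overlap between the generated summary and the source document. As a rough illustration only, the sketch below measures the fraction of summary n-grams that also appear in the source. The function names, whitespace tokenization, and example strings are assumptions made for illustration and are not the overlap measure reported in the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lexical_overlap(summary, source, n=1):
    """Fraction of summary n-grams that also occur in the source (clipped counts)."""
    # Assumption: lowercased whitespace tokenization stands in for a real tokenizer.
    summary_counts = Counter(ngrams(summary.lower().split(), n))
    source_counts = Counter(ngrams(source.lower().split(), n))
    total = sum(summary_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, source_counts[gram])
                  for gram, count in summary_counts.items())
    return matched / total


# Hypothetical toy strings: one summary copies the source, the other does not.
source = "the cat sat on the mat"
copied = "the cat sat"
novel = "a dog ran away"

print(lexical_overlap(copied, source))
print(lexical_overlap(novel, source))
```

A summary that reuses source wording scores near 1, while one that introduces unseen wording scores near 0. High overlap does not by itself guarantee a faithful summary.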

Journal Article

Model Interpretability for Natural Language Processing Applications

by Chrysostomou, George in Natural language processing

2022

This thesis focuses on model interpretability, an area concerned with understanding model predictions in Natural Language Processing (NLP) tasks. The increase in adoption of opaque models, such as BERT, leads to an increasing need for explaining their predictions. This is typically performed by extracting a sub-set of the input, that is indicative of the true reasoning behind the model's prediction (i.e. a faithful explanation or rationale). Whilst there are multiple approaches in literature for extracting explanations (e.g. feature attribution methods), some faced criticism about how faithful they are. Furthermore, explanation faithfulness also depends on the model employed, where highly parametrised models have been shown to produce less faithful explanations. Previous research has also shown that there is no single best feature attribution method across models, tasks and even instances of the same dataset, whilst finding a rationale length is still an open problem. Additionally, a limitation of current evaluations for explanation faithfulness, is that they are performed on a held-out dataset coming from the same domain (i.e. the data they are evaluated on, are from the same distribution as the training data). However, we are not aware how faithfulness is impacted in out-of-domain settings. The main aim of this thesis therefore, is to improve and evaluate the faithfulness of explanations in NLP applications. First, we improve the faithfulness of explanations extracted using attention mechanisms, a popular component used in neural NLP models. In a similar direction, we show improvements on the faithfulness of explanations from feature attribution approaches, when using large language models. We then address the problem of specifying a priori a feature scoring method, rationale length and type. Finally, we evaluate the faithfulness of explanations in out-of-domain settings, highlighting a problem when using popular faithfulness evaluation metrics.

Dissertation

Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization

by Chrysostomou, George , Williams, Miles , Zhao, Zhixue in Empirical analysis , Large language models , Pruning

2024

Despite the remarkable performance of generative large language models (LLMs) on abstractive summarization, they face two significant challenges: their considerable size and tendency to hallucinate. Hallucinations are concerning because they erode reliability and raise safety issues. Pruning is a technique that reduces model size by removing redundant weights, enabling more efficient sparse inference. Pruned models yield downstream task performance comparable to the original, making them ideal alternatives when operating on a limited budget. However, the effect that pruning has upon hallucinations in abstractive summarization with LLMs has yet to be explored. In this paper, we provide an extensive empirical study across five summarization datasets, two state-of-the-art pruning methods, and five instruction-tuned LLMs. Surprisingly, we find that hallucinations are less prevalent from pruned LLMs than the original models. Our analysis suggests that pruned models tend to depend more on the source document for summary generation. This leads to a higher lexical overlap between the generated summary and the source document, which could be a reason for the reduction in hallucination risk.

Paper

An Empirical Study on Explanations in Out-of-Domain Settings

by Chrysostomou, George , Aletras, Nikolaos in Domains , Natural language processing , Performance prediction

2022

Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either via identifying the most important tokens in the input (i.e. post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e. select-then-predict models). Currently, these approaches are largely evaluated on in-domain settings. Yet, little is known about how post-hoc explanations and inherently faithful models perform in out-of-domain settings. In this paper, we conduct an extensive empirical study that examines: (1) the out-of-domain faithfulness of post-hoc explanations, generated by five feature attribution methods; and (2) the out-of-domain performance of two inherently faithful models over six datasets. Contrary to our expectations, results show that in many cases out-of-domain post-hoc explanation faithfulness measured by sufficiency and comprehensiveness is higher compared to in-domain. We find this misleading and suggest using a random baseline as a yardstick for evaluating post-hoc explanation faithfulness. Our findings also show that select-then-predict models demonstrate comparable predictive performance in out-of-domain settings to full-text trained models.
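The abstract above names sufficiency and comprehensiveness as faithfulness measures and recommends a random baseline as a yardstick. The sketch below shows one common erasure-based formulation of the two measures together with a random-rationale baseline. The formulas, the toy classifier, and all names here are assumptions made for illustration; the abstract does not spell out the exact definitions used in the paper.

```python
import random


def toy_predict(tokens):
    """Hypothetical classifier: probability of the positive class from keyword hits."""
    positive = {"great", "good"}
    hits = sum(token in positive for token in tokens)
    return min(1.0, 0.5 + 0.25 * hits)


def comprehensiveness(predict, tokens, rationale_idx):
    """Drop in predicted probability after removing the rationale tokens."""
    kept = [t for i, t in enumerate(tokens) if i not in rationale_idx]
    return max(0.0, predict(tokens) - predict(kept))


def sufficiency(predict, tokens, rationale_idx):
    """One minus the drop in probability when only the rationale is kept."""
    only = [t for i, t in enumerate(tokens) if i in rationale_idx]
    return 1.0 - max(0.0, predict(tokens) - predict(only))


def random_baseline(metric, predict, tokens, k, trials=200, seed=0):
    """Average metric value over randomly chosen rationales of length k."""
    rng = random.Random(seed)
    scores = [
        metric(predict, tokens, set(rng.sample(range(len(tokens)), k)))
        for _ in range(trials)
    ]
    return sum(scores) / len(scores)


tokens = "the film was great and the cast was good".split()
rationale = {3, 8}  # positions of "great" and "good"

comp = comprehensiveness(toy_predict, tokens, rationale)
suff = sufficiency(toy_predict, tokens, rationale)
comp_random = random_baseline(comprehensiveness, toy_predict, tokens, k=2)

print(comp, suff, comp_random)
```

Reading a score next to its random-baseline value, rather than in isolation, is what makes in-domain and out-of-domain numbers comparable in this sketch.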

Paper

Flexible Instance-Specific Rationalization of NLP Models

by Chrysostomou, George , Aletras, Nikolaos in Classification , Datasets , Feature extraction

2021

Recent research on model interpretability in natural language processing extensively uses feature scoring methods for identifying which parts of the input are the most important for a model to make a prediction (i.e. explanation or rationale). However, previous research has shown that there is no clear best scoring method across various text classification tasks while practitioners typically have to make several other ad-hoc choices regarding the length and the type of the rationale (e.g. short or long, contiguous or not). Inspired by this, we propose a simple yet effective and flexible method that allows selecting optimally for each data instance: (1) a feature scoring method; (2) the length; and (3) the type of the rationale. Our method is inspired by input erasure approaches to interpretability which assume that the most faithful rationale for a prediction should be the one with the highest difference between the model's output distribution using the full text and the text after removing the rationale as input respectively. Evaluation on four standard text classification datasets shows that our proposed method provides more faithful, comprehensive and highly sufficient explanations compared to using a fixed feature scoring method, rationale length and type. More importantly, we demonstrate that a practitioner is not required to make any ad-hoc choices in order to extract faithful rationales using our approach.
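The selection idea described above can be sketched as follows: for each instance, enumerate candidate rationales over scoring method, length, and type, then keep the candidate whose removal changes the model's output distribution the most. This is a minimal sketch under stated assumptions. The toy model, the two scorers, and the use of Jensen-Shannon divergence as the "difference" are hypothetical choices, not details confirmed by the abstract.

```python
import math


def toy_model(tokens):
    """Hypothetical two-class model returning [P(negative), P(positive)]."""
    positive = {"great", "good"}
    hits = sum(t in positive for t in tokens)
    p_pos = min(0.95, 0.5 + 0.2 * hits)
    return [1.0 - p_pos, p_pos]


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (natural log)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def topk(scores, k):
    """Indices of the k highest-scoring tokens (non-contiguous rationale)."""
    return set(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])


def contiguous(scores, k):
    """Indices of the length-k span with the highest total score."""
    start = max(range(len(scores) - k + 1), key=lambda s: sum(scores[s:s + k]))
    return set(range(start, start + k))


def select_rationale(model, tokens, scorers, lengths, types):
    """Return (difference, scorer, length, type, indices) of the best candidate."""
    full = model(tokens)
    best = None
    for name, scorer in scorers.items():
        scores = scorer(tokens)
        for k in lengths:
            for type_name, extract in types.items():
                idx = extract(scores, k)
                # Erase the candidate rationale and compare output distributions.
                reduced = [t for i, t in enumerate(tokens) if i not in idx]
                diff = js_divergence(full, model(reduced))
                if best is None or diff > best[0]:
                    best = (diff, name, k, type_name, sorted(idx))
    return best


# Hypothetical feature scoring methods standing in for real attribution methods.
scorers = {
    "keyword": lambda toks: [1.0 if t in {"great", "good"} else 0.0 for t in toks],
    "length": lambda toks: [float(len(t)) for t in toks],
}
types = {"topk": topk, "contiguous": contiguous}

tokens = "the film was great and the cast was good".split()
best = select_rationale(toy_model, tokens, scorers, lengths=[1, 2], types=types)
print(best)
```

Because the choice is made per instance, a different sentence could select a different scorer, length, or type without any setting being fixed in advance.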

Paper

Self-calibration for Language Model Quantization and Pruning

by Williams, Miles , Chrysostomou, George , Aletras, Nikolaos in Calibration , Pruning , Self calibration

2025

Quantization and pruning are fundamental approaches for model compression, enabling efficient inference for language models. In a post-training setting, state-of-the-art quantization and pruning methods require calibration data, a small set of unlabeled examples. Conventionally, this is randomly sampled web text, aiming to reflect the model training data. However, this poses two key problems: (1) unrepresentative calibration examples can harm model performance, and (2) organizations increasingly avoid releasing model training data. In this paper, we propose self-calibration as a solution. Our approach requires no external data, instead leveraging the model itself to generate synthetic calibration data, with a view to better approximating the pre-training data distribution. We extensively compare the performance of self-calibration with several baselines, across a variety of models, compression methods, and tasks. Our approach proves consistently competitive in maximizing downstream task performance, frequently outperforming even using real data.

Paper

Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience

by Chrysostomou, George , Aletras, Nikolaos in Downstream effects , Feature extraction , Natural language processing

2021

Pretrained transformer-based models such as BERT have demonstrated state-of-the-art predictive performance when adapted into a range of natural language processing tasks. An open problem is how to improve the faithfulness of explanations (rationales) for the predictions of these models. In this paper, we hypothesize that salient information extracted a priori from the training data can complement the task-specific information learned by the model during fine-tuning on a downstream task. In this way, we aim to help BERT not to forget assigning importance to informative input tokens when making predictions by proposing SaLoss; an auxiliary loss function for guiding the multi-head attention mechanism during training to be close to salient information extracted a priori using TextRank. Experiments for explanation faithfulness across five datasets, show that models trained with SaLoss consistently provide more faithful explanations across four different feature attribution methods compared to vanilla BERT. Using the rationales extracted from vanilla BERT and SaLoss models to train inherently faithful classifiers, we further show that the latter result in higher predictive performance in downstream tasks.

Paper

Self-calibration for Language Model Quantization and Pruning

by Williams, Miles , Chrysostomou, George , Aletras, Nikolaos in Calibration , Pruning , Self calibration

2024

Quantization and pruning are fundamental approaches for model compression, enabling efficient inference for language models. In a post-training setting, state-of-the-art quantization and pruning methods require calibration data, a small set of unlabeled examples. Conventionally, randomly sampled web text is used, aiming to reflect the model training data. However, this poses two key problems: (1) unrepresentative calibration examples can harm model performance, and (2) organizations increasingly avoid releasing model training data. In this paper, we propose self-calibration as a solution. Our approach requires no external data, instead leveraging the model itself to generate synthetic calibration data as a better approximation of the pre-training data distribution. We extensively compare the performance of self-calibration with several baselines, across a variety of models, compression methods, and tasks. Our approach proves consistently competitive in maximizing downstream task performance, frequently outperforming even using real data.

Paper

Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification

by Chrysostomou, George , Aletras, Nikolaos in Classification , Coders , Computer architecture

2021

Neural network architectures in natural language processing often use attention mechanisms to produce probability distributions over input token representations. Attention has empirically been demonstrated to improve performance in various tasks, while its weights have been extensively used as explanations for model predictions. Recent studies (Jain and Wallace, 2019; Serrano and Smith, 2019; Wiegreffe and Pinter, 2019) have shown that it cannot generally be considered as a faithful explanation (Jacovi and Goldberg, 2020) across encoders and tasks. In this paper, we seek to improve the faithfulness of attention-based explanations for text classification. We achieve this by proposing a new family of Task-Scaling (TaSc) mechanisms that learn task-specific non-contextualised information to scale the original attention weights. Evaluation tests for explanation faithfulness show that the three proposed variants of TaSc improve attention-based explanations across two attention mechanisms, five encoders and five text classification datasets without sacrificing predictive performance. Finally, we demonstrate that TaSc consistently provides more faithful attention-based explanations compared to three widely-used interpretability techniques.

Paper

Compressing Language Models for Specialized Domains

by Williams, Miles , Jeronymo, Vitor , Chrysostomou, George in Compressing , Computing costs

2026

Language models (LMs) excel at tasks across diverse domains, yet require substantial computational resources during inference. Compression techniques such as pruning and quantization offer a practical path towards efficient LM deployment, exemplified by their ability to preserve performance on general-purpose benchmarks. However, general-purpose LM compression methods can negatively affect performance in specialized domains (e.g. biomedical or legal). Recent work has sought to address this issue, but requires a computationally expensive full-parameter fine-tuning pipeline. To this end, we propose MixCal, a novel calibration method designed to improve the in-domain performance of compressed LMs in a post-training setting. Through extensive experimentation, we demonstrate that MixCal substantially outperforms existing approaches on domain-specific tasks and preserves general performance. Notably, these performance gains are achieved while also reducing the computational cost of LM compression.

Paper
