Catalogue Search | MBRL

Efficient Memory-Enhanced Transformer for Long-Document Summarization in Low-Resource Regimes

by Marfia, Gustavo , Moro, Gianluca , Frisoni, Giacomo in abstractive summarization , Algorithms , Analysis

2023

Long document summarization poses obstacles to current generative transformer-based models because of the broad context to process and understand. Indeed, detecting long-range dependencies is still challenging for today’s state-of-the-art solutions, usually requiring model expansion at the cost of an unsustainable demand for computing and memory capacities. This paper introduces Emma, a novel efficient memory-enhanced transformer-based architecture. By segmenting a lengthy input into multiple text fragments, our model stores and compares the current chunk with previous ones, gaining the capability to read and comprehend the entire context over the whole document with a fixed amount of GPU memory. This method enables the model to deal with theoretically infinitely long documents, using less than 18 and 13 GB of memory for training and inference, respectively. We conducted extensive performance analyses and demonstrate that Emma achieved competitive results on two datasets of different domains while consuming significantly less GPU memory than competitors do, even in low-resource settings.

Journal Article

Abstractive vs. Extractive Summarization: An Experimental Review

by Giarelis, Nikolaos , Mastrokostas, Charalampos , Karacapilidis, Nikos in abstractive summarization , Algorithms , Computational linguistics

2023

Text summarization is a subtask of natural language processing referring to the automatic creation of a concise and fluent summary that captures the main ideas and topics from one or multiple documents. Earlier literature surveys focus on extractive approaches, which rank the top-n most important sentences in the input document and then combine them to form a summary. As argued in the literature, the summaries of these approaches do not have the same lexical flow or coherence as summaries that are manually produced by humans. Newer surveys elaborate on abstractive approaches, which generate a summary with potentially new phrases and sentences compared to the input document. Generally speaking, contrary to the extractive approaches, the abstractive ones create summaries that are more similar to those produced by humans. However, these approaches still lack the contextual representation needed to form fluent summaries. Recent advancements in deep learning and pretrained language models led to the improvement of many natural language processing tasks, including abstractive summarization. Overall, these surveys do not present a comprehensive evaluation framework that assesses the aforementioned approaches.
Taking the above into account, the contribution of this survey is fourfold: (i) we provide a comprehensive survey of the state-of-the-art approaches in text summarization; (ii) we conduct a comparative evaluation of these approaches, using well-known datasets from the related literature, as well as popular evaluation scores such as ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-LSUM, BLEU-1, BLEU-2 and SACREBLEU; (iii) we report on insights gained on various aspects of the text summarization process, including existing approaches, datasets and evaluation methods, and we outline a set of open issues and future research directions; (iv) we upload the datasets and the code used in our experiments in a public repository, aiming to increase the reproducibility of this work and facilitate future research in the field.
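
The ROUGE-N scores named in this abstract can be illustrated with a minimal sketch. It assumes plain whitespace tokenization and lowercasing, with no stemming or stopword handling, so its numbers will not match the official ROUGE toolkit or the survey's reported results. The function names `ngrams` and `rouge_n` are illustrative and not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: clipped n-gram overlap between two strings.

    Assumption: whitespace tokenization and lowercasing only.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    generated = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n(generated, gold, n=1))
    print(rouge_n(generated, gold, n=2))
```

ROUGE-1 counts shared unigrams and ROUGE-2 counts shared bigrams, so the second score is more sensitive to word order than the first.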

Journal Article

Automatic Text Summarization Methods: A Comprehensive Review

by Sharma, Deepak , Sharma, Grishma in Audiences , Computer Imaging , Computer Science

2023

Text summarization is the process of condensing a long text into a shorter version while preserving its key information and meaning. Automatic text summarization saves time and helps select the important and relevant sentences from a document. In extractive summarization techniques, sentences are picked directly from the source document, whereas in abstractive summarization techniques, new sentences and phrases are generated from the original source document. The majority of research has focused on extractive techniques, but in recent years most research has shifted towards abstractive and hybrid text summarization methods. This paper presents a comprehensive survey of work on automatic text summarization methods. It studies extractive and abstractive techniques in detail and compares them on different aspects. The paper also discusses research gaps and challenges that can motivate researchers and help them identify potential areas for research in this field.

Journal Article

Multi-language transfer learning for low-resource legal case summarization

by Moro, Gianluca , Italiani, Paolo , Ragazzi, Luca in Artificial intelligence , Automatic summarization , Case reports

2024

Analyzing and evaluating legal case reports are labor-intensive tasks for judges and lawyers, who usually base their decisions on report abstracts, legal principles, and commonsense reasoning. Thus, summarizing legal documents is time-consuming and requires excellent human expertise. Moreover, public legal corpora of specific languages are almost unavailable. This paper proposes a transfer learning approach with extractive and abstractive techniques to cope with the lack of labeled legal summarization datasets, namely a low-resource scenario. In particular, we conducted extensive multi- and cross-language experiments. The proposed work outperforms the state-of-the-art results of extractive summarization on the Australian Legal Case Reports dataset and sets a new baseline for abstractive summarization. Finally, syntactic and semantic metrics assessments have been carried out to evaluate the accuracy and the factual consistency of the machine-generated legal summaries.

Journal Article

Context aware hierarchical attention for abstractive dialogue summarization

by Wang, Ye , Qi, Yichen , Yang, Niya in 639/705/117 , 639/705/258 , Abstractive Summarization

2025

Abstractive dialogue summarization has gained increasing attention due to its ability to generate concise and informative summaries from complex conversational data. In social dialogues, phenomena like ellipsis and topic shifts frequently occur, making it essential to account for the rich contextual information embedded at multiple levels. Traditional transformer-based models often fail to fully exploit this multi-level context. To address this limitation, we propose a novel Hierarchical Context-aware Attention (HCAtt) network. Our model incorporates both segment-level and utterance-level contextual information into the transformer framework, enhancing the model’s ability to capture the intricate dependencies in dialogue data. Specifically, we hierarchically integrate these levels during the calculation of query and key transformations, which improves the modeling of contextual relationships across token representations. Experimental results on the benchmark SAMSum, DialogSum and AMI datasets demonstrate that HCAtt outperforms existing methods, highlighting its effectiveness in handling the complexities of dialogue summarization.

Journal Article

PEGASUS-XL with saliency-guided scoring and long-input encoding for multi-document abstractive summarization

by Alfadhli, Latifah , Sagheer, Alaa , Alsultan, Rawan in 639/705/117 , 639/705/258 , Abstractive summarization

2025

With the exponential growth of digital content, Multi-Document Summarization (MDS) has become increasingly critical for synthesizing dispersed information into coherent and contextually relevant summaries. This paper presents PEGASUS-XL, an enhanced abstractive summarization framework that addresses key challenges in MDS, including salient content selection, redundancy reduction, factual consistency, and input length limitations. PEGASUS-XL is developed through a structured enhancement pipeline that integrates lexical-semantic saliency modeling with long-input encoding. It employs a hybrid scoring mechanism that combines TF-IDF and SBERT representations, modulated by a document-aware adaptive weighting scheme to dynamically balance lexical and semantic importance. To promote diversity and reduce redundancy, Maximal Marginal Relevance (MMR) is applied during content selection. To overcome the 1024-token limitation of standard Transformer models, Longformer is incorporated to enable efficient sparse attention over extended contexts. The vanilla PEGASUS model serves as the decoder and is fine-tuned on saliency-ranked, Longformer-encoded inputs to generate abstractive summaries. Extensive experiments on the Multi-News and XSum datasets demonstrate that PEGASUS-XL consistently outperforms strong baselines, including BART and PRIMERA, across multiple evaluation metrics (ROUGE, METEOR, BERTScore, and SBERT similarity). Ablation studies quantify the contribution of each component, and detailed error analysis identifies remaining issues such as factual drift and residual redundancy. Human evaluations further confirm that PEGASUS-XL produces summaries that are more coherent, informative, and faithful. Efficiency profiling shows that the framework achieves substantial quality gains without incurring disproportionate computational costs.
Together, these contributions position PEGASUS-XL as a robust, scalable, and extensible solution for high-quality abstractive summarization in real-world multi-document scenarios.
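
The Maximal Marginal Relevance (MMR) step mentioned in this abstract can be sketched as a greedy selection loop. This is a generic illustration, not the PEGASUS-XL implementation: the relevance scores and the pairwise similarity matrix are placeholder inputs (the paper derives its saliency from TF-IDF and SBERT, which is not reproduced here), and the function name `mmr_select` and the trade-off parameter `lam` are assumptions.

```python
def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedy Maximal Marginal Relevance selection.

    relevance:  list of saliency scores, one per candidate sentence
                (placeholder values; any scoring scheme could supply them).
    similarity: square matrix, similarity[i][j] between candidates i and j.
    k:          number of candidates to select.
    lam:        trade-off; 1.0 means pure relevance, lower values
                penalize redundancy with already selected items more.
    Returns the indices of the selected candidates in selection order.
    """
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:

        def mmr_score(i):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    # Sentences 0 and 1 are near-duplicates; sentence 2 is less relevant but novel.
    relevance = [0.9, 0.85, 0.5]
    similarity = [
        [1.0, 0.95, 0.1],
        [0.95, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    print(mmr_select(relevance, similarity, k=2, lam=0.5))
```

With `lam=0.5` the near-duplicate sentence is skipped in favour of the novel one, which is the redundancy-reduction behaviour the abstract attributes to MMR.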

Journal Article

A survey on the dataset, techniques, and evaluation metric used for abstractive text summarization

by Aggarwal, Gaurav , Sharma, Shivani , Rai, Bipin Kumar in Datasets , Deep learning , Language

2024

When too much information is available, summarizing it becomes desirable. Creating summaries by hand takes a lot of time. Automating the summarization process makes the task easier and reduces the time needed to create a summary; this is called automatic summarization. The two approaches are extractive summarization and abstractive summarization. Extractive summarization and its applications have been the subject of extensive research and have state-of-the-art solutions. Abstractive summarization, however, is still a developing field, since it is difficult to create abstractive summaries as humans do. How to evaluate the quality of a summary also remains an open question. This paper therefore presents a comprehensive survey of the datasets used, with their details and statistics, an analysis of various abstractive summarization techniques, and the important parameters for evaluating summary quality. Deep learning based models have given a new direction to this field. The authors also focus on the problems and challenges faced in summary generation, which open up future research in this domain.

Journal Article

T5-Based Model for Abstractive Summarization: A Semi-Supervised Learning Approach with Consistency Loss Functions

by Du, Yao , Wang, Mingye , Hu, Xiaohui in abstractive summarization , Analysis , automatic text summarization

2023

Text summarization is a prominent task in natural language processing (NLP) that condenses lengthy texts into concise summaries. Despite the success of existing supervised models, they often rely on datasets of well-constructed text pairs, which can be insufficient for languages with limited annotated data, such as Chinese. To address this issue, we propose a semi-supervised learning method for text summarization. Our method is inspired by the cycle-consistent adversarial network (CycleGAN) and considers text summarization as a style transfer task. The model is trained by using a similar procedure and loss function to those of CycleGAN and learns to transfer the style of a document to its summary and vice versa. Our method can be applied to multiple languages, but this paper focuses on its performance on Chinese documents. We trained a T5-based model and evaluated it on two datasets, CSL and LCSTS, and the results demonstrate the effectiveness of the proposed method.

Journal Article

Legal lay summarization: exploring methods and data generation with large language models

by Moro, Gianluca , Ragazzi, Luca , Magnani, Leonardo David Matteo in Artificial Intelligence , Automatic summarization , Computer Science

2025

This paper explores advancements in Natural Language Processing (NLP) for legal lay summarization by systematically analyzing existing methodologies, datasets, and research findings. We review current literature, highlighting key challenges such as data scarcity and the complexity of legal language. A primary contribution of this study is the development of LegalEase, a specialized dataset designed to improve model training for summarizing legal documents in layman’s terms. Our findings demonstrate that subdomain-specific datasets within the legal domain outperform general legal datasets in enhancing NLP model performance for generating accurate and comprehensible legal summaries. The insights and methodologies presented provide a foundation for future research in legal lay summarization.

Journal Article

Brain-model neural similarity reveals abstractive summarization performance

by Zhou, Wenqing , Luo, Yingying , Zhu, Yingqi in 631/378/2649/1594 , 639/166/985 , 639/705/117

2025

Deep language models (DLMs) have exhibited remarkable language understanding and generation capabilities, prompting researchers to explore the similarities between their internal mechanisms and human language cognitive processing. This study investigated the representational similarity (RS) between the abstractive summarization (ABS) models and the human brain and its correlation with the performance of ABS tasks. Specifically, representational similarity analysis (RSA) was used to measure the similarity between the representational patterns (RPs) of the BART, PEGASUS, and T5 models’ hidden layers and the human brain’s language RPs under different spatiotemporal conditions. Layer-wise ablation manipulation, including attention ablation and noise addition, was employed to examine the hidden layers’ effect on model performance. The results demonstrate that as the depth of hidden layers increases, the models’ text encoding becomes increasingly similar to the human brain’s language RPs. Manipulating deeper layers leads to a more substantial decline in summarization performance compared to shallower layers, highlighting the crucial role of deeper layers in integrating essential information. Notably, the study confirms the hypothesis that the hidden layers exhibiting higher similarity to human brain activity play a more critical role in model performance, with their correlations reaching statistical significance even after controlling for perplexity. These findings deepen our understanding of the cognitive mechanisms underlying language representations in DLMs and their neural correlates, potentially providing insights for optimizing and improving language models by aligning them with the human brain’s language-processing mechanisms.
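
Representational similarity analysis, as named in this abstract, is commonly carried out by building a dissimilarity matrix per system and then correlating the two matrices. The sketch below follows that common recipe under stated assumptions: dissimilarity is 1 minus the Pearson correlation between stimulus patterns, and the two matrices are compared with a Spearman rank correlation. The abstract does not specify which distance or correlation the study used, and all function names and toy data are illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def rdm(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - Pearson per stimulus pair."""
    return [1 - pearson(a, b) for a, b in combinations(patterns, 2)]


def rsa_score(patterns_a, patterns_b):
    """Spearman correlation between the two systems' dissimilarity structures."""
    return pearson(ranks(rdm(patterns_a)), ranks(rdm(patterns_b)))


if __name__ == "__main__":
    # Toy data: one activation pattern per stimulus, for two hypothetical systems.
    layer = [[1, 2, 3], [1, 2, 4], [3, 1, 0], [0, 5, 1]]
    brain = [[0.2, 0.9, 0.4], [0.1, 0.8, 0.9], [0.7, 0.3, 0.1], [0.5, 0.6, 0.2]]
    print(rsa_score(layer, brain))
```

Because only the geometry of pairwise dissimilarities is compared, the two systems may have different numbers of units per pattern, which is what makes the method applicable to model layers and brain recordings alike.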

Journal Article
