Asset Details
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression
by Zhang, Zhisong; Dou, Zhicheng; Mao, Kelong; Deng, Chenlong; Huang, Xinting; Yu, Dong; Li, Shuaiyi
in Attention / Context / Large language models
2024
Paper
Overview
In this work, we provide a thorough investigation of gist-based context compression methods to improve long-context processing in large language models. We focus on two key questions: (1) How well can these methods replace full attention models? and (2) What potential failure patterns arise due to compression? Through extensive experiments, we show that while gist-based compression can achieve near-lossless performance on tasks like retrieval-augmented generation and long-document QA, it faces challenges in tasks like synthetic recall. Furthermore, we identify three key failure patterns: lost by the boundary, lost if surprise, and lost along the way. To mitigate these issues, we propose two effective strategies: fine-grained autoencoding, which enhances the reconstruction of original token information, and segment-wise token importance estimation, which adjusts optimization based on token dependencies. Our work offers valuable insights into gist token-based context compression and practical strategies for improving compression capabilities.
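
To make the compression idea in the overview concrete, below is a minimal sketch of the attention-masking scheme that gist token methods typically rely on: later tokens may attend to a segment's gist tokens but not to its raw tokens. This is not the authors' code; the segment layout (each segment followed by its gist tokens), the function name gist_attention_mask, and the toy sizes are illustrative assumptions, written for PyTorch.

import torch

def gist_attention_mask(seq_len: int, segment_len: int, num_gist: int) -> torch.Tensor:
    """Boolean attention mask (True = may attend) for a sequence laid out
    as [seg0 tokens | seg0 gists | seg1 tokens | seg1 gists | ...].

    Causal attention everywhere, but positions after a segment's block
    may attend only to that segment's gist tokens, never its raw tokens,
    so the gists must carry a compressed summary of the segment.
    """
    block = segment_len + num_gist
    mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    for start in range(0, seq_len, block):
        raw_end = min(start + segment_len, seq_len)   # raw tokens of this block
        blk_end = min(start + block, seq_len)         # raw tokens + gist tokens
        mask[blk_end:, start:raw_end] = False         # hide raw tokens downstream
    return mask

# Tiny check: 2 segments of 4 tokens, 2 gist tokens each (12 positions).
# Position 7 sits in the second segment: it can reach the first segment's
# gists (positions 4-5) but not its raw tokens (positions 0-3).
mask = gist_attention_mask(seq_len=12, segment_len=4, num_gist=2)
print(mask[7, :6].tolist())   # -> [False, False, False, False, True, True]

Training under such a mask forces each segment's information to flow through its few gist positions, which is the compression bottleneck the paper studies.
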
Publisher
Cornell University Library, arXiv.org
Subject
Attention / Context / Large language models