Paper

Compressing Language Models for Specialized Domains

Williams, Miles; Jeronymo, Vitor; Chrysostomou, George; Aletras, Nikolaos

2026

Overview

Language models (LMs) excel at tasks across diverse domains, yet require substantial computational resources during inference. Compression techniques such as pruning and quantization offer a practical path towards efficient LM deployment, exemplified by their ability to preserve performance on general-purpose benchmarks. However, general-purpose LM compression methods can negatively affect performance in specialized domains (e.g. biomedical or legal). Recent work has sought to address this issue, but requires a computationally expensive full-parameter fine-tuning pipeline. To this end, we propose MixCal, a novel calibration method designed to improve the in-domain performance of compressed LMs in a post-training setting. Through extensive experimentation, we demonstrate that MixCal substantially outperforms existing approaches on domain-specific tasks and preserves general performance. Notably, these performance gains are achieved while also reducing the computational cost of LM compression.

Publisher

Cornell University Library, arXiv.org

Subject

Compressing / Computing costs