Extreme Multi-Label Text Classification for Less-Represented Languages and Low-Resource Environments: Advances and Lessons Learned
by Koloski, Boshko; Lavrač, Nada; Purver, Matthew; Škrlj, Blaž; Pollak, Senja; Ivačič, Nikola
in Analysis / Artificial intelligence / Classification / Computational linguistics / Datasets / Labels / Language processing / Languages / less-represented languages / low-resource environments / media monitoring / Methods / multi-label text classification / multilingual text classification / Multilingualism / Natural language interfaces / Natural language processing / Retrieval / Text categorization / Text processing
2025
Journal Article
Overview
Amid ongoing efforts to develop extremely large, multimodal models, there is increasing interest in efficient Small Language Models (SLMs) that can operate without reliance on large data-centre infrastructure. However, recent SLMs (e.g., LLaMA or Phi) with up to three billion parameters are predominantly trained on high-resource languages, such as English, which limits their applicability to industries that require robust NLP solutions for less-represented languages and low-resource settings, particularly those requiring low latency and adaptability to evolving label spaces. This paper examines a retrieval-based approach to multi-label text classification (MLC) for a media monitoring dataset, with a particular focus on less-represented languages, such as Slovene. This dataset presents an extreme MLC challenge, with instances labelled using up to twelve thousand categories. The proposed method, which combines retrieval with computationally efficient prediction, effectively addresses challenges related to multilinguality, resource constraints, and frequent label changes. We adopt a model-agnostic approach that does not rely on a specific model architecture or language selection. Our results demonstrate that techniques from the extreme multi-label text classification (XMC) domain outperform traditional Transformer-based encoder models, particularly in handling dynamic label spaces without requiring continuous fine-tuning. Additionally, we highlight the effectiveness of this approach in scenarios involving rare labels, where baseline models struggle with generalisation.
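The general idea of retrieval-based multi-label classification mentioned in the overview can be illustrated with a small sketch. This is not the authors' method or code: the class name `RetrievalLabeler`, the toy bag-of-words `embed` function, and the parameters `k` and `threshold` are all assumptions made for illustration. The sketch retrieves the most similar labelled examples and lets them vote on labels, so a new label can be introduced by indexing one more example rather than fine-tuning a model.

```python
import math
from collections import Counter, defaultdict


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for any sentence encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class RetrievalLabeler:
    """Hypothetical k-nearest-neighbour label voter over an in-memory index."""

    def __init__(self, k=3, threshold=0.5):
        self.k = k                  # number of neighbours to retrieve
        self.threshold = threshold  # minimum share of similarity mass per label
        self.index = []             # list of (vector, set of labels)

    def add(self, text, labels):
        # Adding an example is the only "training" step; no model is updated.
        self.index.append((embed(text), set(labels)))

    def predict(self, text):
        query = embed(text)
        scored = sorted(
            ((cosine(query, vec), labels) for vec, labels in self.index),
            key=lambda pair: pair[0],
            reverse=True,
        )[: self.k]
        total = sum(sim for sim, _ in scored)
        if total == 0:
            return set()
        # Each neighbour votes for its labels, weighted by its similarity.
        votes = defaultdict(float)
        for sim, labels in scored:
            for label in labels:
                votes[label] += sim
        return {label for label, v in votes.items() if v / total >= self.threshold}


labeler = RetrievalLabeler(k=2, threshold=0.5)
labeler.add("bank raises interest rates", ["finance"])
labeler.add("central bank interest decision", ["finance", "policy"])
labeler.add("football team wins cup final", ["sport"])
finance_prediction = labeler.predict("bank interest rates rise")

# A previously unseen label becomes predictable after indexing one example.
labeler.add("tennis open final result", ["tennis"])
tennis_prediction = labeler.predict("tennis final result")
```

In a realistic setting the linear scan would be replaced by an approximate nearest-neighbour index and the toy vectors by a multilingual encoder; both choices are outside what this record states.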
Publisher
MDPI AG