WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for Wikipedia Categories
by Te-Yu Chi, Yu-Meng Tang, Qiu-Xia Zhang, Jyh-Shing Roger Jang, Chia-Wen Lu
in Categories / Classification / Datasets / Encyclopedias / Labels / Text categorization / Training
2023
Paper
Overview
Our research focuses on solving the zero-shot text classification problem in NLP, with a particular emphasis on innovative self-training strategies. To achieve this objective, we propose a novel self-training strategy that uses labels rather than text for training, significantly reducing the model's training time. Specifically, we use categories from Wikipedia as our training set and leverage the SBERT pre-trained model to establish positive correlations between pairs of categories within the same text, facilitating associative training. For new test datasets, we have improved the original self-training approach, eliminating the need for prior training and testing data from each target dataset. Instead, we adopt Wikipedia as a unified training dataset to better approximate the zero-shot scenario. This modification allows for rapid fine-tuning and inference across different datasets, greatly reducing the time required for self-training. Our experimental results demonstrate that this method can adapt the model to the target dataset within minutes. Compared to other BERT-based transformer models, our approach significantly reduces the amount of training data by training only on labels, not the actual text, and greatly improves training efficiency by utilizing a unified training set. Additionally, our method achieves state-of-the-art results on both the Yahoo Topic and AG News datasets.
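The label-only training idea described above, pairing categories that appear on the same Wikipedia page, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `category_pairs`, the toy `pages` dictionary, and the choice to emit unordered pairs are assumptions made for illustration. In a real pipeline, such pairs would presumably be passed to an SBERT fine-tuning step as positive examples.

```python
from itertools import combinations


def category_pairs(page_categories):
    """Return every unordered pair of distinct categories sharing a page.

    page_categories maps a page title to its list of category labels.
    Each returned pair is a candidate positive example: two labels that
    co-occur on the same page are treated as related.
    """
    pairs = []
    for categories in page_categories.values():
        # Deduplicate and sort so each pair has a stable order.
        unique = sorted(set(categories))
        pairs.extend(combinations(unique, 2))
    return pairs


# Hypothetical toy data standing in for Wikipedia page/category listings.
pages = {
    "Alan Turing": ["Computer scientists", "Mathematicians", "Cryptographers"],
    "Python (programming language)": ["Programming languages", "Free software"],
    "Stub page": ["Uncategorised"],  # a single category yields no pairs
}

pairs = category_pairs(pages)
for left, right in pairs:
    print(f"{left} <-> {right}")
```

Because only short label strings are paired, the training set stays far smaller than one built from full article text, which is consistent with the efficiency claim in the overview.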
Publisher
Cornell University Library, arXiv.org