Asset Details
Improving Text Classification with Large Language Model-Based Data Augmentation
by Zhao, Huanhuan; Ruggles, Thomas A.; Singh, Debjani; Yoon, Hong-Jun; Chen, Haihua; Feng, Yunhe
in Artificial intelligence / Chatbots / ChatGPT / Classification / Computational linguistics / Data augmentation / Datasets / Effectiveness / Electronic health records / Hydroelectric power / imbalanced data / Language / Language processing / large language model / Large language models / Licenses / Licensing / Machine learning / MATHEMATICS AND COMPUTING / Natural language / Natural language interfaces / natural language processing / Performance evaluation / Text categorization / text classification
2024
Journal Article
Overview
Large Language Models (LLMs) such as ChatGPT possess advanced capabilities in understanding and generating text. These capabilities enable ChatGPT to create text based on specific instructions, which can serve as augmented data for text classification tasks. Previous studies have approached data augmentation (DA) by either rewriting the existing dataset with ChatGPT or generating entirely new data from scratch. However, it is unclear which method is better without comparing their effectiveness. This study investigates the application of both methods to two datasets: a general-topic dataset (Reuters news data) and a domain-specific dataset (Mitigation dataset). Our findings indicate that: 1. ChatGPT-generated new data consistently enhanced the model's classification results for both datasets. 2. Generating new data generally outperforms rewriting existing data, though crafting the prompts carefully is crucial to extract the most valuable information from ChatGPT, particularly for domain-specific data. 3. The augmentation data size affects the effectiveness of DA; however, we observed a plateau after incorporating 10 samples. 4. Combining rewritten samples with newly generated samples can potentially further improve the model's performance.
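The two augmentation strategies the abstract compares, rewriting existing samples versus generating new ones from scratch, can be sketched as prompt builders. This is a minimal illustration, not the authors' implementation: the prompt wording, the `augment` helper, and the dataset shape are all assumptions; an actual LLM call (e.g. via a chat-completion API) would consume these prompts.

```python
def rewrite_prompt(text: str, label: str) -> str:
    # Strategy 1 (rewriting): ask the LLM to paraphrase an existing
    # labeled sample while preserving its topic and label.
    return (
        f"Rewrite the following '{label}' document, "
        f"preserving its meaning and topic:\n\n{text}"
    )

def generate_prompt(label: str, n: int = 1) -> str:
    # Strategy 2 (generating): ask the LLM to write entirely new
    # samples for a given class, with no seed document.
    return (
        f"Write {n} short news article(s) about the topic '{label}'. "
        "Return one article per line."
    )

def augment(dataset, minority_label, mode="generate"):
    # Build augmentation prompts for a minority class in an
    # imbalanced dataset; `dataset` is a list of (text, label) pairs.
    if mode == "rewrite":
        return [rewrite_prompt(t, l) for t, l in dataset if l == minority_label]
    return [generate_prompt(minority_label)]
```

The paper's fourth finding, combining both kinds of samples, would correspond to concatenating the outputs of both modes before retraining the classifier.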