ARMADA: Attribute-Based Multimodal Data Augmentation
by Wu, Te-Lin; Zhou, Yu; Ji, Heng; Jin, Xiaomeng; Kim, Jeonghwan; Peng, Nanyun; Huang, Kuan-Hao
in Data augmentation / Image enhancement / Image manipulation / Image quality / Knowledge / Knowledge bases (artificial intelligence) / Knowledge representation / Large language models / Visual tasks
2024
Paper
Overview
In Multimodal Language Models (MLMs), the cost of manually annotating high-quality image-text pair data for fine-tuning and alignment is extremely high. While existing multimodal data augmentation frameworks propose ways to augment image-text pairs, they either suffer from semantic inconsistency between texts and images or generate unrealistic images, creating a knowledge gap with real-world examples. To address these issues, we propose Attribute-based Multimodal Data Augmentation (ARMADA), a novel multimodal data augmentation method based on knowledge-guided manipulation of the visual attributes of mentioned entities. Specifically, we extract entities and their visual attributes from the original text data, then search for alternative values for the visual attributes under the guidance of knowledge bases (KBs) and large language models (LLMs). We then use an image-editing model to edit the images according to the substituted attribute values. ARMADA is a novel multimodal data generation framework that: (i) extracts knowledge-grounded attributes from symbolic KBs for semantically consistent yet distinctive image-text pair generation, (ii) generates visually similar images of disparate categories using neighboring entities in the KB hierarchy, and (iii) uses the commonsense knowledge of LLMs to modulate auxiliary visual attributes such as backgrounds for a more robust representation of the original entities. Our empirical results on four downstream tasks demonstrate that our framework produces high-quality data and improves model performance. They also highlight the need to leverage external knowledge proxies for enhanced interpretability and real-world grounding.
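The attribute-substitution step described in the overview can be illustrated with a minimal Python sketch. It is not the authors' implementation: the knowledge base `TOY_KB`, the string-matching extractor, and the names `extract_attributes`, `augment`, and `Augmentation` are all hypothetical, and the image-editing model is replaced by a plain-text edit instruction that such a model might take as input.

```python
from dataclasses import dataclass

# Hypothetical toy knowledge base: entity -> attribute -> admissible values.
# A real system would query a symbolic KB; this dict is an assumption.
TOY_KB = {
    "linckia laevigata": {"color": ["blue", "orange", "pink"]},
}


@dataclass(frozen=True)
class Augmentation:
    caption: str           # new caption with the attribute value swapped
    edit_instruction: str  # text prompt an image-editing model could receive


def extract_attributes(caption, kb):
    """Find (entity, attribute, value) triples in the caption by string matching."""
    text = caption.lower()
    tokens = text.split()
    found = []
    for entity, attrs in kb.items():
        if entity not in text:
            continue
        for attr, values in attrs.items():
            for value in values:
                if value in tokens:
                    found.append((entity, attr, value))
    return found


def augment(caption, kb):
    """Swap each extracted attribute value for every alternative the KB allows."""
    results = []
    for entity, attr, value in extract_attributes(caption, kb):
        for alt in kb[entity][attr]:
            if alt == value:
                continue  # keep only values that differ from the original
            new_caption = caption.replace(value, alt, 1)
            instruction = f"change the {attr} of the {entity} from {value} to {alt}"
            results.append(Augmentation(new_caption, instruction))
    return results


if __name__ == "__main__":
    for aug in augment("The blue Linckia laevigata rests on a reef.", TOY_KB):
        print(aug.caption, "|", aug.edit_instruction)
```

Because the alternative values come only from the knowledge base entry for the matched entity, each new caption stays consistent with what that entity can plausibly look like, which is the property the overview attributes to KB guidance.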
Publisher
Cornell University Library, arXiv.org