Asset Details
MbrlCatalogueTitleDetail
Do you wish to reserve the book?
Don't Touch My Diacritics
by
Gorman, Kyle
, Pinter, Yuval
in
Natural language processing
2024
Hey, we have placed the reservation for you!
By the way, why not check out events that you can attend while you pick your title.
You are currently in the queue to collect this book. You will be notified once it is your turn to collect the book.
Oops! Something went wrong.
Looks like we were not able to place the reservation. Kindly try again later.
Do you wish to request the book?
Don't Touch My Diacritics
by
Gorman, Kyle
, Pinter, Yuval
in
Natural language processing
2024
Please be aware that the book you have requested cannot be checked out. If you would like to checkout this book, you can reserve another copy
We have requested the book for you!
Your request is successful and it will be processed during the Library working hours. Please check the status of your request in My Requests.
Oops! Something went wrong.
Looks like we were not able to place your request. Kindly try again later.
Paper
Don't Touch My Diacritics
2024
Request Book From Autostore
and Choose the Collection Method
Overview
The common practice of preprocessing text before feeding it into NLP models introduces many decision points which have unintended consequences on model performance. In this opinion piece, we focus on the handling of diacritics in texts originating in many languages and scripts. We demonstrate, through several case studies, the adverse effects of inconsistent encoding of diacritized characters and of removing diacritics altogether. We call on the community to adopt simple but necessary steps across all models and toolkits in order to improve handling of diacritized text and, by extension, increase equity in multilingual NLP.
Publisher
Cornell University Library, arXiv.org
Subject
This website uses cookies to ensure you get the best experience on our website.