MiTTenS: A Dataset for Evaluating Gender Mistranslation

by Robinson, Kevin; Kudugunta, Sneha; Stella, Romina; Dev, Sunipa; Bastings, Jasmijn

in Datasets / Errors / Gloves / Languages / Machine translation / Systems analysis / Translating

2024

Paper

Overview

Translation systems, including foundation models capable of translation, can produce errors that result in gender mistranslation, and such errors can be especially harmful. To measure the extent of such potential harms when translating into and out of English, we introduce a dataset, MiTTenS, covering 26 languages from a variety of language families and scripts, including several traditionally under-represented in digital resources. The dataset is constructed with handcrafted passages that target known failure patterns, longer synthetically generated passages, and natural passages sourced from multiple domains. We demonstrate the usefulness of the dataset by evaluating both neural machine translation systems and foundation models, and show that all systems exhibit gender mistranslation and potential harm, even in high-resource languages.

Publisher

Cornell University Library, arXiv.org

Subject

Datasets / Errors / Gloves / Languages / Machine translation / Systems analysis / Translating