Text-Free Prosody-Aware Generative Spoken Language Modeling
by Kharitonov, Eugene; Lee, Ann; Dupoux, Emmanuel; Rivière, Morgane; Hsu, Wei-Ning; Copet, Jade; Nguyen, Tu-Anh; Mohamed, Abdelrahman; Polyak, Adam; Lakhotia, Kushal; Adi, Yossi
in Language / Linguistics / Modelling / Sentences / Speech / Training / Waveforms
2022
Paper
Overview
Speech pre-training has primarily demonstrated efficacy on classification tasks, while its capability to generate novel speech, similar to how GPT-2 can generate coherent paragraphs, has barely been explored. Generative Spoken Language Modeling (GSLM) (Lakhotia et al., 2021) is the only prior work addressing the generative aspects of speech pre-training; it replaces text with discovered phone-like units for language modeling and shows the ability to generate meaningful novel sentences. Unfortunately, despite eliminating the need for text, the units used in GSLM discard most of the prosodic information. Hence, GSLM fails to leverage prosody for better comprehension and does not generate expressive speech. In this work, we present a prosody-aware generative spoken language model (pGSLM). It is composed of a multi-stream transformer language model (MS-TLM) of speech, represented as discovered unit and prosodic feature streams, and an adapted HiFi-GAN model converting MS-TLM outputs to waveforms. We devise a series of metrics for prosody modeling and generation, and re-use metrics from GSLM for content modeling. Experimental results show that pGSLM can utilize prosody to improve both prosody and content modeling, and can also generate natural, meaningful, and coherent speech given a spoken prompt. Audio samples can be found at https://speechbot.github.io/pgslm. Code and models are available at https://github.com/pytorch/fairseq/tree/main/examples/textless_nlp/pgslm.
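As a rough illustration of what "discovered unit and prosodic feature streams" could look like, the sketch below collapses a frame-level unit sequence into parallel unit, duration, and pitch streams. The abstract does not say which prosodic features are used, so the choice of duration and F0 here is an assumption, and the function name `to_streams` and its inputs are hypothetical; this is not the released fairseq code.

```python
from itertools import groupby


def to_streams(frame_units, frame_f0):
    """Collapse frame-level units into parallel unit / duration / F0 streams.

    Assumption (not stated in the abstract): prosody is summarised per unit
    as a duration in frames and a mean F0 over voiced frames (F0 > 0).
    """
    if len(frame_units) != len(frame_f0):
        raise ValueError("units and F0 must have one value per frame")

    units, durations, f0s = [], [], []
    start = 0
    # groupby yields one group per run of identical consecutive units
    for unit, run in groupby(frame_units):
        length = len(list(run))
        voiced = [f for f in frame_f0[start:start + length] if f > 0]
        units.append(unit)
        durations.append(length)
        # Unvoiced runs get 0.0 as a placeholder pitch value
        f0s.append(sum(voiced) / len(voiced) if voiced else 0.0)
        start += length
    return units, durations, f0s


# Toy input: six frames of made-up unit ids and F0 values in Hz
units, durations, f0s = to_streams(
    [7, 7, 7, 12, 12, 3],
    [100.0, 110.0, 0.0, 0.0, 0.0, 200.0],
)
print(units, durations, f0s)
```

Each position in the three lists describes the same speech segment, which is the sense in which a multi-stream model would consume them together.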