ParsiNorm: A Persian Toolkit for Speech Processing Normalization
by Sajjad Abdi Dehsorkh, Hadi Asheri, Seyedeh Fatemeh Razavi, Romina Oji, Reshad Hosseini, Alireza Hariri
in Currencies / Format / Language / Modules / Source code / Speech / Speech processing / Toolkits
2021
Paper
Overview
In general, speech processing models consist of a language model together with an acoustic model. Regardless of the language model's complexity and variants, language models need three critical pre-processing steps: cleaning, normalization, and tokenization. Among these steps, normalization is essential for format unification in purely textual applications. For language models embedded in speech processing modules, however, normalization is not limited to format unification: it must also convert each readable symbol, number, etc., to the way it is pronounced. To the best of our knowledge, there are no Persian normalization toolkits for embedded language models in speech processing modules, so in this paper we propose an open-source normalization toolkit for text processing in speech applications. Briefly, we consider different kinds of readable Persian text, such as symbols (common currencies, #, @, URL, etc.), numbers (date, time, phone number, national code, etc.), and so on. Comparison with other available Persian textual normalization tools indicates the superiority of the proposed method in speech processing. In addition, comparing the model's performance on one of the proposed functions (sentence separation) with other common natural language libraries such as HAZM and Parsivar indicates the proper performance of the proposed method. Its evaluation on some Persian Wikipedia data also confirms the proper performance of the proposed method.
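To make the distinction between format unification and pronunciation-oriented normalization concrete, here is a minimal Python sketch. It is not ParsiNorm's code or API: every name in it (`UNIFY_TABLE`, `number_to_words`, `normalize_for_speech`) is hypothetical, the symbol table covers only two symbols, and numbers above 999 are simply read digit by digit, which a real toolkit would handle far more carefully.

```python
import re

# Hypothetical illustration only; none of these names come from ParsiNorm.

# Step 1 (format unification): map Arabic letter variants to their Persian
# forms and map Persian / Arabic-Indic digits to ASCII digits.
UNIFY_TABLE = {
    0x064A: "\u06CC",  # Arabic yeh -> Persian yeh
    0x0643: "\u06A9",  # Arabic kaf -> Persian keheh
}
for i in range(10):
    UNIFY_TABLE[0x06F0 + i] = str(i)  # Persian digits
    UNIFY_TABLE[0x0660 + i] = str(i)  # Arabic-Indic digits

AND = " \u0648 "  # the conjunction "va" that joins number parts

ONES = ["صفر", "یک", "دو", "سه", "چهار", "پنج", "شش", "هفت", "هشت", "نه"]
TEENS = ["ده", "یازده", "دوازده", "سیزده", "چهارده",
         "پانزده", "شانزده", "هفده", "هجده", "نوزده"]
TENS = ["", "", "بیست", "سی", "چهل", "پنجاه", "شصت", "هفتاد", "هشتاد", "نود"]
HUNDREDS = ["", "صد", "دویست", "سیصد", "چهارصد",
            "پانصد", "ششصد", "هفتصد", "هشتصد", "نهصد"]

# Step 2 (pronunciation): a deliberately tiny symbol table.
SYMBOLS = {"%": "درصد", "$": "دلار"}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 in Persian words."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n == 0:
        return ONES[0]
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        parts.append(HUNDREDS[hundreds])
    if 10 <= rest <= 19:
        parts.append(TEENS[rest - 10])
    else:
        tens, ones = divmod(rest, 10)
        if tens:
            parts.append(TENS[tens])
        if ones:
            parts.append(ONES[ones])
    return AND.join(parts)


def normalize_for_speech(text: str) -> str:
    """Unify characters, then replace digits and symbols with spoken forms."""
    text = text.translate(UNIFY_TABLE)

    def spell(match: re.Match) -> str:
        digits = match.group()
        if len(digits) <= 3:
            return number_to_words(int(digits))
        # Simplification: longer digit runs are read one digit at a time.
        return " ".join(ONES[int(d)] for d in digits)

    text = re.sub(r"[0-9]+", spell, text)
    for symbol, spoken in SYMBOLS.items():
        text = text.replace(symbol, f" {spoken} ")
    return " ".join(text.split())  # collapse the extra spaces added above


if __name__ == "__main__":
    print(normalize_for_speech("\u06F2\u06F5%"))
```

A purely textual normalizer would stop after the `translate` step; the digit and symbol replacements are the part that matters for a language model embedded in a speech system, since the acoustic side only ever sees words as they are spoken.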
Publisher
Cornell University Library, arXiv.org