Speech Recognition and Synthesis Models and Platforms for the Kazakh Language
by Abduali, Balzhan; Amirova, Dina; Karibayeva, Aidana; Karyukin, Vladislav
in Accuracy / Acknowledgment / Acoustics / Adaptation / Algorithms / Artificial intelligence / ASR / Automatic speech recognition / Datasets / Intelligibility / Kazakh language / Language / Language modeling / Language shift / Languages / Lexical semantics / Machine learning / Morphology / Phonemics / Phonetics / Preservation / Semantics / Speech / Speech perception / Speech recognition / Speech recognition software / Speech synthesis / STT / Text-to-speech / Transformation / TTS / Turkic languages / Voice recognition / Vowels
2025
Journal Article
Overview
With the rapid development of artificial intelligence and machine learning technologies, automatic speech recognition (ASR) and text-to-speech (TTS) have become key components of the digital transformation of society. The Kazakh language, as a representative of the Turkic language family, remains a low-resource language with limited audio corpora, language models, and high-quality speech synthesis systems. This study provides a comprehensive analysis of existing speech recognition and synthesis models, emphasizing their applicability and adaptation to the Kazakh language. Special attention is given to linguistic and technical barriers, including the agglutinative structure, rich vowel system, and phonemic variability. Both open-source and commercial solutions were evaluated, including Whisper, GPT-4 Transcribe, ElevenLabs, OpenAI TTS, Voiser, KazakhTTS2, and TurkicTTS. Speech recognition systems were assessed using BLEU, WER, TER, chrF, and COMET, while speech synthesis was evaluated with MCD, PESQ, STOI, and DNSMOS, thus covering both lexical–semantic and acoustic–perceptual characteristics. The results demonstrate that, for speech-to-text (STT), the strongest performance was achieved by Soyle on domain-specific data (BLEU 74.93, WER 18.61), while Voiser showed balanced accuracy (WER 40.65–37.11, chrF 80.88–84.51) and GPT-4 Transcribe achieved robust semantic preservation (COMET up to 1.02). In contrast, Whisper performed weakest (WER 77.10, BLEU 13.22), requiring further adaptation for Kazakh. For text-to-speech (TTS), KazakhTTS2 delivered the most natural perceptual quality (DNSMOS 8.79–8.96), while OpenAI TTS achieved the best spectral accuracy (MCD 123.44–117.11, PESQ 1.14). TurkicTTS offered reliable intelligibility (STOI 0.15, PESQ 1.16), and ElevenLabs produced natural but less spectrally accurate speech.
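Of the recognition metrics named in the overview, word error rate (WER) is the simplest to reproduce by hand. The following is a minimal sketch, assuming the standard definition (word-level edit distance divided by the number of reference words) and plain whitespace tokenization; the function name `word_error_rate` is hypothetical, and the study's actual text normalization and scoring tooling are not described in this record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: (substitutions + deletions + insertions) / reference length.

    Assumption: tokens are whitespace-separated words, with no case folding
    or punctuation stripping. Real evaluations usually normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words (rolling-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                       # deletion of a reference word
                curr[j - 1] + 1,                   # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,   # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    score = word_error_rate("a b c d", "a x c")
    print(f"WER = {score * 100:.2f}%")
```

The overview appears to report WER on a percentage scale (for example, 18.61), so the fraction returned here would be multiplied by 100 for comparison. Because Kazakh is agglutinative, a single wrong suffix counts as a full word error under this definition, which is one reason character-level metrics such as chrF are reported alongside it.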