Catalogue Search | MBRL

Air traffic control communication (ATCC) speech corpora and their use for ASR and TTS development

by Matoušek, Jindřich , Šmídl, Luboš , Tihelka, Daniel in Acknowledgment , Air traffic control , Automatic speech recognition

2019

The paper introduces the motivation for creating dedicated speech corpora of air traffic control communication, describes in detail the process of preparation of corpora for both automatic speech recognition and text-to-speech synthesis, presents an illustrative example of speech recognition system developed using the automatic speech recognition corpora and finally describes the technical aspects of the data and the distribution channel.

Journal Article

Share this book

Add to My Shelf

General framework for mining, processing and storing large amounts of electronic texts for language modeling purposes

by Lehečka, Jan , Pražák, Aleš , Stanislav, Petr in Acoustic data , Algorithms , Applied linguistics

2014

The paper describes a general framework for mining large amounts of text data from a defined set of Web pages. The acquired data are meant to constitute a corpus for training robust and reliable language models and thus the framework needs to also incorporate algorithms for appropriate text processing and duplicity detection in order to secure quality and consistency of the data. As we expect the resulting corpus to be very large, we have also implemented topic detection algorithms that allow us to automatically select subcorpora for domain-specific language models. The description of the framework architecture and the implemented algorithms is complemented with a detailed evaluation section. It analyses the basic properties of the gathered Czech corpus containing more than one billion text tokens collected using the described framework, shows the results of the topic detection methods and finally also describes the design and outcomes of the automatic speech recognition experiments with domain-specific language models estimated from the collected data.

Journal Article

Share this book

Add to My Shelf

A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives

by Lehečka, Jan , Psutka, Josef V , Šmídl, Luboš in Archives & records , Automatic speech recognition , Cultural resources

2024

In this paper, we are comparing monolingual Wav2Vec 2.0 models with various multilingual models to see whether we could improve speech recognition performance on a unique oral history archive containing a lot of mixed-language sentences. Our main goal is to push forward research on this unique dataset, which is an extremely valuable part of our cultural heritage. Our results suggest that monolingual speech recognition models are, in most cases, superior to multilingual models, even when processing the oral history archive full of mixed-language sentences from non-native speakers. We also performed the same experiments on the public CommonVoice dataset to verify our results. We are contributing to the research community by releasing our pre-trained models to the public.

Paper

Share this book

Add to My Shelf

System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive

by Vaněk, Jan , Šmídl, Luboš , Pražák, Aleš in Acoustics , Algorithms , Archives

2011

The main objective of the work presented in this paper was to develop a complete system that would accomplish the original visions of the MALACH project. Those goals were to employ automatic speech recognition and information retrieval techniques to provide improved access to the large video archive containing recorded testimonies of the Holocaust survivors. The system has been so far developed for the Czech part of the archive only. It takes advantage of the state-of-the-art speech recognition system tailored to the challenging properties of the recordings in the archive (elderly speakers, spontaneous speech and emotionally loaded content) and its close coupling with the actual search engine. The design of the algorithm adopting the spoken term detection approach is focused on the speed of the retrieval. The resulting system is able to search through the 1,000 h of video constituting the Czech portion of the archive and find query word occurrences in the matter of seconds. The phonetic search implemented alongside the search based on the lexicon words allows to find even the words outside the ASR system lexicon such as names, geographic locations or Jewish slang.

Journal Article

Share this book

Add to My Shelf

Speech Technology Services for Oral History Research

by Lehečka, Jan , Draxler, Christoph , Arjan van Hessen in Speech processing

2024

Oral history is about oral sources of witnesses and commentors on historical events. Speech technology is an important instrument to process such recordings in order to obtain transcription and further enhancements to structure the oral account In this contribution we address the transcription portal and the webservices associated with speech processing at BAS, speech solutions developed at LINDAT, how to do it yourself with Whisper, remaining challenges, and future developments.

Paper

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter