Catalogue Search | MBRL

Dragon NaturallySpeaking for dummies

by Diamond, Stephanie, author in Dragon NaturallySpeaking. , Speech processing systems. , Speech processing systems Computer programs.

Command your computer, surf the web, create reports, and more, all with your voice! Dragon NaturallySpeaking is a speech recognition program that lets users dictate into any Windows application, so they can access documents, write e-mails, and even update Facebook using only their voice.

Book

A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research

by Nakatani, Tomohiro , Habets, Emanuël A. P. , Haeb-Umbach, Reinhold in Engineering , Multichannel , Quantum Information Technology

2016

In recent years, substantial progress has been made in the field of reverberant speech signal processing, including both single- and multichannel dereverberation techniques and automatic speech recognition (ASR) techniques that are robust to reverberation. In this paper, we describe the REVERB challenge, which is an evaluation campaign that was designed to evaluate such speech enhancement (SE) and ASR techniques to reveal the state-of-the-art techniques and obtain new insights regarding potential future research directions. Even though most existing benchmark tasks and challenges for distant speech processing focus on the noise robustness issue and sometimes only on a single-channel scenario, a particular novelty of the REVERB challenge is that it is carefully designed to test robustness against reverberation, based on both real, single-channel, and multichannel recordings. This challenge attracted 27 papers, which represent 25 systems specifically designed for SE purposes and 49 systems specifically designed for ASR purposes. This paper describes the problems dealt with in the challenge, provides an overview of the submitted systems, and scrutinizes them to clarify what current processing strategies appear effective in reverberant speech processing.

Journal Article

Audio source separation and speech enhancement

by Vincent, Emmanuel (Research scientist), editor , Virtanen, Tuomas, editor , Gannot, Sharon, editor in Speech processing systems. , Automatic speech recognition.

Book

A review on speech processing using machine learning paradigm

by Bhangale, Kishor Barasu , Mohanaprasad, K in Acknowledgment , Acoustics , Application

2021

Speech processing plays a crucial role in many signal processing applications, and the last decade has brought rapid evolution based on the machine learning paradigm. Speech processing has a close relationship with computational linguistics, human–machine interaction, natural language processing, and psycholinguistics. This review article mainly discusses the feature extraction techniques and machine learning classifiers employed in speech processing and recognition activities. The performance of several machine learning techniques is validated for a speech emotion recognition application on the Berlin EmoDB database. Further, it outlines the broad application areas and challenges in machine learning for speech processing.

Journal Article

Audio processing and speech recognition : concepts, techniques and research overviews

by Sen, Soumya, 1982- author , Dutta, Anjan, author , Dey, Nilanjan, 1984- author in Natural language processing (Computer science) , Automatic speech recognition. , Computer sound processing.

Book

Survey of Deep Learning Paradigms for Speech Processing

by Kothandaraman, Mohanaprasad , Bhangale, Kishor Barasu in Algorithms , Artificial neural networks , Audio equipment

2022

Over the past decades, a particular focus has been given to research on machine learning techniques for speech processing applications. However, in the past few years, research has focused on using deep learning for speech processing applications. This new machine learning field has become a very attractive area of study and performs remarkably better than other approaches in various speech processing applications. This paper presents a brief survey of the application of deep learning to various speech processing applications such as speech separation, speech enhancement, speech recognition, speaker recognition, emotion recognition, language recognition, music recognition, speech data retrieval, etc. The survey goes on to cover the use of Auto-Encoder, Generative Adversarial Network, Restricted Boltzmann Machine, Deep Belief Network, Deep Neural Network, Convolutional Neural Network, Recurrent Neural Network and Deep Reinforcement Learning for speech processing. Additionally, it focuses on the various speech databases and evaluation metrics used by deep learning algorithms for performance evaluation.

Journal Article

Voice user interface design : moving from GUI to mixed modal interaction

by Dasgupta, Ritwik, author in Automatic speech recognition. , Speech processing systems. , Human-computer interaction.

Book

Speech enhancement by LSTM-based noise suppression followed by CNN-based speech restoration

by Fingscheidt, Tim , Defraene, Bruno , Tirry, Wouter in Algorithms , Coders , Encoders-Decoders

2020

Single-channel speech enhancement in highly non-stationary noise conditions is a very challenging task, especially when interfering speech is included in the noise. Deep learning-based approaches have notably improved the performance of speech enhancement algorithms under such conditions, but still introduce speech distortions if strong noise suppression is to be achieved. We propose to address this problem by using a two-stage approach, first performing noise suppression and subsequently restoring natural-sounding speech, using specifically chosen neural network topologies and loss functions for each task. A mask-based long short-term memory (LSTM) network is employed for noise suppression, and speech restoration is performed via spectral mapping with a convolutional encoder-decoder network (CED). The proposed method improves speech quality (PESQ) over state-of-the-art single-stage methods by about 0.1 points for unseen highly non-stationary noise types including interfering speech. Furthermore, it is able to increase intelligibility in low-SNR conditions and consistently outperforms all reference methods.

Journal Article

Robotics, Vision and Control : Fundamental Algorithms In MATLAB® Second, Completely Revised, Extended And Updated Edition

by Corke, Peter, author in Artificial intelligence. , Automation. , Cognitive psychology.

Robotic vision, the combination of robotics and computer vision, involves the application of computer algorithms to data acquired from sensors. The research community has developed a large body of such algorithms but for a newcomer to the field this can be quite daunting. For over 20 years the author has maintained two open-source MATLAB® Toolboxes, one for robotics and one for vision. They provide implementations of many important algorithms and allow users to work with real problems, not just trivial examples. This book makes the fundamental algorithms of robotics, vision and control accessible to all. It weaves together theory, algorithms and examples in a narrative that covers robotics and computer vision separately and together. Using the latest versions of the Toolboxes the author shows how complex problems can be decomposed and solved using just a few simple lines of code. The topics covered are guided by real problems observed by the author over many years as a practitioner of both robotics and computer vision. It is written in an accessible but informative style, easy to read and absorb, and includes over 1000 MATLAB and Simulink® examples and over 400 figures. The book is a real walk through the fundamentals of mobile robots and arm robots, then camera models, image processing, feature extraction and multi-view geometry, and finally brings it all together with an extensive discussion of visual servo systems. This second edition is completely revised, updated and extended with coverage of Lie groups, matrix exponentials and twists; inertial navigation; differential drive robots; lattice planners; pose-graph SLAM and map making; restructured material on arm-robot kinematics and dynamics; series-elastic actuators and operational-space control; Lab color spaces; light field cameras; structured light, bundle adjustment and visual odometry; and photometric visual servoing.
"An authoritative book, reaching across fields, thoughtfully conceived and brilliantly accomplished!" OUSSAMA KHATIB, Stanford.

Book

Fundamentals, present and future perspectives of speech enhancement

by Das, Nabanita , Chakraborty, Sayan , Chaki, Jyotismita in Background noise , Biometrics , Classification

2021

Speech enhancement is of substantial interest for applications such as speaker identification, video conferencing, speech transmission through communication channels, speech-based biometric systems, mobile phones, hearing aids, microphones, and voice conversion. Pattern mining methods play a vital role in the growth of speech enhancement schemes. Designing a successful speech enhancement system requires careful consideration of background noise processing. A substantial number of methods from traditional techniques and machine learning have been utilized to process and remove the additive noise from a speech signal. With the advancement of machine learning and deep learning, classification of speech has become more significant. Methods of speech enhancement consist of different stages, such as feature extraction of the input speech signal and feature selection, followed by classification. Deep learning techniques are also an emerging field in the classification domain and are discussed in this review. The intention of this paper is to provide a state-of-the-art summary and to present approaches for using the widely used machine learning and deep learning methods, identifying the challenges along with future research directions of speech enhancement systems.

Journal Article
