Catalogue Search | MBRL
Search Results Heading
Explore the vast range of titles available.
MBRLSearchResults
-
DisciplineDiscipline
-
Is Peer ReviewedIs Peer Reviewed
-
Series TitleSeries Title
-
Reading LevelReading Level
-
YearFrom:-To:
-
More FiltersMore FiltersContent TypeItem TypeIs Full-Text AvailableSubjectCountry Of PublicationPublisherSourceDonorLanguagePlace of PublicationContributorsLocation
Done
Filters
Reset
52,040
result(s) for
"Speech processing"
Sort by:
Dragon NaturallySpeaking for dummies
Command your computer, surf the web, create reports, and more-- with your voice! Dragon NaturallySpeaking is a speech recognition program that lets users dictate into any Windows application, allowing you to access documents, write e-mails, and even update Facebook using only your voice.
A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research
by
Nakatani, Tomohiro
,
P. Habets, Emanuël A.
,
Haeb-Umbach, Reinhold
in
Engineering
,
Multichannel
,
Quantum Information Technology
2016
In recent years, substantial progress has been made in the field of reverberant speech signal processing, including both single- and multichannel dereverberation techniques and automatic speech recognition (ASR) techniques that are robust to reverberation. In this paper, we describe the REVERB challenge, which is an evaluation campaign that was designed to evaluate such speech enhancement (SE) and ASR techniques to reveal the state-of-the-art techniques and obtain new insights regarding potential future research directions. Even though most existing benchmark tasks and challenges for distant speech processing focus on the noise robustness issue and sometimes only on a single-channel scenario, a particular novelty of the REVERB challenge is that it is carefully designed to test robustness against
reverberation
, based on
both real, single-channel, and multichannel recordings
. This challenge attracted 27 papers, which represent 25 systems specifically designed for SE purposes and 49 systems specifically designed for ASR purposes. This paper describes the problems dealt within the challenge, provides an overview of the submitted systems, and scrutinizes them to clarify what current processing strategies appear effective in reverberant speech processing.
Journal Article
A review on speech processing using machine learning paradigm
2021
Speech processing plays a crucial role in many signal processing applications, while the last decade has bought gigantic evolution based on machine learning prototype. Speech processing has a close relationship with computer linguistics, human–machine interaction, natural language processing, and psycholinguistics. This review article majorly discusses the feature extraction techniques and machine learning classifiers employed in speech processing and recognition activities. The performance of several machine learning techniques is validated for speech emotion recognition application on Berlin EmoDB database. Further, it gives the broad application areas and challenges in machine learning for speech processing.
Journal Article
Survey of Deep Learning Paradigms for Speech Processing
by
Kothandaraman, Mohanaprasad
,
Bhangale, Kishor Barasu
in
Algorithms
,
Artificial neural networks
,
Audio equipment
2022
Over the past decades, a particular focus is given to research on machine learning techniques for speech processing applications. However, in the past few years, research has focused on using deep learning for speech processing applications. This new machine learning field has become a very attractive area of study and has remarkably better performance than the others in the various speech processing applications. This paper presents a brief survey of application deep learning for various speech processing applications such as speech separation, speech enhancement, speech recognition, speaker recognition, emotion recognition, language recognition, music recognition, speech data retrieval, etc. The survey goes on to cover the use of Auto-Encoder, Generative Adversarial Network, Restricted Boltzmann Machine, Deep Belief Network, Deep Neural Network, Convolutional Neural Network, Recurrent Neural Network and Deep Reinforcement Learning for speech processing. Additionally, it focuses on the various speech database and evaluation metrics used by deep learning algorithms for performance evaluation.
Journal Article
Speech enhancement by LSTM-based noise suppression followed by CNN-based speech restoration
2020
Single-channel speech enhancement in highly non-stationary noise conditions is a very challenging task, especially when interfering speech is included in the noise. Deep learning-based approaches have notably improved the performance of speech enhancement algorithms under such conditions, but still introduce speech distortions if strong noise suppression shall be achieved. We propose to address this problem by using a two-stage approach, first performing noise suppression and subsequently restoring natural sounding speech, using specifically chosen neural network topologies and loss functions for each task. A mask-based long short-term memory (LSTM) network is employed for noise suppression and speech restoration is performed via spectral mapping with a convolutional encoder-decoder network (CED). The proposed method improves speech quality (PESQ) over state-of-the-art single-stage methods by about 0.1 points for unseen highly non-stationary noise types including interfering speech. Furthermore, it is able to increase intelligibility in low-SNR conditions and consistently outperforms all reference methods.
Journal Article
Robotics, Vision and Control : Fundamental Algorithms In MATLAB® Second, Completely Revised, Extended And Updated Edition
Robotic vision, the combination of robotics and computer vision, involves the application of computer algorithms to data acquired from sensors. The research community has developed a large body of such algorithms but for a newcomer to the field this can be quite daunting. For over 20 years the author has maintained two open-source MATLAB® Toolboxes, one for robotics and one for vision. They provide implementations of many important algorithms and allow users to work with real problems, not just trivial examples. This book makes the fundamental algorithms of robotics, vision and control accessible to all. It weaves together theory, algorithms and examples in a narrative that covers robotics and computer vision separately and together. Using the latest versions of the Toolboxes the author shows how complex problems can be decomposed and solved using just a few simple lines of code. The topics covered are guided by real problems observed by the author over many years as a practitioner of both robotics and computer vision. It is written in an accessible but informative style, easy to read and absorb, and includes over 1000 MATLAB and Simulink® examples and over 400 figures. The book is a real walk through the fundamentals of mobile robots, arm robots. then camera models, image processing, feature extraction and multi-view geometry and finally bringing it all together with an extensive discussion of visual servo systems. This second edition is completely revised, updated and extended with coverage of Lie groups, matrix exponentials and twists; inertial navigation; differential drive robots; lattice planners; pose-graph SLAM and map making; restructured material on arm-robot kinematics and dynamics; series-elastic actuators and operational-space control; Lab color spaces; light field cameras; structured light, bundle adjustment and visual odometry; and photometric visual servoing. \"An authoritative book, reaching across fields, thoughtfully conceived and brilliantly accomplished!\" OUSSAMA KHATIB, Stanford.
Fundamentals, present and future perspectives of speech enhancement
by
Das, Nabanita
,
Chakraborty, Sayan
,
Chaki, Jyotismita
in
Background noise
,
Biometrics
,
Classification
2021
Speech enhancement has substantial interest in the utilization of speaker identification, video-conference, speech transmission through communication channels, speech-based biometric system, mobile phones, hearing aids, microphones, voice conversion etc. Pattern mining methods have a vital step in the growth of speech enhancement schemes. To design a successful speech enhancement system consideration to the background noise processing is needed. A substantial number of methods from traditional techniques and machine learning have been utilized to process and remove the additive noise from a speech signal. With the advancement of machine learning and deep learning, classification of speech has become more significant. Methods of speech enhancement consist of different stages, such as feature extraction of the input speech signal, feature selection, feature selection followed by classification. Deep learning techniques are also an emerging field in the classification domain, which is discussed in this review. The intention of this paper is to provide a state-of-the-art summary and present approaches for using the widely used machine learning and deep learning methods to detect the challenges along with future research directions of speech enhancement systems.
Journal Article