Catalogue Search | MBRL
Explore the vast range of titles available.
4,300 result(s) for "Scene analysis"
Is predictability salient? A study of attentional capture by auditory patterns
2017
In this series of behavioural and electroencephalography (EEG) experiments, we investigate the extent to which repeating patterns of sounds capture attention. Work in the visual domain has revealed attentional capture by statistically predictable stimuli, consistent with predictive coding accounts which suggest that attention is drawn to sensory regularities. Here, stimuli comprised rapid sequences of tone pips, arranged in regular (REG) or random (RAND) patterns. EEG data demonstrate that the brain rapidly recognizes predictable patterns, manifested as a rapid increase in responses to REG relative to RAND sequences. This increase is reminiscent of the increase in gain on neural responses to attended stimuli often seen in the neuroimaging literature, and thus consistent with the hypothesis that predictable sequences draw attention. To study potential attentional capture by auditory regularities, we used REG and RAND sequences in two different behavioural tasks designed to reveal effects of attentional capture by regularity. Overall, the pattern of results suggests that regularity does not capture attention.
This article is part of the themed issue ‘Auditory and visual scene analysis’.
Journal Article
A roadmap for the study of conscious audition and its neural basis
by Cariani, Peter A.; Dykstra, Andrew R.; Gutschalk, Alexander
in Animals; Audition; Auditory Perception
2017
How and which aspects of neural activity give rise to subjective perceptual experience—i.e. conscious perception—is a fundamental question of neuroscience. To date, the vast majority of work concerning this question has come from vision, raising the issue of generalizability of prominent resulting theories. However, recent work has begun to shed light on the neural processes subserving conscious perception in other modalities, particularly audition. Here, we outline a roadmap for the future study of conscious auditory perception and its neural basis, paying particular attention to how conscious perception emerges (and of which elements or groups of elements) in complex auditory scenes. We begin by discussing the functional role of the auditory system, particularly as it pertains to conscious perception. Next, we ask: what are the phenomena that need to be explained by a theory of conscious auditory perception? After surveying the available literature for candidate neural correlates, we end by considering the implications that such results have for a general theory of conscious perception as well as prominent outstanding questions and what approaches/techniques can best be used to address them.
This article is part of the themed issue ‘Auditory and visual scene analysis’.
Journal Article
A review on speech separation in cocktail party environment: challenges and approaches
by Gupta, Manish; Agrawal, Jharna; Garg, Hitendra
in Beamforming; Deep learning; Machine learning
2023
The cocktail party problem, i.e., tracing and identifying a specific speaker's speech while numerous speakers communicate concurrently, is one of the crucial problems still to be addressed for automatic speech recognition (ASR) and speaker recognition. In this study, we thoroughly explore traditional methods for speech separation in a cocktail party environment and further analyze traditional single-channel methods, for example, source-driven methods such as Computational Auditory Scene Analysis (CASA), data-driven methods such as non-negative matrix factorization (NMF), and model-driven methods; customary multi-channel methods such as beamforming and multi-channel blind source separation; and newly developed deep learning approaches such as meta-learning-based methods and self-supervised learning. This paper further highlights numerous datasets and evaluation metrics in the domain of speech processing and compares traditional methods with deep-learning-based methods for speech separation. This study provides a basic understanding and comprehensive knowledge of state-of-the-art research in the area of speech separation and serves as a brief overview for new researchers.
Journal Article
An Incremental Class-Learning Approach with Acoustic Novelty Detection for Acoustic Event Recognition
by İnce, Gökhan; Bayram, Barış
in acoustic event recognition; acoustic novelty detection; acoustic scene analysis
2021
Acoustic scene analysis (ASA) relies on the dynamic sensing and understanding of stationary and non-stationary sounds from various events, background noises and human actions with objects. However, the spatio-temporal nature of the sound signals may not be stationary, and novel events may exist that eventually deteriorate the performance of the analysis. In this study, a self-learning-based ASA for acoustic event recognition (AER) is presented to detect and incrementally learn novel acoustic events by tackling catastrophic forgetting. The proposed ASA framework comprises six elements: (1) raw acoustic signal pre-processing, (2) low-level and deep audio feature extraction, (3) acoustic novelty detection (AND), (4) acoustic signal augmentations, (5) incremental class-learning (ICL) (of the audio features of the novel events) and (6) AER. The self-learning on different types of audio features extracted from the acoustic signals of various events occurs without human supervision. For the extraction of deep audio representations, in addition to visual geometry group (VGG) and residual neural network (ResNet), time-delay neural network (TDNN) and TDNN based long short-term memory (TDNN–LSTM) networks are pre-trained using a large-scale audio dataset, Google AudioSet. The performances of ICL with AND using Mel-spectrograms, and deep features with TDNNs, VGG, and ResNet from the Mel-spectrograms are validated on benchmark audio datasets such as ESC-10, ESC-50, UrbanSound8K (US8K), and an audio dataset collected by the authors in a real domestic environment.
Journal Article
Past review, current progress, and challenges ahead on the cocktail party problem
by Chang, Xuan-kai; Wang, Shuai; Weng, Chao
in Automatic speech recognition; Beamforming; Clustering
2018
The cocktail party problem, i.e., tracing and recognizing the speech of a specific speaker when multiple speakers talk simultaneously, is one of the critical problems yet to be solved to enable the wide application of automatic speech recognition (ASR) systems. In this overview paper, we review the techniques proposed in the last two decades in attacking this problem. We focus our discussions on the speech separation problem given its central role in the cocktail party environment, and describe the conventional single-channel techniques such as computational auditory scene analysis (CASA), non-negative matrix factorization (NMF) and generative models, the conventional multi-channel techniques such as beamforming and multi-channel blind source separation, and the newly developed deep learning-based techniques, such as deep clustering (DPCL), the deep attractor network (DANet), and permutation invariant training (PIT). We also present techniques developed to improve ASR accuracy and speaker identification in the cocktail party environment. We argue that effectively exploiting information in the microphone array, the acoustic training set, and the language itself, with more powerful models and better optimization objectives and techniques, will be the approach to solving the cocktail party problem.
Journal Article
Effective Video Scene Analysis for a Nanosatellite Based on an Onboard Deep Learning Method
2023
The latest advancements in satellite technology have allowed us to obtain video imagery from satellites. Nanosatellites are becoming widely used for Earth-observing missions as they require a low budget and short development time. Thus, there is real interest in using nanosatellites with a video payload camera, especially for disaster monitoring and fleet tracking. However, as video data requires much storage and high communication costs, it is challenging to use nanosatellites for such missions. This paper proposes an effective onboard deep-learning-based video scene analysis method to reduce the high communication cost. The proposed method trains a CNN+LSTM-based model on the ground to identify mission-related sceneries, such as flood-disaster-related scenery, in satellite videos, and then loads the model onboard the nanosatellite to perform the scene analysis before sending the video data to the ground. We experimented with the proposed method using an Nvidia Jetson TX2 as the onboard computer (OBC) and achieved an 89% test accuracy. Additionally, by implementing our approach, we can reduce the nanosatellite video data download cost by 30%, which allows us to send the important mission video payload data to the ground using S-band communication. Therefore, we believe that our new approach can be effectively applied to obtain large video data from a nanosatellite.
Journal Article
Interoperability-Enhanced Knowledge Management in Law Enforcement: An Integrated Data-Driven Forensic Ontological Approach to Crime Scene Analysis
by Tsiantos, Vassilis; Garoufallou, Emmanouel; Spyropoulos, Alexandros Z.
in Algorithms; Automation; Crime
2023
Nowadays, more and more sciences are involved in strengthening the work of law enforcement authorities. Scientific documentation is evidence highly respected by the courts in administering justice. As the involvement of science in solving crimes increases, so does human subjectivism, which often leads to wrong conclusions and, consequently, to bad judgments. From the above arises the need to create a single information system that will be fed with scientific evidence such as fingerprints, genetic material, digital data, forensic photographs, information from the forensic report, etc., as well as investigative data such as information from witnesses' statements, the statement of the accused, etc., from various crime scenes, and that will be able, through a formal reasoning procedure, to identify possible perpetrators. The present study examines a proposal for developing an information system that can be a basis for creating a forensic ontology, a semantic representation of the crime scene, through description logic in the OWL semantic language. The interoperability-enhanced information system to be developed could assist law enforcement authorities in solving crimes. At the same time, it would promote closer cooperation between academia, civil society, and state institutions by fostering a culture of engagement for the common good.
Journal Article
Animal models for auditory streaming
2017
Sounds in the natural environment need to be assigned to acoustic sources to evaluate complex auditory scenes. Separating sources will affect the analysis of auditory features of sounds. As the benefits of assigning sounds to specific sources accrue to all species communicating acoustically, the ability for auditory scene analysis is widespread among different animals. Animal studies allow for a deeper insight into the neuronal mechanisms underlying auditory scene analysis. Here, we will review the paradigms applied in the study of auditory scene analysis and streaming of sequential sounds in animal models. We will compare the psychophysical results from the animal studies to the evidence obtained in human psychophysics of auditory streaming, i.e. in a task commonly used for measuring the capability for auditory scene analysis. Furthermore, the neuronal correlates of auditory streaming will be reviewed in different animal models and the observations of the neurons’ response measures will be related to perception. The across-species comparison will reveal whether similar demands in the analysis of acoustic scenes have resulted in similar perceptual and neuronal processing mechanisms in the wide range of species being capable of auditory scene analysis.
This article is part of the themed issue ‘Auditory and visual scene analysis’.
Journal Article
Preliminary Considerations for Crime Scene Analysis in Cases of Animals Affected by Homemade Ammonium Nitrate and Aluminum Powder Anti-Personnel Landmines in Colombia: Characteristics and Effects
by Severin, Krešimir; Toledo González, Víctor; Farías Roldán, Gustavo Adolfo
in 2401.06 Ecología animal; Aluminum; ammonium nitrate
2022
During the armed conflict in Colombia, homemade improvised antipersonnel landmines were used to neutralize the adversary. Many active artifacts remain buried, causing damage to biodiversity by exploding. The extensive literature describes the effects and injuries caused to humans by conventional landmines. However, there is considerably less information on the behavior and effects of homemade antipersonnel landmines on fauna and good field investigation practices. Our objectives were to describe the characteristics of a controlled explosion of a homemade antipersonnel landmine (using ammonium nitrate as an explosive substance), to compare the effectiveness of some evidence search patterns used in forensic investigation, and to determine the effects on a piece of an animal carcass. The explosion generated a shock wave and an exothermic reaction, generating physical effects on the ground and surrounding structures near the point of explosion. The amputation of the foot in direct contact with the device during the explosion and multiple fractures were the main effects on the animal carcass. Finally, it was determined that finding evidence was more effective in a smaller search area. Many factors can influence the results, which must be weighed when interpreting the results, as discussed in this manuscript.
Journal Article
Perception of Recorded Music With Hearing Aids: Compression Differentially Affects Musical Scene Analysis and Musical Sound Quality
2025
Hearing aids have traditionally been designed to facilitate speech perception. With regards to music perception, previous work indicates that hearing aid users frequently complain about music sound quality. Yet, the effects of hearing aid amplification on musical perception abilities are largely unknown. This study aimed to investigate the effects of hearing aid amplification and dynamic range compression (DRC) settings on musical scene analysis (MSA) abilities and sound quality ratings (SQR) using polyphonic music recordings. Additionally, speech reception thresholds in noise (SRT) were measured. Thirty-three hearing aid users with moderate to severe hearing loss participated in three conditions: unaided, and aided with either slow or fast DRC settings. Overall, MSA abilities, SQR and SRT significantly improved with the use of hearing aids compared to the unaided condition. Yet, differences were observed regarding the choice of compression settings. Fast DRC led to better MSA performance, reflecting enhanced selective listening in musical mixtures, while slow DRC elicited more favorable SQR. Despite these improvements, variability in amplification benefit across DRC settings and tasks remained considerable, with some individuals showing limited or no improvement. These findings highlight a trade-off between scene transparency (indexed by MSA) and perceived sound quality, with individual differences emerging as a key factor in shaping amplification outcomes. Our results underscore the potential benefits of hearing aids for music perception and indicate the need for personalized fitting strategies tailored to task-specific demands.
Journal Article