Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
201 result(s) for "Auditory scene analysis"
Principles and Applications of Spatial Hearing
by Kato, Hiroaki; Iwaya, Yukio; Iida, Kazuhiro
in Audiometry; Auditory perception; Auditory scene analysis
2011
Humans possess a remarkable ability to extract rich three-dimensional information about sound environments simply by analyzing the acoustic signals received at their two ears. Research in spatial hearing has evolved from a theoretical discipline studying the basic mechanisms of hearing into a technical discipline focused on designing and implementing increasingly sophisticated spatial auditory display systems. This book contains 39 chapters representing the current state of the art in spatial audio research, selected from papers presented in Sendai, Japan, at the First International Workshop on the Principles and Applications of Spatial Hearing.
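A core cue behind the binaural analysis the abstract describes is the interaural time difference (ITD), which can be estimated by cross-correlating the two ear signals. The sketch below is a toy illustration, not a method from the book; the sample rate, delay, and all names are assumptions.

```python
import numpy as np

fs = 44100                        # sample rate in Hz (assumed)
true_delay = 20                   # right-ear lag in samples (~0.45 ms)

rng = np.random.default_rng(0)
src = rng.standard_normal(4096)   # broadband source signal
left = src
right = np.concatenate([np.zeros(true_delay), src])[: len(src)]

# Cross-correlate the ear signals; the lag of the peak estimates the ITD.
xcorr = np.correlate(right, left, mode="full")
lags = np.arange(-len(src) + 1, len(src))
itd_samples = int(lags[np.argmax(xcorr)])
itd_ms = 1000 * itd_samples / fs  # ITD in milliseconds
```

For a broadband source, the cross-correlation peak sits at the true interaural lag, which the auditory system maps to a lateral position.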
The cocktail-party problem revisited: early processing and selection of multi-talker speech
How do we recognize what one person is saying when others are speaking at the same time? This review summarizes a wide range of research in psychoacoustics, auditory scene analysis, and attention, all dealing with early processing and selection of speech, which has been stimulated by this question. Important effects occurring at the peripheral and brainstem levels are mutual masking of sounds and “unmasking” resulting from binaural listening. Psychoacoustic models have been developed that can predict these effects accurately, albeit using computational approaches rather than approximations of neural processing. Grouping—the segregation and streaming of sounds—represents a subsequent processing stage that interacts closely with attention. Sounds can be easily grouped—and subsequently selected—using primitive features such as spatial location and fundamental frequency. More complex processing is required when lexical, syntactic, or semantic information is used. Whereas it is now clear that such processing can take place preattentively, there is also evidence that the processing depth depends on the task-relevancy of the sound. This is consistent with the presence of a feedback loop in attentional control, triggering enhancement of to-be-selected input. Despite recent progress, there are still many unresolved issues: there is a need for integrative models that are neurophysiologically plausible, for research into grouping based on cues other than spatial or voice-related ones, for studies explicitly addressing endogenous and exogenous attention, for an explanation of the remarkable sluggishness of attention focused on dynamically changing sounds, and for research elucidating the distinction between binaural speech perception and sound localization.
Journal Article
The human auditory brainstem response to running speech reveals a subcortical mechanism for selective attention
by Reichenbach, Tobias; Etard, Octave; Forte, Antonio Elia
in Acoustic Stimulation; Adolescent; Adult
2017
Humans excel at selectively listening to a target speaker in background noise such as competing voices. While the encoding of speech in the auditory cortex is modulated by selective attention, it remains debated whether such modulation occurs already in subcortical auditory structures. Investigating the contribution of the human brainstem to attention has, in particular, been hindered by the tiny amplitude of the brainstem response. Its measurement normally requires a large number of repetitions of the same short sound stimuli, which may lead to a loss of attention and to neural adaptation. Here we develop a mathematical method to measure the auditory brainstem response to running speech, an acoustic stimulus that does not repeat and that has a high ecological validity. We employ this method to assess the brainstem's activity when a subject listens to one of two competing speakers, and show that the brainstem response is consistently modulated by attention.
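The measurement idea summarized above can be illustrated with a toy simulation: a weak response at a fixed latency is recovered from a long, non-repeating stimulus by cross-correlation. This is a sketch of the general principle only, not the authors' actual pipeline; the signals, rates, and names are invented.

```python
import numpy as np

fs = 1000                          # Hz (illustrative)
n = 60 * fs                        # one minute of non-repeating stimulus
latency = 9                        # true response latency in samples (9 ms)

rng = np.random.default_rng(1)
feature = rng.standard_normal(n)   # stand-in for a running-speech feature

# Simulated recording: a weak, delayed copy of the feature buried in noise.
recording = np.zeros(n)
recording[latency:] = 0.05 * feature[:-latency]
recording += rng.standard_normal(n)

# Cross-correlate at candidate latencies 0..29 samples; the peak lag
# recovers the response latency despite the tiny response amplitude.
lags = np.arange(30)
resp = np.array([np.dot(recording[k:], feature[: n - k]) for k in lags])
est_latency_ms = 1000 * lags[np.argmax(resp)] / fs
```

Because the stimulus never repeats, averaging over repetitions is replaced by integrating the correlation over a long recording, which is what makes a continuous-speech paradigm feasible.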
Journal Article
Statistics of natural reverberation enable perceptual separation of sound and space
2016
In everyday listening, sound reaches our ears directly from a source as well as indirectly via reflections known as reverberation. Reverberation profoundly distorts the sound from a source, yet humans can both identify sound sources and distinguish environments from the resulting sound, via mechanisms that remain unclear. The core computational challenge is that the acoustic signatures of the source and environment are combined in a single signal received by the ear. Here we ask whether our recognition of sound sources and spaces reflects an ability to separate their effects and whether any such separation is enabled by statistical regularities of real-world reverberation. To first determine whether such statistical regularities exist, we measured impulse responses (IRs) of 271 spaces sampled from the distribution encountered by humans during daily life. The sampled spaces were diverse, but their IRs were tightly constrained, exhibiting exponential decay at frequency-dependent rates: Mid frequencies reverberated longest whereas higher and lower frequencies decayed more rapidly, presumably due to absorptive properties of materials and air. To test whether humans leverage these regularities, we manipulated IR decay characteristics in simulated reverberant audio. Listeners could discriminate sound sources and environments from these signals, but their abilities degraded when reverberation characteristics deviated from those of real-world environments. Subjectively, atypical IRs were mistaken for sound sources. The results suggest the brain separates sound into contributions from the source and the environment, constrained by a prior on natural reverberation. This separation process may contribute to robust recognition while providing information about spaces around us.
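The regularity reported above, exponential decay at frequency-dependent rates with mid frequencies ringing longest, can be turned into a toy synthetic impulse response. This is a sketch under assumed band edges and RT60 values, not the paper's measured data.

```python
import numpy as np

fs = 16000
dur = 1.0
t = np.arange(int(fs * dur)) / fs

# (low_hz, high_hz, rt60_s): mid frequencies decay slowest (assumed values).
bands = [(125, 500, 0.4), (500, 2000, 0.9), (2000, 8000, 0.5)]

rng = np.random.default_rng(2)
ir = np.zeros_like(t)
for lo, hi, rt60 in bands:
    # Band-limit white noise with a crude FFT brick-wall filter.
    spec = np.fft.rfft(rng.standard_normal(len(t)))
    freqs = np.fft.rfftfreq(len(t), 1 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0
    carrier = np.fft.irfft(spec, n=len(t))
    # Exponential envelope reaching -60 dB after rt60 seconds.
    ir += carrier * 10 ** (-3 * t / rt60)

ir /= np.max(np.abs(ir))  # normalize peak to 1
```

Convolving a dry source with such an IR simulates reverberant audio; the paper's manipulation amounts to altering these decay profiles away from the natural, frequency-dependent exponential form.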
Journal Article
No Differences in Auditory Steady-State Responses in Children with Autism Spectrum Disorder and Typically Developing Children
by Kenet, Tal; Mamashli, Fahimeh; Hämäläinen, Matti S.
in Acoustic Stimulation - methods; Adolescent; Analysis
2024
Auditory steady-state response (ASSR) has been studied as a potential biomarker for abnormal auditory sensory processing in autism spectrum disorder (ASD), with mixed results. Motivated by prior somatosensory findings of group differences in inter-trial coherence (ITC) between ASD and typically developing (TD) individuals at twice the steady-state stimulation frequency, we examined ASSR at 25 and 50 Hz as well as 43 and 86 Hz in response to 25-Hz and 43-Hz auditory stimuli, respectively, using magnetoencephalography. Data were recorded from 22 ASD and 31 TD children, ages 6–17 years. ITC measures showed prominent ASSRs at the stimulation and double frequencies, without significant group differences. These results do not support ASSR as a robust biomarker of abnormal auditory processing in ASD. Furthermore, the previously observed atypical double-frequency somatosensory response in ASD did not generalize to the auditory modality. Thus, the hypothesis of modality-independent abnormal local connectivity in ASD was not supported.
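Inter-trial coherence, the measure used above, is the length of the mean unit-amplitude phase vector across trials at each frequency. A minimal simulated example follows; the trial count, rates, and SNR are assumptions, not the study's data.

```python
import numpy as np

fs = 600           # samples per 1 s trial (illustrative)
n_trials = 100
f_stim = 25        # phase-locked "stimulation" frequency in Hz
t = np.arange(fs) / fs             # 1 s per trial -> 1 Hz bin resolution

rng = np.random.default_rng(3)
# Each trial: the same phase-locked 25 Hz response plus independent noise.
trials = np.sin(2 * np.pi * f_stim * t) + rng.standard_normal((n_trials, fs))

spec = np.fft.rfft(trials, axis=1)
phases = spec / np.abs(spec)       # unit phase vector per trial and bin
itc = np.abs(phases.mean(axis=0))  # ITC per frequency bin, in [0, 1]
freqs = np.fft.rfftfreq(fs, 1 / fs)

k25 = np.argmin(np.abs(freqs - 25))   # phase-locked bin: ITC near 1
k40 = np.argmin(np.abs(freqs - 40))   # noise-only bin: ITC near 0
```

Phase consistency across trials, rather than raw amplitude, is what separates a genuine steady-state response from background activity.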
Journal Article
Is predictability salient? A study of attentional capture by auditory patterns
2017
In this series of behavioural and electroencephalography (EEG) experiments, we investigate the extent to which repeating patterns of sounds capture attention. Work in the visual domain has revealed attentional capture by statistically predictable stimuli, consistent with predictive coding accounts which suggest that attention is drawn to sensory regularities. Here, stimuli comprised rapid sequences of tone pips, arranged in regular (REG) or random (RAND) patterns. EEG data demonstrate that the brain rapidly recognizes predictable patterns manifested as a rapid increase in responses to REG relative to RAND sequences. This increase is reminiscent of the increase in gain on neural responses to attended stimuli often seen in the neuroimaging literature, and thus consistent with the hypothesis that predictable sequences draw attention. To study potential attentional capture by auditory regularities, we used REG and RAND sequences in two different behavioural tasks designed to reveal effects of attentional capture by regularity. Overall, the pattern of results suggests that regularity does not capture attention.
This article is part of the themed issue ‘Auditory and visual scene analysis’.
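The REG/RAND construction described in the abstract can be sketched directly; the pool size, cycle length, and sequence length here are assumptions, not the study's parameters.

```python
import numpy as np

rng = np.random.default_rng(4)
# Pool of candidate tone-pip frequencies, log-spaced 200-2000 Hz.
pool = np.logspace(np.log10(200), np.log10(2000), 20)

cycle = rng.choice(pool, size=10, replace=False)  # fixed 10-pip cycle
reg = np.tile(cycle, 6)      # REG: the cycle repeats -> fully predictable
rand = rng.choice(pool, size=60, replace=True)    # RAND: no repeating pattern
```

After one cycle, every pip in REG is predictable from the pip ten positions earlier, whereas RAND carries no such regularity; the EEG contrast in the study hinges on exactly this statistical difference.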
Journal Article
Natural sounds can be reconstructed from human neuroimaging data using deep neural network representation
by Tanaka, Misato; Tsukamoto, Mitsuaki; Kamitani, Yukiyasu
in Acoustic Stimulation; Adult; Analysis
2025
Reconstruction of perceptual experiences from brain activity offers a unique window into how population neural responses represent sensory information. Although decoding visual content from functional MRI (fMRI) has seen significant success, reconstructing arbitrary sounds remains challenging due to the fine temporal structure of auditory signals and the coarse temporal resolution of fMRI. Drawing on the hierarchical auditory features of deep neural networks (DNNs) with progressively larger time windows and their neural activity correspondence, we introduce a method for sound reconstruction that integrates brain decoding of DNN features and an audio-generative model. DNN features decoded from auditory cortical activity outperformed spectrotemporal and modulation-based features, enabling perceptually plausible reconstructions across diverse sound categories. Behavioral evaluations and objective measures confirmed that these reconstructions preserved short-term spectral and perceptual properties, capturing the characteristic timbre of speech, animal calls, and musical instruments, while the reconstructed sounds did not reproduce longer temporal sequences with fidelity. Leave-category-out analyses indicated that the method generalizes across sound categories. Reconstructions at higher DNN layers and from early auditory regions revealed distinct contributions to decoding performance. Applying the model to a selective auditory attention (“cocktail party”) task further showed that reconstructions reflected the attended sound more strongly than the unattended one in some of the subjects. Despite its inability to reconstruct exact temporal sequences, which may reflect the limited temporal resolution of fMRI, our framework demonstrates the feasibility of mapping brain activity to auditory experiences—a step toward more comprehensive understanding and reconstruction of internal auditory representations.
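The decoding step, mapping measured brain responses to DNN feature values, is commonly implemented with regularized linear regression. Below is a toy simulation of that step only; this is an assumption about the general approach, not the authors' pipeline, and all dimensions, noise levels, and names are invented.

```python
import numpy as np

rng = np.random.default_rng(6)
n_train, n_test, n_voxels, n_features = 200, 20, 500, 64

# Simulated forward model: voxels respond linearly to DNN features.
W_true = rng.standard_normal((n_features, n_voxels)) / np.sqrt(n_voxels)
feat_train = rng.standard_normal((n_train, n_features))
feat_test = rng.standard_normal((n_test, n_features))
vox_train = feat_train @ W_true + 0.5 * rng.standard_normal((n_train, n_voxels))
vox_test = feat_test @ W_true + 0.5 * rng.standard_normal((n_test, n_voxels))

# Ridge regression: B = (X'X + lam I)^-1 X'Y, decoding features from voxels.
lam = 10.0
X, Y = vox_train, feat_train
B = np.linalg.solve(X.T @ X + lam * np.eye(n_voxels), X.T @ Y)
pred = vox_test @ B

# Identification: correlate each predicted feature vector with every true
# one and check that the matching held-out stimulus ranks first.
corr = np.corrcoef(pred, feat_test)[:n_test, n_test:]
acc = np.mean(np.argmax(corr, axis=1) == np.arange(n_test))
```

In the full method the decoded features would then condition an audio-generative model; this sketch stops at the feature-decoding stage.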
Journal Article
Harmonicity aids hearing in noise
by McDermott, Josh H.; McPherson, Malinda J.; Grace, River C.
in Auditory Perception; Auditory Threshold; Auditory thresholds
2022
Hearing in noise is a core problem in audition, and a challenge for hearing-impaired listeners, yet the underlying mechanisms are poorly understood. We explored whether harmonic frequency relations, a signature property of many communication sounds, aid hearing in noise for normal hearing listeners. We measured detection thresholds in noise for tones and speech synthesized to have harmonic or inharmonic spectra. Harmonic signals were consistently easier to detect than otherwise identical inharmonic signals. Harmonicity also improved discrimination of sounds in noise. The largest benefits were observed for two-note up-down “pitch” discrimination and melodic contour discrimination, both of which could be performed equally well with harmonic and inharmonic tones in quiet, but which showed large harmonic advantages in noise. The results show that harmonicity facilitates hearing in noise, plausibly by providing a noise-robust pitch cue that aids detection and discrimination.
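The harmonic-versus-inharmonic contrast used above can be sketched by jittering each partial of a complex tone away from its harmonic frequency; the f0, partial count, and jitter range here are assumptions rather than the study's stimulus parameters.

```python
import numpy as np

fs = 16000
dur = 0.5
f0 = 200.0
n_partials = 10
t = np.arange(int(fs * dur)) / fs

rng = np.random.default_rng(5)
harmonics = f0 * np.arange(1, n_partials + 1)    # integer multiples of f0
# Inharmonic version: jitter each partial by up to +/-30% of f0.
jitter = rng.uniform(-0.3, 0.3, n_partials) * f0
inharmonics = harmonics + jitter

harmonic_tone = sum(np.sin(2 * np.pi * f * t) for f in harmonics)
inharmonic_tone = sum(np.sin(2 * np.pi * f * t) for f in inharmonics)
```

The two signals have matched spectral density and level, so any detection advantage for the harmonic version in noise can be attributed to harmonicity itself.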
Journal Article
Rivalry between pitch and timbre in auditory stream segregation
2025
Two rapidly alternating tones with different pitches may be perceived as one integrated stream when the pitch differences are small or as two separate streams when the pitch differences are large. Likewise, timbre differences between two tones may also cause such sequential stream segregation. Moreover, the effects of pitch and timbre on stream segregation may cancel each other out, which is called a trade-off. However, how timbre differences caused by specific patterns of spectral shapes interact with pitch differences and affect stream segregation has been largely unexplored. Therefore, we used stripe tones, in which stripe-like spectral patterns of harmonic complex tones were realized by grouping harmonic components into several bands based on harmonic numbers and removing the harmonic components in every other band. Here, we show that 2- and 4-band stimuli elicited distinctive stream segregation against pitch proximity. By contrast, pitch separations dominated stream segregation for 16-band stimuli. The results for 8-band stimuli most clearly showed the trade-off between pitch and timbre in stream segregation. These results suggest that stimuli with a small number (≤4) of bands elicit strong stream segregation due to sharp timbral contrasts between stripe-like spectral patterns, and that the auditory system appears to be limited in integrating blocks of frequency components dispersed over frequency and time.
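The stripe-tone construction described above, grouping harmonics into equal bands by harmonic number and removing every other band, can be sketched as follows; the harmonic count, f0, and function name are assumptions.

```python
import numpy as np

fs = 32000
f0 = 250.0
n_harmonics = 32
t = np.arange(int(fs * 0.2)) / fs      # 200 ms tone

def stripe_tone(n_bands):
    """Sum harmonics 1..n_harmonics of f0, grouped into n_bands equal
    bands by harmonic number, with every other band removed."""
    per_band = n_harmonics // n_bands
    tone = np.zeros_like(t)
    kept = []
    for h in range(1, n_harmonics + 1):
        if ((h - 1) // per_band) % 2 == 1:
            continue                    # this band is removed
        kept.append(h)
        tone += np.sin(2 * np.pi * h * f0 * t)
    return tone, kept

tone4, kept4 = stripe_tone(n_bands=4)  # keeps harmonics 1-8 and 17-24
```

Fewer bands produce wider spectral stripes and hence sharper timbral contrasts between complementary stimuli, which is the manipulation the study pits against pitch separation.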
Journal Article
Normal hearing is not enough to guarantee robust encoding of suprathreshold features important in everyday communication
by Ruggles, Dorea; Bharadwaj, Hari; Shinn-Cunningham, Barbara G.
in Acoustics; Attention - physiology; Audio frequencies
2011
"Normal hearing" is typically defined by threshold audibility, even though everyday communication relies on extracting key features of easily audible sound, not on sound detection. Anecdotally, many normal-hearing listeners report difficulty communicating in settings where there are competing sound sources, but the reasons for such difficulties are debated: do these difficulties originate from deficits in cognitive processing, or from differences in peripheral, sensory encoding? Here we show that listeners with clinically normal thresholds exhibit very large individual differences on a task requiring them to focus spatial selective auditory attention to understand one speech stream when there are similar, competing speech streams coming from other directions. These individual differences in selective auditory attention ability are unrelated to age, reading span (a measure of cognitive function), and minor differences in absolute hearing threshold; however, selective attention ability correlates with the ability to detect simple frequency modulation in a clearly audible tone. Importantly, we also find that selective attention performance correlates with physiological measures of how well the periodic, temporal structure of sounds above the threshold of audibility is encoded in early, subcortical portions of the auditory pathway. These results suggest that the fidelity of early sensory encoding of the temporal structure in suprathreshold sounds influences the ability to communicate in challenging settings. Tests like these may help tease apart how peripheral and central deficits contribute to communication impairments, ultimately leading to new approaches to combat the social isolation that often ensues.
Journal Article