Catalogue Search | MBRL
57 result(s) for "Elhilali, Mounya"
Modelling auditory attention
2017
Sounds in everyday life seldom appear in isolation. Both humans and machines are constantly flooded with a cacophony of sounds that need to be sorted through and scoured for relevant information, a phenomenon referred to as the ‘cocktail party problem’. A key component in parsing acoustic scenes is the role of attention, which mediates perception and behaviour by focusing both sensory and cognitive resources on pertinent information in the stimulus space. The current article provides a review of modelling studies of auditory attention. The review highlights how the term attention refers to a multitude of behavioural and cognitive processes that can shape sensory processing. Attention can be modulated by ‘bottom-up’ sensory-driven factors, as well as ‘top-down’ task-specific goals, expectations and learned schemas. Essentially, it acts as a selection process or processes that focus both sensory and cognitive resources on the most relevant events in the soundscape, with relevance being dictated by the stimulus itself (e.g. a loud explosion) or by a task at hand (e.g. listening to announcements in a busy airport). Recent computational models of auditory attention provide key insights into its role in facilitating perception in cluttered auditory scenes.
This article is part of the themed issue ‘Auditory and visual scene analysis’.
Journal Article
Push-pull competition between bottom-up and top-down auditory attention to natural soundscapes
2020
In everyday social environments, demands on attentional resources dynamically shift to balance our attention to targets of interest while alerting us to important objects in our surrounds. The current study uses electroencephalography to explore how the push-pull interaction between top-down and bottom-up attention manifests itself in dynamic auditory scenes. Using natural soundscapes as distractors while subjects attend to a controlled rhythmic sound sequence, we find that salient events in background scenes significantly suppress phase-locking and gamma responses to the attended sequence, countering enhancement effects observed for attended targets. In line with a hypothesis of limited attentional resources, the modulation of neural activity by bottom-up attention is graded by degree of salience of ambient events. The study also provides insights into the interplay between endogenous and exogenous attention during natural soundscapes, with both forms of attention engaging a common fronto-parietal network at different time lags.
When walking into a busy restaurant or café, our sense of hearing is bombarded with different sounds that our brain has to sort through to make sense of our surroundings. Our brain has to balance the desire to focus our attention on sounds we choose to listen to (such as the friend we are having a conversation with) and sounds that attract our attention (such as the sound of someone else’s phone ringing). Without the ability to be distracted, we might miss a noise that may or may not be crucial to our survival, like the engine roar of an approaching vehicle or a ping notifying us of an incoming email. However, it remains unclear what happens in our brains that enables us to shift our attention to background sounds. To investigate this further, Huang and Elhilali asked 81 participants to focus their attention on a repeating sound while being exposed to background noises from everyday life, such as sounds from a busy café.
The experiment showed that when a more noticeable sound happened in the background, such as a loud voice, the participants were more likely to lose attention on their task and miss changes in the tone of the repeating sound. Huang and Elhilali then measured the brain activity of 12 participants as they counted the number of altered tones in a sequence of sounds, again with noise in the background. This revealed that brain waves synchronized with tones that the participants were concentrating on. However, once there was a noticeable event in the background, this tone synchronization was reduced and the brain waves aligned with the background noise. Huang and Elhilali found that distracting noises in the background activate the same region of the brain as sounds we choose to listen to. This demonstrates how background sounds are able to re-direct our attention. These results are consistent with the idea that we have a limited capacity for attention, and that new sensory information can divert brain activity. Having a better understanding of how these processes work could help develop better communication aids for people with impaired hearing, and improve software for interpreting sounds with a noisy background.
Journal Article
Detecting change in stochastic sound sequences
2018
Our ability to parse our acoustic environment relies on the brain's capacity to extract statistical regularities from surrounding sounds. Previous work in regularity extraction has predominantly focused on the brain's sensitivity to predictable patterns in sound sequences. However, natural sound environments are rarely completely predictable, often containing some level of randomness, yet the brain is able to effectively interpret its surroundings by extracting useful information from stochastic sounds. It has been previously shown that the brain is sensitive to the marginal lower-order statistics of sound sequences (i.e., mean and variance). In this work, we investigate the brain's sensitivity to higher-order statistics describing temporal dependencies between sound events through a series of change detection experiments, where listeners are asked to detect changes in randomness in the pitch of tone sequences. Behavioral data indicate listeners collect statistical estimates to process incoming sounds, and a perceptual model based on Bayesian inference shows a capacity in the brain to track higher-order statistics. Further analysis of individual subjects' behavior indicates an important role of perceptual constraints in listeners' ability to track these sensory statistics with high fidelity. In addition, the inference model facilitates analysis of neural electroencephalography (EEG) responses, anchoring the analysis relative to the statistics of each stochastic stimulus. This reveals both a deviance response and a change-related disruption in phase of the stimulus-locked response that follow the higher-order statistics. These results shed light on the brain's ability to process stochastic sound sequences.
Journal Article
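The distinction drawn in the abstract above, between marginal statistics (mean and variance) and statistics that describe temporal dependencies, can be illustrated with a small sketch. The lag-1 autocorrelation used here is only one simple measure of temporal dependence, chosen for illustration; it is an assumption of this example and not necessarily the statistic or the Bayesian model used in the study. The toy "pitch" values and the function names are hypothetical.

```python
from statistics import mean, pvariance


def lag1_autocorrelation(seq):
    """Correlation between each value and its successor (lag-1 autocorrelation)."""
    m = mean(seq)
    # Sum of products of neighbouring deviations from the mean.
    num = sum((a - m) * (b - m) for a, b in zip(seq, seq[1:]))
    # Total squared deviation, used as the normaliser.
    den = sum((a - m) ** 2 for a in seq)
    return num / den


# Two toy tone sequences built from the same ten values (illustrative only).
smooth = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # neighbours are similar
jumpy = [0, 9, 1, 8, 2, 7, 3, 6, 4, 5]    # neighbours alternate low/high

# Identical marginal statistics: same mean and same variance.
same_marginals = (mean(smooth) == mean(jumpy)
                  and pvariance(smooth) == pvariance(jumpy))

# Different temporal structure: positive versus negative dependence.
r_smooth = lag1_autocorrelation(smooth)
r_jumpy = lag1_autocorrelation(jumpy)
```

Because both sequences contain exactly the same values, a listener or model tracking only mean and variance could not tell them apart; only a statistic sensitive to ordering separates them.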
A Gestalt inference model for auditory scene segregation
by Chakrabarty, Debmalya; Elhilali, Mounya
in Acoustic Stimulation; Acoustics; Artificial neural networks
2019
Our current understanding of how the brain segregates auditory scenes into meaningful objects is in line with a Gestaltism framework. These Gestalt principles suggest a theory of how different attributes of the soundscape are extracted then bound together into separate groups that reflect different objects or streams present in the scene. These cues are thought to reflect the underlying statistical structure of natural sounds in a similar way that statistics of natural images are closely linked to the principles that guide figure-ground segregation and object segmentation in vision. In the present study, we leverage inference in stochastic neural networks to learn emergent grouping cues directly from natural soundscapes including speech, music and sounds in nature. The model learns a hierarchy of local and global spectro-temporal attributes reminiscent of simultaneous and sequential Gestalt cues that underlie the organization of auditory scenes. These mappings operate at multiple time scales to analyze an incoming complex scene and are then fused using a Hebbian network that binds together coherent features into perceptually-segregated auditory objects. The proposed architecture successfully emulates a wide range of well-established auditory scene segregation phenomena and quantifies the complementary role of segregation and binding cues in driving auditory scene segregation.
Journal Article
Decoding contextual influences on auditory perception from primary auditory cortex
by Shamma, Shihab; Elhilali, Mounya; Akram, Sahar
in Acoustic Stimulation; adaptation; ambiguous percept
2024
Perception can be highly dependent on stimulus context, but whether and how sensory areas encode the context remains uncertain. We used an ambiguous auditory stimulus – a tritone pair – to investigate the neural activity associated with a preceding contextual stimulus that strongly influenced the tritone pair’s perception: either as an ascending or a descending step in pitch. We recorded single-unit responses from a population of auditory cortical cells in awake ferrets listening to the tritone pairs preceded by the contextual stimulus. We find that the responses adapt locally to the contextual stimulus, consistent with human MEG recordings from the auditory cortex under the same conditions. Decoding the population responses demonstrates that cells responding to pitch-changes are able to predict well the context-sensitive percept of the tritone pairs. Conversely, decoding the individual pitch representations and taking their distance in the circular Shepard tone space predicts the opposite of the percept. The various percepts can be readily captured and explained by a neural model of cortical activity based on populations of adapting, pitch and pitch-direction cells, aligned with the neurophysiological responses. Together, these decoding and model results suggest that contextual influences on perception may well be already encoded at the level of the primary sensory cortices, reflecting basic neural response properties commonly found in these areas.
Journal Article
Segregating Complex Sound Sources through Temporal Coherence
by Shamma, Shihab; Elhilali, Mounya; Krishnan, Lakshmi
in Acoustic Stimulation - classification; Algorithms; Auditory Cortex - physiology
2014
A new approach for the segregation of monaural sound mixtures is presented based on the principle of temporal coherence and using auditory cortical representations. Temporal coherence is the notion that perceived sources emit coherently modulated features that evoke highly-coincident neural response patterns. By clustering the feature channels with coincident responses and reconstructing their input, one may segregate the underlying source from the simultaneously interfering signals that are uncorrelated with it. The proposed algorithm requires no prior information or training on the sources. It can, however, gracefully incorporate cognitive functions and influences such as memories of a target source or attention to a specific set of its attributes so as to segregate it from its background. Aside from its unusual structure and computational innovations, the proposed model provides testable hypotheses of the physiological mechanisms of this ubiquitous and remarkable perceptual ability, and of its psychophysical manifestations in navigating complex sensory environments.
Journal Article
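The clustering idea summarised in the abstract above (grouping feature channels whose responses are temporally coincident) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm from the article: it uses plain Pearson correlation between synthetic channel envelopes, a hypothetical threshold of 0.8, and a hand-picked anchor channel, and it omits the cortical representation and the reconstruction step entirely.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant channel carries no modulation to compare
    return cov / math.sqrt(vx * vy)


def coherent_channels(envelopes, anchor, threshold=0.8):
    """Indices of channels whose envelope co-varies with the anchor channel.

    `threshold` is a hypothetical value chosen for this toy example.
    """
    ref = envelopes[anchor]
    return [i for i, env in enumerate(envelopes)
            if pearson(ref, env) >= threshold]


# Synthetic envelopes: channels 0-1 share one modulation, channels 2-3 another.
T = 200
slow = [1 + math.sin(2 * math.pi * 4 * t / T) for t in range(T)]
fast = [1 + math.sin(2 * math.pi * 7 * t / T) for t in range(T)]
envelopes = [slow, [0.5 * s for s in slow], fast, [2.0 * f for f in fast]]

source_a = coherent_channels(envelopes, anchor=0)  # channels grouped with 0
source_b = coherent_channels(envelopes, anchor=2)  # channels grouped with 2
```

Choosing the anchor channel plays the role that attention or memory of a target might play in selecting which source to pull out; here it is simply passed in by hand.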
Music in Our Ears: The Biological Bases of Musical Timbre Perception
2012
Timbre is the attribute of sound that allows humans and other animals to distinguish among different sound sources. Studies based on psychophysical judgments of musical timbre, ecological analyses of sound's physical characteristics as well as machine learning approaches have all suggested that timbre is a multifaceted attribute that invokes both spectral and temporal sound features. Here, we explored the neural underpinnings of musical timbre. We used a neuro-computational framework based on spectro-temporal receptive fields, recorded from over a thousand neurons in the mammalian primary auditory cortex as well as from simulated cortical neurons, augmented with a nonlinear classifier. The model was able to perform robust instrument classification irrespective of pitch and playing style, with an accuracy of 98.7%. Using the same front end, the model was also able to reproduce perceptual distance judgments between timbres as perceived by human listeners. The study demonstrates that joint spectro-temporal features, such as those observed in the mammalian primary auditory cortex, are critical to provide the rich-enough representation necessary to account for perceptual judgments of timbre by human listeners, as well as recognition of musical instruments.
Journal Article
Optimized Acoustic Phantom Design for Characterizing Body Sound Sensors
by Elhilali, Mounya; Rennoll, Valerie; West, James E.
in acoustic phantom; Acoustics; Auscultation
2022
Many commercial and prototype devices are available for capturing body sounds that provide important information on the health of the lungs and heart; however, a standardized method to characterize and compare these devices is not agreed upon. Acoustic phantoms are commonly used because they generate repeatable sounds that couple to devices using a material layer that mimics the characteristics of skin. While multiple acoustic phantoms have been presented in literature, it is unclear how design elements, such as the driver type and coupling layer, impact the acoustical characteristics of the phantom and, therefore, the device being measured. Here, a design of experiments approach is used to compare the frequency responses of various phantom constructions. An acoustic phantom that uses a loudspeaker to generate sound and excite a gelatin layer supported by a grid is determined to have a flatter and more uniform frequency response than other possible designs with a sound exciter and plate support. When measured on an optimal acoustic phantom, three devices are shown to have more consistent measurements with added weight and differing positions compared to a non-optimal phantom. Overall, the statistical models developed here provide greater insight into acoustic phantom design for improved device characterization.
Journal Article
Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex
by Shamma, Shihab; Elhilali, Mounya; Klein, David
in Acoustic Stimulation; Action Potentials - physiology; Animal Genetics and Genomics
2003
We investigated the hypothesis that task performance can rapidly and adaptively reshape cortical receptive field properties in accord with specific task demands and salient sensory cues. We recorded neuronal responses in the primary auditory cortex of behaving ferrets that were trained to detect a target tone of any frequency. Cortical plasticity was quantified by measuring focal changes in each cell's spectrotemporal response field (STRF) in a series of passive and active behavioral conditions. STRF measurements were made simultaneously with task performance, providing multiple snapshots of the dynamic STRF during ongoing behavior. Attending to a specific target frequency during the detection task consistently induced localized facilitative changes in STRF shape, which were swift in onset. Such modulatory changes may enhance overall cortical responsiveness to the target tone and increase the likelihood of 'capturing' the attended target during the detection task. Some receptive field changes persisted for hours after the task was over and hence may contribute to long-term sensory memory.
Journal Article