Catalogue Search | MBRL
Explore the vast range of titles available.
11,199 result(s) for "vocalization"
Bird song : biological themes and variations
by Catchpole, Clive, author; Slater, P. J. B. (Peter James Bramwell), 1942-, author
in Birdsongs; Birds Vocalization; Birds Behavior
2018
The main thrust of the book, which has been extensively updated for this second edition, is to suggest that the two main functions of song are attracting a mate and defending territory. It shows how this evolutionary pressure has led to the amazing variety we see in the songs of different species throughout the world.
Finding, visualizing, and quantifying latent structure across diverse animal vocal repertoires
by Gentner, Timothy Q.; Sainburg, Tim; Thielk, Marvin
in Acoustics; Algorithms; Animal communication
2020
Animals produce vocalizations that range in complexity from a single repeated call to hundreds of unique vocal elements patterned in sequences unfolding over hours. Characterizing complex vocalizations can require considerable effort and a deep intuition about each species' vocal behavior. Even with a great deal of experience, human characterizations of animal communication can be affected by human perceptual biases. We present a set of computational methods for projecting animal vocalizations into low-dimensional latent representational spaces that are directly learned from the spectrograms of vocal signals. We apply these methods to diverse datasets from over 20 species, including humans, bats, songbirds, mice, cetaceans, and nonhuman primates. Latent projections uncover complex features of data in visually intuitive and quantifiable ways, enabling high-powered comparative analyses of vocal acoustics. We introduce methods for analyzing vocalizations both as discrete sequences and as continuous latent variables. Each method can be used to disentangle complex spectro-temporal structure and observe long-timescale organization in communication.
Journal Article
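As a minimal sketch of the latent-projection idea this abstract describes (not the authors' released code), each segmented syllable can be turned into a fixed-size log-mel spectrogram and the collection projected into a 2-D latent space with UMAP; the file names, sampling rate, and spectrogram sizes below are all illustrative assumptions.

```python
# Hedged sketch: spectrograms of segmented syllables -> 2-D latent space.
import numpy as np
import librosa
import umap  # from the umap-learn package

def syllable_spectrogram(path, sr=44100, n_mels=64, frames=128):
    """Load one syllable clip and return a flattened log-mel spectrogram."""
    y, _ = librosa.load(path, sr=sr)
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    S = librosa.power_to_db(S, ref=np.max)
    # Pad or trim the time axis so every syllable has the same shape.
    S = librosa.util.fix_length(S, size=frames, axis=1)
    return S.flatten()

clips = ["syll_000.wav", "syll_001.wav"]  # hypothetical per-syllable files
X = np.stack([syllable_spectrogram(p) for p in clips])
Z = umap.UMAP(n_components=2).fit_transform(X)  # points to plot and compare
```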
Analysis of ultrasonic vocalizations from mice using computer vision and machine learning
by Santana, Gustavo M; Bosque Ortiz, Gabriela M; Dietrich, Marcelo O
in Accuracy; Animals; Auditory communication
2021
Mice emit ultrasonic vocalizations (USVs) that communicate socially relevant information. To detect and classify these USVs, here we describe VocalMat. VocalMat is a software tool that uses image-processing and differential geometry approaches to detect USVs in audio files, eliminating the need for user-defined parameters. VocalMat also uses computer vision and machine learning methods to classify USVs into distinct categories. In a data set of >4000 USVs emitted by mice, VocalMat detected over 98% of manually labeled USVs and accurately classified ≈86% of the USVs out of 11 USV categories. We then used dimensionality reduction tools to analyze the probability distribution of USV classification among different experimental groups, providing a robust method to quantify and qualify the vocal repertoire of mice. Thus, VocalMat makes it possible to perform automated, accurate, and quantitative analysis of USVs without the need for user inputs, opening the opportunity for detailed and high-throughput analysis of this behavior.
Journal Article
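To illustrate the image-processing idea behind this kind of detector (VocalMat itself is a MATLAB tool; this is not its code), one can binarize a spectrogram and treat connected bright regions as candidate USVs; the threshold and minimum-size values below are invented for illustration.

```python
# Hedged sketch: spectrogram-as-image USV candidate detection.
import numpy as np
from scipy import ndimage

def detect_usv_candidates(spec_db, threshold_db=-60.0, min_pixels=50):
    """Return bounding-box slices of bright spectrogram blobs (candidate USVs)."""
    mask = spec_db > threshold_db        # binarize the spectrogram image
    labels, n = ndimage.label(mask)      # connected-component labeling
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
    slices = ndimage.find_objects(labels)
    # Keep only blobs large enough to be plausible calls rather than noise.
    return [sl for sl, size in zip(slices, sizes) if size >= min_pixels]
```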
Crowd vocal learning induces vocal dialects in bats: Playback of conspecifics shapes fundamental frequency usage by pups
2017
Vocal learning, the substrate of human language acquisition, has rarely been described in other mammals. Often, group-specific vocal dialects in wild populations provide the main evidence for vocal learning. While social learning is often the most plausible explanation for these intergroup differences, it is usually impossible to exclude other driving factors, such as genetic or ecological backgrounds. Here, we show the formation of dialects through social vocal learning in fruit bats under controlled conditions. We raised 3 groups of pups in conditions mimicking their natural roosts. Namely, pups could hear their mothers' vocalizations but were also exposed to a manipulation playback. The vocalizations in the 3 playbacks mainly differed in their fundamental frequency. From the age of approximately 6 months and onwards, the pups demonstrated distinct dialects, where each group was biased towards its playback. We demonstrate the emergence of dialects through social learning in a mammalian model in a tightly controlled environment. Unlike in the extensively studied case of songbirds where specific tutors are imitated, we demonstrate that bats do not only learn their vocalizations directly from their mothers, but that they are actually influenced by the sounds of the entire crowd. This process, which we term "crowd vocal learning," might be relevant to many other social animals such as cetaceans and pinnipeds.
Journal Article
Comparative EEG reveals general and conspecific vocalization sensitivities in evolutionarily distant mammal species
by Boros, Marianna; Andics, Attila; Ferrando, Elodie
in Adult; Animals; Auditory Perception - physiology
2025
• We compared neural sensitivities to vocalizations in humans, dogs, and pigs using EEG.
• General vocalization sensitivity was found in humans and pigs, in an early time window.
• Conspecific vocalization sensitivity was found in later windows in all three species.
• The two sensitivities are thus served by distinct, evolutionarily conserved mechanisms.
Mammal brains are tuned to vocal sounds. This tuning is reflected by at least two component processes: general and conspecific vocalization sensitivities. Using non-invasive electrophysiology, we directly compared the temporal characteristics of these sensitivities in three phylogenetically distant mammal species: humans (N = 20), dogs (N = 38) and pigs (N = 11). Event-related potentials (ERPs) were recorded while participants listened to human, dog, pig, and non-vocal sounds. On frontal electrodes, humans and pigs exhibited general vocalization sensitivity: differential ERPs for any vocalizations versus non-vocal sounds, already in early time windows (humans: 162–378 and 534–726 ms; pigs: 72–540 ms), indicating vocalizations' greater perceived saliency. Conspecific vocalization sensitivity, i.e., ERP differences between conspecific vocalizations and any other sounds, was identified in all three species, in later time windows (humans: 468–552 and 572–640 ms; dogs: 314–506 ms; pigs: 254–436 ms), perhaps reflecting categorical processing. These findings reveal that general and conspecific vocalization sensitivities are served by evolutionarily conserved, temporally and functionally distinct neural mechanisms.
Journal Article
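For readers unfamiliar with the ERP contrast used here, the core computation is just trial-averaging per condition and subtracting; the sketch below uses random placeholder arrays rather than any real EEG data, and the array shapes are invented.

```python
# Hedged sketch: a basic ERP difference wave between two conditions.
import numpy as np

rng = np.random.default_rng(0)
vocal = rng.standard_normal((100, 32, 500))     # (trials, channels, samples)
nonvocal = rng.standard_normal((100, 32, 500))

erp_vocal = vocal.mean(axis=0)                  # trial-averaged ERP
erp_nonvocal = nonvocal.mean(axis=0)
difference_wave = erp_vocal - erp_nonvocal      # "general vocalization" contrast
```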
Elephant Sound Classification Using Deep Learning Optimization
by Dewmini, Hiruni; Perera, Charith; Meedeniya, Dulani
in Accuracy; Acoustic properties; Algorithms
2025
Elephant sound identification is crucial in wildlife conservation and ecological research. The identification of elephant vocalizations provides insights into their behavior, social dynamics, and emotional expressions, supporting elephant conservation. This study addresses elephant sound classification utilizing raw audio processing. Our focus lies on exploring lightweight models suitable for deployment on resource-constrained edge devices, including MobileNet, YAMNet, and RawNet, alongside introducing a novel model termed ElephantCallerNet. Notably, our investigation reveals that the proposed ElephantCallerNet achieves an impressive accuracy of 89% in classifying raw audio directly, without converting it to spectrograms. Leveraging Bayesian optimization techniques, we fine-tuned crucial parameters such as learning rate, dropout, and kernel size, thereby enhancing the model's performance. Moreover, we scrutinized the efficacy of spectrogram-based training, a prevalent approach in animal sound classification. Through comparative analysis, raw audio processing outperforms spectrogram-based methods. In contrast to other models in the literature that primarily focus on a single caller type or on binary classification that identifies whether a sound is an elephant voice or not, our solution is designed to classify three distinct caller types, namely roar, rumble, and trumpet.
Journal Article
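The published ElephantCallerNet architecture is not reproduced here, but a minimal 1-D convolutional classifier over raw waveforms, with the dropout rate and kernel size the paper tunes exposed as parameters, might look like the following sketch; all layer sizes are assumptions.

```python
# Hedged sketch: small 1-D CNN over raw audio for three caller types
# (roar, rumble, trumpet). NOT the published ElephantCallerNet.
import torch
import torch.nn as nn

class RawAudioClassifier(nn.Module):
    def __init__(self, n_classes=3, kernel_size=9, dropout=0.3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size, stride=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),             # global pooling over time
        )
        self.head = nn.Sequential(
            nn.Flatten(), nn.Dropout(dropout), nn.Linear(32, n_classes),
        )

    def forward(self, x):                        # x: (batch, 1, samples)
        return self.head(self.features(x))

logits = RawAudioClassifier()(torch.randn(2, 1, 16000))  # smoke test
```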
SeqFusionNet: A hybrid model for sequence-aware and globally integrated acoustic representation
2025
Animals communicate information primarily via their calls, and monitoring their vocalizations directly is essential for species conservation and biodiversity tracking. Conventional visual approaches are frequently limited by distance and surroundings, while call-based monitoring concentrates on the animals themselves, proving more effective and straightforward than visual techniques. This paper introduces an animal sound classification model named SeqFusionNet, integrating the sequential encoding of a Transformer with the global perception of an MLP to achieve robust global feature extraction. The research involved compiling and organizing four common acoustic datasets (pig, bird, urban sound, and marine mammal), with extensive experiments exploring the applicability of vocal features across species and the model's recognition capabilities. Experimental results validate SeqFusionNet's efficacy in classifying animal calls: it identifies four pig call types at 95.00% accuracy, nine and six bird categories at 94.52% and 95.24% respectively, and fifteen and eleven marine mammal types at 96.43% and 97.50% accuracy, while attaining 94.39% accuracy on ten urban sound categories. Comparative analysis shows our method surpasses existing approaches. Beyond matching reference models on UrbanSound8K, SeqFusionNet demonstrates strong robustness and generalization across species. This research offers an expandable, efficient framework for automated bioacoustic monitoring, supporting wildlife preservation, ecological studies, and environmental sound analysis applications.
Journal Article
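The abstract describes fusing a Transformer's sequence-aware encoding with an MLP's global integration; a hedged PyTorch sketch of that general pattern follows. The layer sizes, head count, and mean-pooling choice are assumptions, not the published SeqFusionNet design.

```python
# Hedged sketch: Transformer encoding of frames, then an MLP classifier.
import torch
import torch.nn as nn

class HybridAcousticNet(nn.Module):
    def __init__(self, n_mels=64, d_model=128, n_classes=10):
        super().__init__()
        self.embed = nn.Linear(n_mels, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.mlp = nn.Sequential(                 # global feature integration
            nn.Linear(d_model, d_model), nn.ReLU(),
            nn.Linear(d_model, n_classes),
        )

    def forward(self, x):                         # x: (batch, frames, n_mels)
        h = self.encoder(self.embed(x))           # sequence-aware encoding
        return self.mlp(h.mean(dim=1))            # pool over time, classify

logits = HybridAcousticNet()(torch.randn(2, 200, 64))  # smoke test
```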
Explainable classification of goat vocalizations using convolutional neural networks
by Ntalampiras, Stavros; Pesando Gamacchio, Gabriele
in Acoustic properties; Agricultural research; Agriculture
2025
Efficient precision livestock farming relies on having timely access to data and information that accurately describe both the animals and their surrounding environment. This paper advances the classification of goat vocalizations, leveraging a publicly available dataset recorded at diverse farms breeding different species. We developed a Convolutional Neural Network (CNN) architecture tailored for classifying goat vocalizations, yielding an average classification rate of 95.8% in discriminating various goat emotional states. To this end, we suitably augmented the existing dataset using pitch shifting and time stretching techniques, boosting the robustness of the trained model. After thoroughly demonstrating the superiority of the designed architecture over contrasting approaches, we provide insights into the underlying mechanisms governing the proposed CNN by carrying out an extensive interpretation study. More specifically, we conducted an explainability analysis to identify the time-frequency content within goat vocalizations that significantly impacts the classification process. Such an XAI-driven validation not only provides transparency in the decision-making process of the CNN model but also sheds light on the acoustic features crucial for distinguishing the considered classes. Last but not least, the proposed solution encompasses an interactive scheme able to provide valuable information to animal scientists regarding the analysis performed by the model, highlighting the distinctive components of the considered goat vocalizations. Our findings underline the effectiveness of data augmentation techniques in bolstering classification accuracy and highlight the significance of leveraging XAI methodologies for validating and interpreting complex machine learning models applied to animal vocalizations.
Journal Article
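The pitch-shifting and time-stretching augmentation the abstract mentions is straightforward with librosa; a minimal sketch follows, where the shift and stretch values are illustrative rather than the paper's settings.

```python
# Hedged sketch: simple audio augmentation for a single goat call.
import librosa

def augment(y, sr):
    """Yield pitch-shifted and time-stretched variants of one call."""
    for n_steps in (-2, 2):           # shift by two semitones either way
        yield librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
    for rate in (0.9, 1.1):           # slow down / speed up by 10%
        yield librosa.effects.time_stretch(y, rate=rate)
```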
The breath shape controls intonation of mouse vocalizations
by Hebling, Alina; Wei, Xin Paul; MacDonald, Alastair
in Animal vocalization; Animals; Auditory stimuli
2024
Intonation in speech is the control of vocal pitch to layer expressive meaning onto communication, like increasing pitch to indicate a question. Stereotyped patterns of pitch are also used to create distinct sounds with different denotations, as in tonal languages and, perhaps, the 10 sounds of the murine lexicon. A basic tone is created by exhalation through a constricted laryngeal voice box, and it is thought that more complex utterances are produced solely by dynamic changes in laryngeal tension. But perhaps the shifting pitch also results from altering the swiftness of exhalation. Consistent with the latter model, we describe that intonation in most vocalization types follows deviations in exhalation that appear to be generated by the re-activation of the cardinal breathing muscle for inspiration. We also show that the brainstem vocalization central pattern generator, the iRO, can create this breath pattern. Consequently, ectopic activation of the iRO not only induces phonation but also the pitch patterns that compose most of the vocalizations in the murine lexicon. These results reveal a novel brainstem mechanism for intonation.
Journal Article
Revisiting the syntactic abilities of non-human animals: natural vocalizations and artificial grammar learning
by ten Cate, Carel; Okanoya, Kazuo
in Acoustic Stimulation - methods; Animal vocalization; Animals
2012
The domain of syntax is seen as the core of the language faculty and as the most critical difference between animal vocalizations and language. We review evidence from spontaneously produced vocalizations as well as from perceptual experiments using artificial grammars to analyse animal syntactic abilities, i.e. abilities to produce and perceive patterns following abstract rules. Animal vocalizations consist of vocal units (elements) that are combined in a species-specific way to create higher-order strings that in turn can be produced in different patterns. While these patterns differ between species, they have in common that they are no more complex than a probabilistic finite-state grammar. Experiments on the perception of artificial grammars confirm that animals can generalize and categorize vocal strings based on phonetic features. They also demonstrate that animals can learn about the co-occurrence of elements or learn simple ‘rules’ like attending to reduplications of units. However, these experiments do not provide strong evidence for an ability to detect abstract rules or rules beyond finite-state grammars. Nevertheless, considering the rather limited number of experiments and the difficulty of designing experiments that unequivocally demonstrate more complex rule learning, the question of what animals are able to do remains open.
Journal Article
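To make concrete what "no more complex than a probabilistic finite-state grammar" means, here is a toy generator in which each state emits an element and moves to a next state with a fixed probability; the elements ('A', 'B', 'C') and probabilities are invented for illustration.

```python
# Hedged sketch: a toy probabilistic finite-state grammar for "songs".
import random

TRANSITIONS = {
    "start": [("A", "mid", 1.0)],
    "mid":   [("B", "mid", 0.6),      # repeat element B with p = 0.6
              ("C", "end", 0.4)],     # otherwise close the song
}

def generate(state="start"):
    song = []
    while state != "end":
        r, acc = random.random(), 0.0
        for element, nxt, p in TRANSITIONS[state]:
            acc += p
            if r < acc:               # sample a transition by its probability
                song.append(element)
                state = nxt
                break
    return song

print(generate())  # e.g. ['A', 'B', 'B', 'C']
```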