9,778 result(s) for "Phonetic analysis"
Entropy-Argumentative Concept of Computational Phonetic Analysis of Speech Taking into Account Dialect and Individuality of Phonation
In this article, the concept (i.e., the mathematical model and methods) of computational phonetic analysis of speech with an analytical description of the phenomenon of phonetic fusion is proposed. In this concept, in contrast to the existing methods, the problem of multicriteria of the process of cognitive perception of speech by a person is strictly formally presented using the theoretical and analytical apparatus of information (entropy) theory, pattern recognition theory and acoustic theory of speech formation. The obtained concept allows for determining reliably the individual phonetic alphabet inherent in a person, taking into account their inherent dialect of speech and individual features of phonation, as well as detecting and correcting errors in the recognition of language units. The experiments prove the superiority of the proposed scientific result over such common Bayesian concepts of decision making using the Euclidean-type mismatch metric as a method of maximum likelihood and a method of an ideal observer. The analysis of the speech signal carried out in the metric based on the proposed concept allows, in particular, for establishing reliably the phonetic saturation of speech, which objectively characterizes the environment of speech signal propagation and its source.
Prosodic Disambiguation of Disjunctive Declaratives and Disjunctive Questions in Jordanian Arabic
This study investigates phonological and phonetic details of disjunctive declaratives (ddcls) and alternative questions (altqs) in Arabic. The aim of the phonological and phonetic analyses of these syntactically identical utterances is to find out the cues that are responsible for the disambiguation. Consequently, a production study eliciting ddcls and altqs was run with 20 participants producing 160 utterances (80 ddcls and 80 altqs). Findings reveal that ddcls and altqs are similar in having a global rise-fall contour, but differ in the phonetic implementation of the fall, since minimum F0 values are significantly higher in altqs than in ddcls, suggesting that there is a fall to mid in the former (proposing !H%) and a fall to low in the latter (L%). There are also significant phonological differences in the accentual features between both sentence types, i.e., the conjuncts are always accented in altqs, but they are deaccented in ddcls. The findings are a contribution to the prosody-meaning literature, showing the importance of prosody for syntactic disambiguation. The findings are used to propose a theory for the disambiguation of disjunctive sentences.
Classifying females’ stressed and neutral voices using acoustic–phonetic analysis of vowels: an exploratory investigation with emergency calls
In the present exploratory study, we investigated acoustic–phonetic measures of spoken vowels for detection of female speech under conditions of stress. Eight authentic recorded calls to emergency services received from eight Finnish adult female speakers were chosen for the analysis. Based on the purpose of the call, the recordings were divided into two groups: the stressed group and the neutral group. We chose f0, H1–H2 and centre of gravity as acoustic–phonetic predictors for our final classification models. In previous studies, high f0 has been associated with a stressed voice, but H1–H2 and centre of gravity have not previously been related to speech under stress. On the other hand, H1–H2 has been used to detect non-modal voice qualities, such as a creaky or breathy voice, and similar voice qualities have been observed in stressed speech. Furthermore, indications exist that in speech under stress, acoustic energy is concentrated in higher frequencies, which consequently increases the centre of gravity. We tested stress detection accuracy with three statistical classifiers: LDA, logistic regression and decision tree. Our results indicated that all the models performed better when they were trained using only the vowel /i/ rather than with all Finnish vowels. Our best performing model, a logistic regression model based on /i/, yielded 88% correct classification, whereas a logistic regression model trained with all vowels achieved an accuracy of only 81%. We conclude that the results indicate a good stress classification accuracy, although further research with more extensive data is required.
Do long-term acoustic-phonetic features and mel-frequency cepstral coefficients provide complementary speaker-specific information for forensic voice comparison?
A growing number of studies in forensic voice comparison have explored how elements of phonetic analysis and automatic speaker recognition systems may be integrated for optimal speaker discrimination performance. However, few studies have investigated the evidential value of long-term speech features using forensically-relevant speech data. This paper reports an empirical validation study that assesses the evidential strength of the following long-term features: fundamental frequency (F0), formant distributions, laryngeal voice quality, mel-frequency cepstral coefficients (MFCCs), and combinations thereof. Non-contemporaneous recordings with speech style mismatch from 75 male Australian English speakers were analyzed. Results show that 1) MFCCs outperform long-term acoustic phonetic features; 2) source and filter features do not provide considerably complementary speaker-specific information; and 3) the addition of long-term phonetic features to an MFCCs-based system does not lead to meaningful improvement in system performance. Implications for the complementarity of phonetic analysis and automatic speaker recognition systems are discussed.
  • The evidential strength of long-term acoustic phonetic features, MFCCs, and combinations thereof was evaluated.
  • Non-contemporaneous recordings with speech style mismatch from 75 male Australian English speakers were analyzed.
  • MFCCs consistently outperform long-term acoustic phonetic features.
  • Source and filter features do not provide considerably complementary speaker-specific information.
  • The addition of long-term phonetic features to an MFCCs-based system does not lead to meaningful improvement in system performance.
Acoustic and auditory relation of creak/creaky settings in telephone call and direct recording samples: An approach in the light of forensic phonetics
This study assessed the possible relationship between acoustic and auditory analyses of vocal samples extracted from telephone calls and direct recordings. Samples from a database of auditory assessments of Voice Quality (VQ) performed by experienced phoneticians using Vocal Profile Analysis were used. Analyses were made of recordings of spontaneous speech samples from five men (aged 20–56, diagnosed with dysphonia), obtained simultaneously by telephone calls and direct recordings. Acoustic analysis was conducted by extracting 20 melodic measurements and VQ descriptors, in an automated manner using a script in the PRAAT software. These measurements were then compared with the results of database analyses using descriptive statistics. The results showed that, despite the differences obtained between the recording channels, acoustic parameters associated with voice quality, such as jitter, shimmer, and harmonic-to-noise, presented consistent correlations with perceptual judgments of non-modal phonation, especially in segments with creak or creaky voice. The analysis revealed that the telephone channel tended to smooth out abrupt peaks in the acoustic measurements, but did not compromise the identification of predominant voice quality patterns. This stability suggests that, even with the inherent signal degradation in telephone transmission, automated acoustic analyses can provide relevant and reliable information for forensic phonetic analysis. We conclude that the integration of acoustic and perceptual analyses, even using recordings from different channels, is feasible and valuable for the forensic assessment of voice quality. These findings highlight the importance of a careful methodological design that considers the particularities of the recording channels and reinforce the potential of forensic phonetics to provide robust evidence in legal contexts.
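The abstract above names jitter and shimmer among the acoustic parameters but does not say how they were computed. A minimal sketch, assuming the common "local" definition (mean absolute difference between consecutive cycles divided by the mean cycle value), might look as follows. The function name `relative_perturbation` and the input values are illustrative assumptions, not part of the study's PRAAT script.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive values, divided by their mean.

    Applied to glottal cycle durations this gives local jitter;
    applied to per-cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical cycle durations in seconds and peak amplitudes (arbitrary units).
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.78, 0.82, 0.79]

jitter = relative_perturbation(periods)
shimmer = relative_perturbation(amplitudes)
print(f"jitter = {jitter:.4f}, shimmer = {shimmer:.4f}")
```

Both measures are dimensionless ratios, so they can be compared across recordings with different overall pitch or level, which is relevant when the same speech is captured through two channels.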
Natural infant-directed speech facilitates neural tracking of prosody
  • We investigate infants’ tracking of natural infant- and adult-directed speech.
  • Mothers enhance prosodic stress in infant-directed speech.
  • Infants track the prosodic stress and syllable rate for natural speech.
  • Infant-directed speech facilitates infants’ tracking of prosodic stress.
Infants prefer to be addressed with infant-directed speech (IDS). IDS benefits language acquisition through amplified low-frequency amplitude modulations. It has been reported that this amplification increases electrophysiological tracking of IDS compared to adult-directed speech (ADS). It is still unknown which particular frequency band triggers this effect. Here, we compare tracking at the rates of syllables and prosodic stress, which are both critical to word segmentation and recognition. In mother-infant dyads (n=30), mothers described novel objects to their 9-month-olds while infants’ EEG was recorded. For IDS, mothers were instructed to speak to their children as they typically do, while for ADS, mothers described the objects as if speaking with an adult. Phonetic analyses confirmed that pitch features were more prototypically infant-directed in the IDS-condition compared to the ADS-condition. Neural tracking of speech was assessed by speech-brain coherence, which measures the synchronization between speech envelope and EEG. Results revealed significant speech-brain coherence at both syllabic and prosodic stress rates, indicating that infants track speech in IDS and ADS at both rates. We found significantly higher speech-brain coherence for IDS compared to ADS in the prosodic stress rate but not the syllabic rate. This indicates that the IDS benefit arises primarily from enhanced prosodic stress. Thus, neural tracking is sensitive to parents’ speech adaptations during natural interactions, possibly facilitating higher-level inferential processes such as word segmentation from continuous speech.
Calibration of Consonant Perception to Room Reverberation
Purpose: We examined how consonant perception is affected by a preceding speech carrier simulated in the same or a different room, for different classes of consonants. Carrier room, carrier length, and carrier length/target room uncertainty were manipulated. A phonetic feature analysis tested which phonetic categories are influenced by the manipulations in the acoustic context of the carrier. Method: Two experiments were performed, each with nine participants. Targets consisted of 10 or 16 vowel-consonant (VC) syllables presented in one of two strongly reverberant rooms, preceded by a multiple-VC carrier presented in either the same room, a different reverberant room, or an anechoic room. In Experiment 1, the carrier length and the target room randomly varied from trial to trial, whereas in Experiment 2, they were fixed within a block of trials. Results: Overall, a consistent carrier provided an advantage for consonant perception compared to inconsistent carriers, whether in anechoic or differently reverberant rooms. Phonetic analysis showed that carrier inconsistency significantly degraded identification of the manner of articulation, especially for stop consonants and, in one of the rooms, also of voicing. Carrier length and carrier/target uncertainty did not affect adaptation to reverberation for individual phonetic features. The detrimental effects of anechoic and different reverberant carriers on target perception were similar. Conclusions: The strength of calibration varies across different phonetic features, as well as across rooms with different levels of reverberation. Even though place of articulation is the feature that is affected by reverberation the most, it is the manner of articulation and, partially, voicing for which room adaptation is observed.
Effects of Gender, Parental Role, and Time on Infant- and Adult-Directed Read and Spontaneous Speech
Purpose: The study sets out to investigate inter- and intraspeaker variation in German infant-directed speech (IDS) and considers the potential impact that the factors gender, parental involvement, and speech material (read vs. spontaneous speech) may have. In addition, we analyze data from three time points prior to and after the birth of the child to examine potential changes in the features of IDS and, particularly also, of adult-directed speech (ADS). Here, the gender identity of a speaker is considered as an additional factor. Method: IDS and ADS data from 34 participants (15 mothers, 19 fathers) were gathered by means of a reading and a picture description task. For IDS, two recordings were made when the baby was approximately 6 and 9 months old, respectively. For ADS, an additional recording was made before the baby was born. Phonetic analyses comprise mean fundamental frequency (f0), variation in f0, the first two formants measured in /iː ɛ a uː/, and the vowel space size. Moreover, social and behavioral data were gathered regarding parental involvement and gender identity. Results: German IDS is characterized by an increase in mean f0, a larger variation in f0, vowel- and formant-specific differences, and a larger acoustic vowel space. No effect of gender or parental involvement was found. Also, the phonetic features of IDS were found in both spontaneous and read speech. Regarding ADS, changes in vowel space size in some of the fathers and in mean f0 in mothers were found. Conclusion: Phonetic features of German IDS are robust with respect to the factors gender, parental involvement, speech material (read vs. spontaneous speech), and time. Some phonetic features of ADS changed within the child's first year depending on gender and parental involvement/gender identity. Thus, further research on IDS also needs to address potential changes in ADS.
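The abstract above reports vowel space size from the first two formants of four vowels without stating the formula. One common approach, assumed here, is to take the area of the polygon spanned by the vowels' mean (F2, F1) points using the shoelace formula. The formant values below are hypothetical placeholders, not data from the study.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    `points` must list the vertices in order around the perimeter.
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three vertices")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical mean (F2, F1) values in Hz, ordered around the vowel space.
vowels = {
    "i": (2300.0, 300.0),
    "e": (1900.0, 550.0),  # stands in for the open-mid front vowel
    "a": (1300.0, 750.0),
    "u": (800.0, 320.0),
}
area_hz2 = polygon_area(list(vowels.values()))
print(f"vowel space area: {area_hz2:.0f} Hz^2")
```

Comparing this area between IDS and ADS recordings of the same speaker is one way to express the "larger acoustic vowel space" reported in the results; the vertex order matters, since a self-intersecting polygon would give a misleading area.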
Does Early Phonetic Differentiation Predict Later Phonetic Development? Evidence from a Longitudinal Study of /ɹ/ Development in Preschool Children
Purpose: We evaluated whether children whose inaccurate /ɹ/ productions showed evidence of phonetic differentiation with /w/ at 3.5-4.5 years of age improved in /ɹ/ production over the next year more than children whose inaccurate productions did not show evidence of such differentiation. We also examined whether speech perception, inhibitory control, and vocabulary size predicted growth in /ɹ/. Method: A set of typically developing, monolingual English-speaking preschool children (n = 136) produced tokens of /ɹ/- and /w/-initial words at two time points (TPs), at which they were 39-52 and 51-65 months old. Children's productions of /ɹ/ and /w/ were narrowly phonetically transcribed. Children's productions at the earlier time point were rated by naïve listeners using a visual analog scale measure of phoneme goodness; these ratings were used to assess the degree of phonetic differentiation between /ɹ/ and /w/. Results: Accuracy for both phonemes varied considerably at both TPs. The growth in accuracy of /ɹ/ between the two TPs was not predicted by any individual-differences measures, nor by the degree of differentiation between /ɹ/ and /w/ at the earlier time point. Conclusion: Low vocabulary size, low inhibitory control, poor speech perception, and the absence of early phonetic differentiation are not necessarily limiting factors in predicting /ɹ/ growth in individual children in the age range we studied.
Quantifying and Characterizing Phonetic Reduction in Italian Natural Speech
The main purpose of this study is to test a method for the analysis of phonetic variation in natural speech. The method takes into account the continuous nature of the speech flow and allows for the investigation of the systematic variation phenomena that occur in the speech net of the cross-word coarticulation phenomena that are expected in connected speech. We will describe some of the most frequent phonetic variation patterns that may be observed in the speech chain seen as a sequence of syllables, in relation to internal syllabic structure and lexical stress. The present study concerns speech data from the Italian section of the NOCANDO corpus. The data consist of about 1000 syllables extracted from monological speech from different speakers. In two different analysis layers, we attempted to align the “phonological” expected form and observed realisation. The results of this attempt led to the definition of syllabic deletion, substitution, or insertion when the alignment fails. The proposed method provides insight into the phonetic variation processes that can systematically occur in natural speech with relation to specific linguistic structures; in particular, unstressed syllables are most likely to undergo variation phenomena, and systematic differences concern the syllabic position of the segmental change, in that the presence of lexical stress prevents vowel deletion or centralization, but allows for onset changes (such as consonant cluster simplification or lenition).
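The abstract above describes aligning an expected form with its observed realisation and labelling the mismatches as deletions, substitutions, or insertions, but it does not specify the alignment procedure. A minimal sketch, assuming a standard unit-cost edit-distance alignment over syllable strings as a stand-in, is shown below. The function name `align` and the example syllables are illustrative assumptions.

```python
def align(expected, observed):
    """Align two syllable sequences with unit-cost edit distance.

    Returns a list of (operation, expected_syllable, observed_syllable) tuples,
    where operation is "match", "substitution", "deletion", or "insertion".
    """
    n, m = len(expected), len(observed)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (expected[i - 1] != observed[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (expected[i - 1] != observed[j - 1]):
            kind = "match" if expected[i - 1] == observed[j - 1] else "substitution"
            ops.append((kind, expected[i - 1], observed[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("deletion", expected[i - 1], None))
            i -= 1
        else:
            ops.append(("insertion", None, observed[j - 1]))
            j -= 1
    ops.reverse()
    return ops


# Hypothetical example: the unstressed middle syllable is dropped in the realisation.
for op in align(["ta", "vo", "lo"], ["ta", "lo"]):
    print(op)
```

Counting the non-match operations per syllable position or stress condition would then give the kind of variation tallies the abstract refers to, under the assumption that syllables are compared as whole units.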