Catalogue Search | MBRL

Literary and Colloquial Tamil Dialect Identification

by Nagarajan, T , Vijayalakshmi, P , Nanmalar, M in Accuracy , Acknowledgment , Artificial neural networks

2022

Culture and language evolve together. With respect to Tamil, the form of the language that people use nowadays has come a long way from its origins. These days, the old literary form of Tamil is commonly used for writing and contemporary colloquial Tamil is used for speaking. Human–computer interaction applications require colloquial Tamil (CT) to be accessible and easy for the everyday user, and literary Tamil (LT) when information is needed in a formal written format. Continuing the use of LT alongside CT in computer-aided language learning applications will both preserve LT and provide ease of use via CT. Hence, there is a need for conversion between the LT and CT dialects, which demands dialect identification as a first step. Dialect identification (DID) of LT and CT is an unexplored area of research. There is considerable research potential in this area because (i) LT is standardized while CT is not fully standardized, and (ii) the two have only subtle differences. Five methods are explored in our work, which grew out of preliminary work using a Gaussian mixture model (GMM) for dialect identification of LT and CT; that work provided the motivation, with an identification accuracy of 87%. In the current work, keeping the nuances of both dialects in mind, the following are explored: one other implicit method, the convolutional neural network (CNN); two explicit methods, Parallel Phone Recognition (PPR) and Parallel Large Vocabulary Continuous Speech Recognition (P-LVCSR); and two versions of the proposed explicit unified phone recognition method (UPR-1 and UPR-2). These methods vary in the need for annotated data, the size of the unit, the way in which modelling is carried out, and the way in which the final decision is made. Even though the average duration of the test utterances is short (4.9 s for LT and 2.5 s for CT), the systems performed well, offering the following identification accuracies: 87.72% (GMM), 93.97% (CNN), 89.24% (PPR), 94.21% (P-LVCSR), 88.57% (UPR-1), 93.53% (UPR-1 with P-LVCSR), 94.55% (UPR-2) and 95.61% (UPR-2 with P-LVCSR).

Journal Article

Literary and Colloquial Dialect Identification for Tamil using Acoustic Features

by Nagarajan, T , Vijayalakshmi, P , Nanmalar, M in Automatic speech recognition , Dialects , Error analysis

2024

The evolution and diversity of a language is evident from its various dialects. If these dialects are not addressed in technological advancements like automatic speech recognition and speech synthesis, there is a chance that they may disappear. Speech technology plays a role in preserving the various dialects of a language from going extinct. In order to build a full-fledged automatic speech recognition system that addresses various dialects, an Automatic Dialect Identification (ADI) system acting as the front end is required. This is similar to how language identification systems act as front ends to automatic speech recognition systems that handle multiple languages. The current work proposes a way to identify two popular and broadly classified Tamil dialects, namely literary and colloquial Tamil. Acoustic characteristics rather than phonetics and phonotactics are used, alleviating the requirement for language-dependent linguistic tools. One major advantage of the proposed method is therefore that it does not require an annotated corpus, so it can be easily adapted to other languages. Gaussian Mixture Models (GMM) using Mel Frequency Cepstral Coefficient (MFCC) features are used to perform the classification task. The experiments yielded an error rate of 12%. Vowel nasalization is discussed as the reason for this good performance. The number of mixture components in the GMM is varied and the performance is analysed.
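The GMM-based decision described in this abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per dialect and a decision by highest average per-frame log-likelihood; this is a common arrangement for GMM classifiers, not a detail confirmed by the abstract. The function names, the two-dimensional toy "MFCC" frames, and all model parameters are hypothetical and chosen only for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def utterance_score(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    return sum(gmm_frame_loglik(f, gmm) for f in frames) / len(frames)

def identify_dialect(frames, models):
    """Return the label whose GMM gives the highest average log-likelihood."""
    return max(models, key=lambda label: utterance_score(frames, models[label]))

# Toy models with made-up parameters (assumption: real models would be
# trained on MFCC frames, with far more dimensions and components).
toy_models = {
    "LT": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "CT": [(0.5, [4.0, 4.0], [1.0, 1.0]), (0.5, [5.0, 5.0], [1.0, 1.0])],
}
toy_utterance = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
print(identify_dialect(toy_utterance, toy_models))
```

Because only acoustic frames and per-dialect models are involved, nothing in this sketch depends on transcriptions, which matches the abstract's point that no annotated corpus is needed.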

Paper

Literary and Colloquial Tamil Dialect Identification

by Nagarajan, T , Vijayalakshmi, P , Nanmalar, M in Artificial neural networks , Dialects , Implicit methods

2024

Culture and language evolve together. The old literary form of Tamil is commonly used for writing and contemporary colloquial Tamil is used for speaking. Human-computer interaction applications require Colloquial Tamil (CT) to be accessible and easy for the everyday user, and Literary Tamil (LT) when information is needed in a formal written format. Continuing the use of LT alongside CT in computer-aided language learning applications will both preserve LT and provide ease of use via CT. Hence, there is a need for conversion between the LT and CT dialects, which demands dialect identification as a first step. Dialect Identification (DID) of LT and CT is an unexplored area of research. In the current work, keeping the nuances of both dialects in mind, five methods are explored, which include two implicit methods, Gaussian Mixture Model (GMM) and Convolutional Neural Network (CNN); two explicit methods, Parallel Phone Recognition (PPR) and Parallel Large Vocabulary Continuous Speech Recognition (P-LVCSR); and two versions of the proposed explicit Unified Phone Recognition method (UPR-1 and UPR-2). These methods vary in the need for annotated data, the size of the unit, the way in which modelling is carried out, and the way in which the final decision is made. Even though the average duration of the test utterances is short (4.9 s for LT and 2.5 s for CT), the systems performed well, offering the following identification accuracies: 87.72% (GMM), 93.97% (CNN), 89.24% (PPR), 94.21% (P-LVCSR), 88.57% (UPR-1), 93.53% (UPR-1 with P-LVCSR), 94.55% (UPR-2), and 95.61% (UPR-2 with P-LVCSR).

Paper

A Feature Engineering Approach for Literary and Colloquial Tamil Speech Classification using 1D-CNN

by Vijayalakshmi, P , Nagarajan, T , Nanmalar, M in Ablation , Artificial neural networks , Linguistics

2024

In ideal human-computer interaction (HCI), the colloquial form of a language would be preferred by most users, since it is the form used in their day-to-day conversations. However, there is also an undeniable necessity to preserve the formal literary form. By embracing the new and preserving the old, both service to the common man (practicality) and service to the language itself (conservation) can be rendered. Hence, it is ideal for computers to be able to accept, process, and converse in both forms of the language, as required. To address this, it is first necessary to identify the form of the input speech, which in the current work means distinguishing literary from colloquial Tamil speech. Such a front-end system must consist of a simple, effective, and lightweight classifier trained on a few effective features that can capture the underlying patterns of the speech signal. To accomplish this, a one-dimensional convolutional neural network (1D-CNN) that learns the envelope of features across time is proposed. The network is trained initially on a select number of handcrafted features, and then on Mel frequency cepstral coefficients (MFCC) for comparison. The handcrafted features were selected to address various aspects of speech such as spectral and temporal characteristics, prosody, and voice quality. The features are first analyzed by considering ten parallel utterances and observing the trend of each feature with respect to time. The proposed 1D-CNN trained on the handcrafted features offers an F1 score of 0.9803, while the one trained on the MFCC offers an F1 score of 0.9895. In light of this, feature ablation and feature combination are explored. When the best-ranked handcrafted features from the feature ablation study are combined with the MFCC, they offer the best results, with an F1 score of 0.9946.
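As a rough illustration of what a one-dimensional convolution over time does to a feature trajectory, the sketch below slides a single fixed kernel along a per-frame feature track, applies a ReLU, and pools the result. It is not the network from the paper: the feature values, the kernel, and the function names are hypothetical, and a trained 1D-CNN would learn many kernels rather than use a hand-picked one. As in most deep learning frameworks, the "convolution" here is computed without flipping the kernel.

```python
def conv1d_valid(series, kernel):
    """Slide the kernel along a time series (no padding) and return the responses."""
    k = len(kernel)
    return [
        sum(series[t + i] * kernel[i] for i in range(k))
        for t in range(len(series) - k + 1)
    ]

def relu(values):
    """Keep positive responses and zero out the rest."""
    return [max(0.0, v) for v in values]

def global_max_pool(values):
    """Summarise a response sequence by its strongest activation."""
    return max(values)

# Hypothetical per-frame trajectory of one feature (made-up values).
feature_track = [1.0, 1.0, 2.0, 4.0, 4.0, 3.0, 1.0]
# Hand-picked kernel that responds to a rise over three frames (assumption).
rise_detector = [-1.0, 0.0, 1.0]

responses = relu(conv1d_valid(feature_track, rise_detector))
summary = global_max_pool(responses)
print(responses, summary)
```

The kernel responds most strongly where the trajectory rises fastest, which gives a small-scale picture of how such a layer can pick up the shape of a feature across time.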

Paper
