Catalogue Search | MBRL
49 result(s) for "Bengali language Texts"
Study of automatic text summarization approaches in different languages
2021
Nowadays a huge amount of information is available from both online and offline sources. For a single topic, hundreds of articles are available, containing a vast amount of information about it. Manually extracting the useful information from them is a difficult task. To solve this problem, automatic text summarization systems have been developed. Text summarization is the process of extracting useful information from large documents and compressing it into a short summary that preserves all important content. This survey paper gives a broad overview of the work done in the field of automatic text summarization in different languages using various text summarization approaches. The focus of this survey is the research done on text summarization for Indian languages such as Hindi, Punjabi, Bengali, Malayalam, Kannada, Tamil, Marathi, Assamese, Konkani, Nepali, Odia, Sanskrit, Sindhi, Telugu and Gujarati, and for foreign languages such as Arabic, Chinese, Greek, Persian, Turkish, Spanish, Czech, Rome, Urdu, Bahasa Indonesia and many more. The paper supports beginning researchers in this area by giving a concise view of the feature extraction methods and classification techniques required for the different text summarization approaches applied to both Indian and non-Indian languages.
Journal Article
A study on the challenges and opportunities of speech recognition for Bengali language
2022
Speech recognition is a fascinating process that offers the opportunity to interact with and command machines in the field of human-computer interaction. A speech recognition system is language-dependent, constructed directly from the linguistic and textual properties of a language. Automatic speech recognition (ASR) systems are currently being used to translate speech to text flawlessly. Although ASR systems are widely deployed for international languages, their implementation for the Bengali language has not reached an acceptable state. In this research work, we carefully survey the current status of research on Bengali ASR systems. We then present the challenges most often encountered while constructing a Bengali ASR system. We split these into language-dependent and language-independent challenges and suggest how each complication may be overcome. Following a rigorous investigation and highlighting of the challenges, we conclude that Bengali ASR systems require ASR architectures built specifically around the grammatical and phonetic structure of the Bengali language.
Journal Article
A comprehensive review of Bengali word sense disambiguation
by Shaikh Soharab Hossain; Pal, Rajat Kumar; Das, Dawn Debapratim
in Algorithms, Ambiguity, Bengali
2020
The entities of communication have an enormous impact on interaction. Textual data is an important attribute of communication. Linguistic researchers analyze this data from various perspectives. Such analysis helps to understand people's perceptions by resolving contextual data into its various senses. The sense of a polysemous word varies according to its context. The process of identifying the proper meaning of a polysemous word with respect to its context is known as word sense disambiguation (WSD). For the extraction of actual meaning, WSD is an essential technique in Natural Language Processing. Over the last two decades, many algorithms have been proposed to solve this linguistic ambiguity problem in various languages. In addition, a number of review papers have been published for several of the most widely spoken languages. Even so, there remains a gap in the literature when it comes to the techniques of Bengali WSD. This paper presents an extensive survey of approaches to Bengali WSD. It also surveys the existing datasets for Bengali WSD.
Journal Article
Hate speech detection in low-resourced Indian languages: An analysis of transformer-based monolingual and multilingual models with cross-lingual experiments
2025
Warning: This paper is based on hate speech detection and may contain examples of abusive/offensive phrases. Cyberbullying, online harassment, etc., via offensive comments are pervasive across social media platforms such as Twitter, Facebook, and YouTube. Hateful comments must be detected and eradicated to prevent harassment and violence on social media. In the Natural Language Processing (NLP) domain, the most prevalent task is comment classification, which is challenging, and transformer-based language models are at the forefront of this advancement. This paper analyzes the performance of transformer-based language models such as BERT, ALBERT, RoBERTa, and DistilBERT on Indian hate speech datasets for binary classification. Here, we utilize the existing datasets, i.e., HASOC (Hindi and Marathi) and HS-Bangla. We evaluate several multilingual language models such as MuRIL-BERT and XLM-RoBERTa, and a few monolingual language models such as RoBERTa-Hindi, Maha-BERT (Marathi), Bangla-BERT (Bangla), and Assamese-BERT (Assamese), and also perform cross-lingual experiments. For further analysis, we also perform multilingual, monolingual, and cross-lingual experiments on our Hate Speech Assamese (HS-Assamese) (Indo-Aryan language family) and Hate Speech Bodo (HS-Bodo) (Sino-Tibetan language family) datasets (HS dataset version 2) and achieve promising results. The motivation of the cross-lingual experiment is to encourage researchers to learn about the power of the transformer. Note that no pre-trained language models are currently available for Bodo or any other Sino-Tibetan languages.
Journal Article
Nirmāṇa kāryē byabahr̥ta Bideśī Śramikadēra janya surakṣā nirdēśābalī = oegug-in geunlojaleul wihan anjeon jag-eob giljab-i (geonseol-eob) = safety guide for foreign construction laborers
in Industrial safety Korea (South), Construction industry Safety measures, Industrial hygiene Korea (South)
2008
Optimizing BERT for Bengali Emotion Classification: Evaluating Knowledge Distillation, Pruning, and Quantization
by Rahman, Rashedur M.; Rahman, Md Hasibur; Ria, Zinnat Fowzia
in Classification, Compression ratio, Digital data
2025
The rapid growth of digital data necessitates advanced natural language processing (NLP) models like BERT (Bidirectional Encoder Representations from Transformers), known for its superior performance in text classification. However, BERT's size and computational demands limit its practicality, especially in resource-constrained settings. This research compresses the BERT base model for Bengali emotion classification through knowledge distillation (KD), pruning, and quantization techniques. Despite Bengali being the sixth most spoken language globally, NLP research in this area is limited. Our approach addresses this gap by creating an efficient BERT-based model for Bengali text. We explored 20 combinations of KD, quantization, and pruning, resulting in improved speedup, fewer parameters, and reduced memory size. Our best results demonstrate significant improvements in both speed and efficiency. For instance, in the case of mBERT, we achieved a 3.87× speedup and a 4× compression ratio with a combination of Distil + Prune + Quant that reduced parameters from 178 M to 46 M, while the memory size decreased from 711 MB to 178 MB. These results offer scalable solutions for NLP tasks in various languages and advance the field of model compression, making these models suitable for real-world applications in resource-limited environments.
Journal Article