Detection of Cyberbullying Patterns in Low Resource Colloquial Roman Urdu Microtext using Natural Language Processing, Machine Learning, and Ensemble Techniques
by Alshahrani, Hani; Memon, Mohsin Ali; Dewani, Amirita; Bhatti, Sania; Sulaiman, Adel; Alghamdi, Abdullah; Shaikh, Asadullah; Hamdi, Mohammed
in Artificial intelligence / Automation / Big Data / Classification / Communication / Computational linguistics / Content analysis / Coronaviruses / COVID-19 / Cyberbullying / cyberbullying detection / Data analysis / Data mining / ensemble learning / Hate speech / Language / Language processing / low resource Roman Urdu language / Machine learning / Medical research / Methods / Multilingualism / Natural language interfaces / Natural language processing / Pandemics / Performance evaluation / Social media / Social networks / Urdu language
2023
Journal Article
Overview
Social media platforms have become a foundation for people across the globe to voice their opinions and ideas. Because of anonymity and freedom of expression, individuals and groups can be humiliated online in disregard of social etiquette, which has multiplied and diversified incidents of cyberbullying and cyber hate speech. This problem has recently attracted the attention of researchers and scholars worldwide. Still, current practices for sifting online content and offsetting the spread of hatred do not go far enough. Contributing factors include the recent prevalence of regional languages on social media and the dearth of language resources and of flexible detection approaches, specifically for low-resource languages. Most existing studies are oriented towards traditional resource-rich languages, leaving a large gap for recently embraced resource-poor languages. One such language, adopted worldwide and most typically by South Asian users for textual communication on social networks, is Roman Urdu. It is derived from Urdu and written left to right in Roman script. The language poses numerous computational challenges for natural language preprocessing tasks because of its inflections, derivations, lexical variations, and morphological richness. To address this problem, this research proposes a cyberbullying detection approach for analyzing textual data in Roman Urdu, based on advanced preprocessing methods, voting-based ensemble techniques, and machine learning algorithms. The study extracted a large number of features, including statistical features, word n-grams, combined n-grams, and a bag-of-words (BoW) model with TF-IDF weighting, in different experimental settings using GridSearchCV and cross-validation.
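The combined n-gram and TF-IDF features mentioned above can be illustrated with a minimal standard-library sketch. It is not the study's implementation: the function names `word_ngrams` and `tfidf`, the whitespace tokenizer, the plain `tf * log(N / df)` weighting, and the toy strings are all assumptions made for illustration. Libraries such as scikit-learn use smoothed variants of the formula.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Return all word n-grams of length n_min..n_max from lowercased text."""
    tokens = text.lower().split()  # assumed tokenizer: split on whitespace
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_min=1, n_max=2):
    """Weight each n-gram per document as tf * log(N / df) (unsmoothed)."""
    counts = [Counter(word_ngrams(d, n_min, n_max)) for d in docs]
    n_docs = len(docs)
    df = Counter()  # number of documents containing each n-gram
    for c in counts:
        df.update(c.keys())
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append(
            {g: (k / total) * math.log(n_docs / df[g]) for g, k in c.items()}
        )
    return weights


# Toy strings invented for illustration; they are not from the study's dataset.
docs = ["tum ache ho", "tum bure ho"]
for doc, w in zip(docs, tfidf(docs)):
    print(doc, w)
```

N-grams that occur in every document (here `tum` and `ho`) receive a weight of zero, while n-grams unique to one document carry the signal a classifier would use.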
The detection approach is designed to handle users’ textual input by accounting for user-specific writing styles on social media, which are colloquial and non-standard. The experimental results show that SVM with embedded hybrid n-gram features produced the highest average accuracy, around 83%. Among the ensemble voting-based techniques, XGBoost achieved the best accuracy, 79%. Both implicit and explicit Roman Urdu instances were evaluated, and severity was categorized based on prediction probabilities. Time complexity was also analyzed in terms of execution time, indicating that logistic regression (LR), across different parameters and feature combinations, is the fastest algorithm. The results are promising with respect to standard assessment metrics and indicate the feasibility of the proposed approach for cyberbullying detection in Roman Urdu.
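As a rough illustration of voting and probability-based severity categorization, the sketch below averages per-classifier probabilities and maps the result to a label. The abstract does not state the voting scheme, the thresholds, or the severity labels, so soft voting, the cut-offs `low=0.5` and `high=0.8`, and the labels `none`, `mild`, and `severe` are all hypothetical.

```python
def soft_vote(probabilities):
    """Average the bullying-class probabilities from several classifiers."""
    if not probabilities:
        raise ValueError("need at least one probability")
    return sum(probabilities) / len(probabilities)


def severity(p_bullying, low=0.5, high=0.8):
    """Map a probability to a severity label (hypothetical thresholds)."""
    if p_bullying < low:
        return "none"
    if p_bullying < high:
        return "mild"
    return "severe"


# Hypothetical outputs of three classifiers for a single message.
p = soft_vote([0.9, 0.6, 0.6])
print(severity(p))
```

Averaging probabilities keeps each classifier's confidence in play; a hard majority vote over class labels would discard it and leave nothing to grade severity on.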
Publisher
MDPI AG