Catalogue Search | MBRL

Social media bot detection with deep learning methods: a systematic review

by Mathew, Sujith Samuel; Masud, Mohammad Mehedy; Hayawi, Kadhim in Artificial Intelligence; Computational Biology/Bioinformatics; Computational Science and Engineering

2023

Social bots are automated social media accounts governed by software and controlled by humans at the backend. Some bots serve good purposes, such as automatically posting news or even providing help during emergencies. Nevertheless, bots have also been used for malicious purposes, such as posting fake news, spreading rumours, or manipulating political campaigns. Existing mechanisms allow malicious bots to be detected and removed automatically. However, the bot landscape changes as bot creators use more sophisticated methods to avoid detection. Therefore, new mechanisms for discerning between legitimate and bot accounts are much needed. Over the past few years, a few review studies have contributed to social media bot detection research by presenting comprehensive surveys of various detection methods, including cutting-edge solutions such as machine learning (ML)/deep learning (DL) techniques. This paper, to the best of our knowledge, is the first to focus solely on DL techniques and to compare the motivation/effectiveness of these techniques among themselves and against other methods, especially the traditional ML ones. We present a refined taxonomy of the features used in DL studies and details of the associated pre-processing strategies required to produce suitable training data for a DL model. We summarize the gaps addressed by the review papers that mention DL/ML studies to provide future directions in this field. Overall, DL techniques turn out to be computation- and time-efficient techniques for social bot detection, with better or comparable performance relative to traditional ML techniques.

Journal Article

Detection and impact estimation of social bots in the Chilean Twitter network

by Mendoza, Marcelo; Providel, Eliana; Valenzuela, Sebastián in 639/705/117; 639/705/258; Bot detection

2024

The rise of bots that mimic human behavior represents one of the most pressing threats to healthy information environments on social media. Many bots are designed to increase the visibility of low-quality content, spread misinformation, and artificially boost the reach of brands and politicians. These bots can also disrupt civic action coordination, such as by flooding a hashtag with spam and undermining political mobilization. Social media platforms have recognized these malicious bots’ risks and implemented strict policies and protocols to block automated accounts. However, effective bot detection methods for Spanish are still in their early stages. Many studies and tools used for Spanish are based on English-language models and lack performance evaluations in Spanish. In response to this need, we have developed a method for detecting bots in Spanish called Botcheck. Botcheck was trained on a collection of Spanish-language accounts annotated in Twibot-20, a large-scale dataset featuring thousands of accounts annotated by humans in various languages. We evaluated Botcheck’s performance on a large set of labeled accounts and found that it outperforms other competitive methods, including deep learning-based methods. As a case study, we used Botcheck to analyze the 2021 Chilean Presidential elections and discovered evidence of bot account intervention during the electoral term. In addition, we conducted an external validation of the accounts detected by Botcheck in the case study and found our method to be highly effective. We have also observed differences in behavior among the bots that are following the social media accounts of official presidential candidates.

Journal Article

Creating a Bot-tleneck for malicious AI: Psychological methods for bot detection

by Oppenheimer, Daniel M.; Rodriguez, Christopher in Artificial Intelligence; Behavioral Science and Psychology; Cognitive Psychology

2024

The standard approach for detecting and preventing bots from doing harm online involves CAPTCHAs. However, recent AI research, including our own in this manuscript, suggests that bots can complete many common CAPTCHAs with ease. The most effective methodology for identifying potential bots involves completing image-processing, causal-reasoning based, free-response questions that are hand coded by human analysts. However, this approach is labor intensive, slow, and inefficient. Moreover, with the advent of Generative AI such as GPT and Bard, it may soon be obsolete. Here, we develop and test various automated, bot-screening questions, grounded in psychological research, to serve as a proactive screen against bots. Utilizing hand coded free-response questions in the naturalistic domain of MTurkers recruited for a Qualtrics survey, we identify 18.9% of our sample to be potential bots, whereas Google’s reCAPTCHA V3 identified only 1.7% to be potential bots. We then look at the performance of these potential bots on our novel bot-screeners, each of which has different strengths and weaknesses but all of which outperform CAPTCHAs.

Journal Article

DeeProBot: a hybrid deep neural network model for social bot detection based on user profile data

by Mathew, Sujith; Venugopal, Neethu; Hayawi, Kadhim in Accounts; Applications of Graph Theory and Complex Networks; Artificial neural networks

2022

Use of online social networks (OSNs) undoubtedly brings the world closer. OSNs like Twitter provide a space for expressing one’s opinions on a public platform. This great potential is misused by the creation of bot accounts, which spread fake news and manipulate opinions. Hence, distinguishing genuine human accounts from bot accounts has become a pressing issue for researchers. In this paper, we propose a framework based on deep learning to classify Twitter accounts as either ‘human’ or ‘bot.’ We use the information from user profile metadata of the Twitter account like description, follower count and tweet count. We name the framework ‘DeeProBot,’ which stands for Deep Profile-based Bot detection framework. The raw text from the description field of the Twitter account is also considered a feature for training the model by embedding the raw text using pre-trained Global Vectors (GloVe) for word representation. Using only the user profile-based features considerably reduces the feature engineering overhead compared with that of user timeline-based features like user tweets and retweets. DeeProBot handles mixed types of features including numerical, binary, and text data, making the model hybrid. The network is designed with long short-term memory (LSTM) units and dense layers to accept and process the mixed input types. The proposed model is evaluated on a collection of publicly available labeled datasets. We have designed the model to make it generalizable across different datasets. The model is evaluated in two ways: testing on a hold-out set of the same dataset; and training with one dataset and testing with a different dataset. With these experiments, the proposed model achieved AUC as high as 0.97 with a selected set of features.

Journal Article

Bot, or not? Comparing three methods for detecting social bots in five political discourses

by Samula, Paul; Klinger, Ulrike; Martini, Franziska in Accounts; Automation; Communication

2021

Social bots – partially or fully automated accounts on social media platforms – have not only been widely discussed, but have also entered political, media and research agendas. However, bot detection is not an exact science. Quantitative estimates of bot prevalence vary considerably and comparative research is rare. We show that findings on the prevalence and activity of bots on Twitter depend strongly on the methods used to identify automated accounts. We search for bots in political discourses on Twitter, using three different bot detection methods: Botometer, Tweetbotornot and “heavy automation”. We drew a sample of 122,884 unique user Twitter accounts that had produced 263,821 tweets contributing to five political discourses in five Western democracies. While all three bot detection methods classified accounts as bots in all our cases, the comparison shows that the three approaches produce very different results. We discuss why neither manual validation nor triangulation resolves the basic problems, and conclude that social scientists studying the influence of social bots on (political) communication and discourse dynamics should be careful with easy-to-use methods, and consider interdisciplinary research.

Journal Article
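
The abstract above reports that three detection methods flag very different sets of accounts. As a rough illustration of how such disagreement could be quantified, the sketch below computes the pairwise Jaccard overlap between the sets of accounts each method flags. This is not the authors' procedure: the choice of Jaccard overlap, the function names, and the account IDs are all assumptions made up for this example.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of accounts flagged by both methods among those flagged by either."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pairwise_agreement(flagged):
    """Map each pair of method names to the Jaccard overlap of their flagged sets."""
    return {
        (m1, m2): jaccard(flagged[m1], flagged[m2])
        for m1, m2 in combinations(sorted(flagged), 2)
    }


# Hypothetical account IDs; the method names come from the abstract,
# the flagged sets do not come from the study.
flagged = {
    "Botometer": {"u1", "u2", "u3"},
    "Tweetbotornot": {"u2", "u4"},
    "heavy automation": {"u3", "u5", "u6"},
}

for pair, score in pairwise_agreement(flagged).items():
    print(pair, round(score, 2))
```

A low overlap for every pair would be one way to express, in a single number per pair, that prevalence estimates depend on the detection method chosen.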

FedKG: A Knowledge Distillation-Based Federated Graph Method for Social Bot Detection

by Wang, Xiujuan; Zheng, Kangfeng; Wang, Keke in Algorithms; Classification; Data mining

2024

Malicious social bots pose a serious threat to social network security by spreading false information and guiding bad opinions in social networks. The singularity and scarcity of single organization data and the high cost of labeling social bots have given rise to the construction of federated models that combine federated learning with social bot detection. In this paper, we first combine the federated learning framework with the Relational Graph Convolutional Neural Network (RGCN) model to achieve federated social bot detection. A class-level cross entropy loss function is applied in the local model training to mitigate the effects of the class imbalance problem in local data. To address the data heterogeneity issue from multiple participants, we optimize the classical federated learning algorithm by applying knowledge distillation methods. Specifically, we adjust the client-side and server-side models separately: training a global generator to generate pseudo-samples based on the local data distribution knowledge to correct the optimization direction of client-side classification models, and integrating client-side classification models’ knowledge on the server side to guide the training of the global classification model. We conduct extensive experiments on widely used datasets, and the results demonstrate the effectiveness of our approach in social bot detection in heterogeneous data scenarios. Compared to baseline methods, our approach achieves a nearly 3–10% improvement in detection accuracy when the data heterogeneity is larger. Additionally, our method achieves the specified accuracy with minimal communication rounds.

Journal Article

CB-MTE: Social Bot Detection via Multi-Source Heterogeneous Feature Fusion

by Zhang, Chuang; Lei, Chao; Xiao, Yuzhi in Algorithms; Behavior; Behavior evolution

2025

Social bots increasingly mimic real users and collaborate in large-scale influence campaigns, distorting public perception and making their detection both critical and challenging. Traditional bot detection methods, constrained by single-source features, often fail to capture the complete behavioral and contextual characteristics of social bots, especially their dynamic behavioral evolution and group coordination tactics, resulting in feature incompleteness and reduced detection performance. To address this challenge, we propose CB-MTE, a social bot detection framework based on multi-source heterogeneous feature fusion. CB-MTE adopts a hierarchical architecture: user metadata is used to construct behavioral portraits, deep semantic representations are extracted from textual content via DistilBERT, and community-aware graph embeddings are learned through a combination of random walk and Skip-gram modeling. To mitigate feature redundancy and preserve structural consistency, manifold learning is applied for nonlinear dimensionality reduction, ensuring both local and global topology are maintained. Finally, a CatBoost-based collaborative reasoning mechanism enhances model robustness through ordered target encoding and symmetric tree structures. Experiments on the TwiBot-22 benchmark dataset demonstrate that CB-MTE significantly outperforms mainstream detection models in recognizing dynamic behavioral traits and detecting collaborative bot activities. These results confirm the framework’s capability to capture the complete behavioral and contextual characteristics of social bots through multi-source feature integration.

Journal Article

Machine learning-based social media bot detection: a comprehensive literature review

by Aljabri, Malak; Shaahid, Afrah; Zagrouba, Rachid in Activities of daily living; Agriculture; Algorithms

2023

In today’s digitalized era, Online Social Networking platforms are growing to be a vital aspect of each individual’s daily life. The availability of the vast amount of information and their open nature attracts the interest of cybercriminals to create malicious bots. Malicious bots in these platforms are automated or semi-automated entities used in nefarious ways while simulating human behavior. Moreover, such bots pose serious cyber threats and security concerns to society and public opinion. They are used to exploit vulnerabilities for illicit benefits such as spamming, fake profiles, spreading inappropriate/false content, click farming, hashtag hijacking, and much more. Cybercriminals and researchers are always engaged in an arms race as new and updated bots are created to thwart ever-evolving detection technologies. This literature review attempts to compile and compare the most recent advancements in Machine Learning-based techniques for the detection and classification of bots on five primary social media platforms namely Facebook, Instagram, LinkedIn, Twitter, and Weibo. We bring forth a concise overview of all the supervised, semi-supervised, and unsupervised methods, along with the details of the datasets provided by the researchers. Additionally, we provide a thorough breakdown of the extracted feature categories. Furthermore, this study also showcases a brief rundown of the challenges and opportunities encountered in this field, along with prospective research directions and promising angles to explore.

Journal Article

(Un)Trendy Japan: Twitter bots and the 2017 Japanese general election

by Vancel, Róbert; Mintal, Jozef Michal in Social networks

2019

Social networking services (SNSs) can significantly impact public life during important political events. Thus, it comes as no surprise that different political actors try to exploit these online platforms for their benefit. Bots constitute a popular tool on SNSs that appears to be able to shape public opinion and disrupt political processes. However, the role of bots during political events in a non-Western context remains largely under-studied. This article addresses the question of the involvement of Twitter bots during electoral campaigns in Japan. In our study, we collected Twitter data over a fourteen-day period in October 2017 using a set of hashtags related to the 2017 Japanese general election. Our dataset includes 905,215 tweets, 665,400 of which were unique tweets. Using a supervised machine learning approach, we first built a custom ensemble classification model for bot detection based on user profile features, with an area under curve (AUC) for the test set of 0.998. Second, in applying our model, we estimate that the impact of Twitter bots in Japan was minor overall. In comparison with similar studies conducted during elections in the US and the UK, the deployment of Twitter bots involved in the 2017 Japanese general election seems to be significantly lower. Finally, given our results on the level of bots on Twitter during the 2017 Japanese general election, we provide various possible explanations for their underuse within a broader socio-political context.

Journal Article

Botnet detection using graph-based feature clustering

by Akula, Ravi; Bian, Linkan; Zhang, Fangyan in Big Data; Bot detection; Classification

2017

Detecting botnets in a network is crucial because bots impact numerous areas such as cyber security, finance, health care, law enforcement, and more. Botnets are becoming more sophisticated and dangerous day-by-day, and most of the existing rule based and flow based detection methods may not be capable of detecting bot activities in an efficient and effective manner. Hence, designing a robust and fast botnet detection method is of high significance. In this study, we propose a novel botnet detection methodology based on topological features of nodes within a graph: in degree, out degree, in degree weight, out degree weight, clustering coefficient, node betweenness, and eigenvector centrality. A self-organizing map clustering method is applied to establish clusters of nodes in the network based on these features. Our method is capable of isolating bots in clusters of small sizes while containing the majority of normal nodes in the same big cluster. Thus, bots can be detected by searching a limited number of nodes. A filtering procedure is also developed to further enhance the algorithm efficiency by removing inactive nodes from consideration. The methodology is verified using the CTU-13 datasets, and benchmarked against a classification-based detection method. The results show that our proposed method can efficiently detect the bots despite their varying behaviors.

Journal Article
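
The abstract above lists the per-node topological features used for clustering. A minimal sketch of how five of them (in degree, out degree, in degree weight, out degree weight, clustering coefficient) might be computed from a weighted, directed edge list is given below. It is an illustration under assumptions, not the authors' implementation: the function name, the edge format, and the choice to compute the clustering coefficient on the undirected version of the graph are all made up for this example. Node betweenness and eigenvector centrality are omitted for brevity.

```python
from collections import defaultdict
from itertools import combinations


def node_features(edges):
    """Compute simple topological features per node.

    edges: iterable of (src, dst, weight) tuples describing a directed graph.
    Repeated edges are counted once per occurrence in the degree totals.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    in_weight = defaultdict(float)
    out_weight = defaultdict(float)
    neighbours = defaultdict(set)  # undirected neighbourhood, self-loops ignored
    nodes = set()

    for src, dst, weight in edges:
        nodes.update((src, dst))
        out_deg[src] += 1
        in_deg[dst] += 1
        out_weight[src] += weight
        in_weight[dst] += weight
        if src != dst:
            neighbours[src].add(dst)
            neighbours[dst].add(src)

    features = {}
    for node in nodes:
        nbrs = neighbours[node]
        k = len(nbrs)
        # Count links that exist between pairs of this node's neighbours.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in neighbours[a])
        clustering = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        features[node] = {
            "in_degree": in_deg[node],
            "out_degree": out_deg[node],
            "in_weight": in_weight[node],
            "out_weight": out_weight[node],
            "clustering": clustering,
        }
    return features


# Hypothetical toy graph; weights could stand for packet or flow counts.
example_edges = [("a", "b", 2.0), ("b", "c", 1.0), ("c", "a", 3.0), ("a", "d", 1.0)]
for name, feats in sorted(node_features(example_edges).items()):
    print(name, feats)
```

The resulting feature vectors could then be passed to a clustering method such as the self-organizing map mentioned in the abstract.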
