Catalogue Search | MBRL

Deep learning for misinformation detection on online social networks: a survey and new perspectives

by Liu, Shaowu , Islam, Md Rafiqul , Wang, Xianzhi in Applications of Graph Theory and Complex Networks , Automation , Celebrities

2020

Recently, the use of social networks such as Facebook, Twitter, and Sina Weibo has become an inseparable part of our daily lives. It is considered as a convenient platform for users to share personal messages, pictures, and videos. However, while people enjoy social networks, many deceptive activities such as fake news or rumors can mislead users into believing misinformation. Besides, spreading the massive amount of misinformation in social networks has become a global risk. Therefore, misinformation detection (MID) in social networks has gained a great deal of attention and is considered an emerging area of research interest. We find that several studies related to MID have been studied to new research problems and techniques. While important, however, the automated detection of misinformation is difficult to accomplish as it requires the advanced model to understand how related or unrelated the reported information is when compared to real information. The existing studies have mainly focused on three broad categories of misinformation: false information, fake news, and rumor detection. Therefore, related to the previous issues, we present a comprehensive survey of automated misinformation detection on (i) false information, (ii) rumors, (iii) spam, (iv) fake news, and (v) disinformation. We provide a state-of-the-art review on MID where deep learning (DL) is used to automatically process data and create patterns to make decisions not only to extract global features but also to achieve better results. We further show that DL is an effective and scalable technique for the state-of-the-art MID. Finally, we suggest several open issues that currently limit real-world implementation and point to future directions along this dimension.

Journal Article

Development and validation of a tool for detecting misinformation risk in diet, nutrition, and health content (Diet-MisRAT)

by Reiss, Michael J , Kalea, Anastasia Z , Ruani, Alex in 631/477/2811 , 692/308/409 , 692/499

2026

Misinformation in diet and nutrition is recognised as a major public health threat, with the potential to misguide dietary choices and contribute to preventable harm. To address this, we developed a Misinformation Risk Assessment Model (MisRAM), grounded in the World Health Organization’s hazard risk assessment principles. MisRAM conceptualises misleading content traits as stratifiable agents of informational adverse effects, weighed by their severity and likelihood of increasing recipient susceptibility. Building on this model, we designed the Diet-Nutrition Misinformation Risk Assessment Tool (Diet-MisRAT), a structured instrument that evaluates medium-to-long form content across four risk dimensions (inaccuracy, incompleteness, deceptiveness, health harm), yielding five-tier risk estimates from very low to very high. Validation involved five rounds: expert reviewers, trainee dietitians, postgraduate nutrition students, highly experienced nutrition professionals, and zero-shot prompt-based generative-AI risk detection. Results showed strong to very strong alignment with expert-derived benchmarks, supporting the tool’s interpretability and concurrent validity. ChatGPT demonstrated high test–retest reliability, accuracy, precision, sensitivity, and F1 scores under blinded untuned conditions, suggesting that adequately constructed, expert-designed prompting tools may help overcome training-dataset limitations. Diet-MisRAT offers a scalable, graded alternative to binary detection. Domain-calibrated risk stratification could guide proportionate interventions in content oversight, regulation, education, misinformation inoculation, and infodemic mitigation.

Journal Article

MisRoBÆRTa: Transformers versus Misinformation

by Truică, Ciprian-Octavian , Apostol, Elena-Simona in Ablation , Accuracy , benchmark analysis

2022

Misinformation is considered a threat to our democratic values and principles. The spread of such content on social media polarizes society and undermines public discourse by distorting public perceptions and generating social unrest while lacking the rigor of traditional journalism. Transformers and transfer learning proved to be state-of-the-art methods for multiple well-known natural language processing tasks. In this paper, we propose MisRoBÆRTa, a novel transformer-based deep neural ensemble architecture for misinformation detection. MisRoBÆRTa takes advantage of two state-of-the-art transformers, i.e., BART and RoBERTa, to improve the performance of discriminating between real news and different types of fake news. We also benchmarked and evaluated the performances of multiple transformers on the task of misinformation detection. For training and testing, we used a large real-world news articles dataset (i.e., 100,000 records) labeled with 10 classes, thus addressing two shortcomings in the current research: (1) increasing the size of the dataset from small to large, and (2) moving the focus of fake news detection from binary classification to multi-class classification. For this dataset, we manually verified the content of the news articles to ensure that they were correctly labeled. The experimental results show that the accuracy of transformers on the misinformation detection problem was significantly influenced by the method employed to learn the context, dataset size, and vocabulary dimension. We observe empirically that the best accuracy performance among the classification models that use only one transformer is obtained by BART, while DistilRoBERTa obtains the best accuracy in the least amount of time required for fine-tuning and training. However, the proposed MisRoBÆRTa outperforms the other transformer models in the task of misinformation detection. To arrive at this conclusion, we performed ample ablation and sensitivity testing with MisRoBÆRTa on two datasets.

Journal Article

Identifying multimodal misinformation leveraging novelty detection and emotion recognition

in Emotion recognition , Emotions , False information

2023

With the growing presence of multimodal content on the web, a specific category of fake news is rampant on popular social media outlets. In this category of fake online information, real multimedia contents (images, videos) are used in different but related contexts with manipulated texts to mislead the readers. The presence of seemingly non-manipulated multimedia content reinforces the belief in the associated fabricated textual content. Detecting this category of misleading multimedia fake news is almost impossible without reference to any prior knowledge. In addition to this, the presence of highly novel and emotion-invoking contents can fuel the rapid dissemination of such fake news. To counter this problem, in this paper, we first introduce a novel multimodal fake news dataset that includes background knowledge (from authentic sources) of the misleading articles. Second, we design a multimodal framework using Supervised Contrastive Learning (SCL) based novelty detection and Emotion Prediction tasks for fake news detection. We perform extensive experiments to reveal that our proposed model outperforms the state-of-the-art (SOTA) models.

Journal Article

Machine Learning-Based Classification of Multi-modal Fact-Checked Misinformation on Social Networks

by Syed, Javeriya Naaz I. , Keole, Ranjit R. in fact-checked data , machine learning , misinformation detection

2025

The rise of misinformation on social networks creates serious problems for public awareness, policy-making, and trust in society. Social media content is getting more complex, often including text, metadata, and multimedia. This makes it essential to have smart systems that can classify misinformation using various signals. This paper introduces a machine learning approach to misinformation checking that uses the MuMiN (Multilingual Multimodal Fact-Checked Misinformation) dataset. This dataset contains annotated claims, supporting evidence, user tweets, and fact-check labels. A structured preprocessing pipeline was applied to prepare the dataset for analysis, and textual and structural features were extracted. Three machine learning models, Random Forest (RF), Gradient Boosting (GB), and a Stacking Classifier, were developed and assessed. These models were evaluated using key performance metrics. The experimental findings indicate that the stacking ensemble regularly surpasses the individual base classifiers, attaining an accuracy of 89.12%. This highlights the advantages of combining models to manage complex, noisy, and multimodal social media data. This study emphasizes the value of merging multimodal feature representations with ensemble learning methods for effective and scalable misinformation detection on online platforms.

Journal Article

Multi-agent systems and credibility-based advanced scoring mechanism in fact-checking

by Dong, Yihan , Ito, Takayuki in 639/705/117 , 639/705/258 , Chatbots

2026

Fact-checking is crucial as rumours and misinformation negatively impact social networking services (SNS) and online discussions, often leading to the spread of misinformation. Meanwhile, fact-checking with large language models (LLMs) is becoming increasingly popular with the increase in the performance of LLMs. However, the previous works have issues, including overconfidence in the judgment results of LLM and the insufficiency of binary fact-checking due to the text’s complexity. On the other hand, using multiple information sources to make judgments reveals another obstacle: the lack of proper scoring mechanisms. Thus, we propose a framework called multi-agent fact-checking (MAFC), which includes multiple agents with unique information sources to measure the text’s credibility. Specifically, a brand-new scoring mechanism is also used to calculate credibility according to each agent’s judgment results and confidence. We tested our proposed method through several comparative experiments. The results of the experiments prove that the proposed method performs better than other baselines in both the binary fact-checking task and the multi-label fact-checking task. Finally, the challenges and obstacles existing in fact-checking fields, such as the definition standards and dataset creation, are discussed.

Journal Article

Exploiting sparsity and statistical dependence in multivariate data fusion: an application to misinformation detection for high-impact events

by Japkowicz, Nathalie , Rexhepi, Egzona , Whitehouse, Ian in Algorithms , Artificial Intelligence , Computer Science

2024

With the evolution of social media, cyberspace has become the de-facto medium for users to communicate during high-impact events such as natural disasters, terrorist attacks, and periods of political unrest. However, during such high-impact events, misinformation can spread rapidly on social media, affecting decision-making and creating social unrest. Identifying the spread of misinformation during high-impact events is a significant data challenge, given the multi-modal data associated with social media posts. Advances in multi-modal learning have shown promise for detecting misinformation; however, key limitations still make this a significant challenge. These limitations include the explicit and efficient modeling of the underlying non-linear associations of multi-modal data geared at misinformation detection. This paper presents a novel avenue of work that demonstrates how to frame the problem of misinformation detection in social media using multi-modal latent variable modeling and presents two novel algorithms capable of modeling the underlying associations of multi-modal data. We demonstrate the effectiveness of the proposed algorithms using simulated data and study their performance in the context of misinformation detection using a popular multi-modal dataset that consists of tweets published during several high-impact events.

Journal Article

Detecting and classifying online health misinformation with ‘Content Similarity Measure (CSM)’ algorithm: an automated fact-checking-based approach

by Barve, Yashoda , Saini, Jatinderkumar R. in Algorithms , Compilers , Computer Science

2023

Information dissemination occurs through the 'word of media' in the digital world. Fraudulent and deceitful content, such as misinformation, has detrimental effects on people. An implicit fact-based automated fact-checking technique comprising information retrieval, natural language processing, and machine learning techniques assists in assessing the credibility of content and detecting misinformation. Previous studies focused on linguistic and textual features and similarity measures-based approaches. However, these studies need to gain knowledge of facts, and similarity measures are less accurate when dealing with sparse or zero data. To fill these gaps, we propose a 'Content Similarity Measure (CSM)' algorithm that can perform automated fact-checking of URLs in the healthcare domain. The authors have introduced a novel set of content similarity, domain-specific, and sentiment polarity score features to achieve journalistic fact-checking. An extensive analysis of the proposed algorithm compared with standard similarity measures and machine learning classifiers showed that the 'content similarity score' feature outperformed other features with an accuracy of 88.26%. In the algorithmic approach, CSM showed improved accuracy of 91.06% compared to the Jaccard similarity measure with 74.26% accuracy. Another observation is that the algorithmic approach outperformed the feature-based method. To check the robustness of the algorithms, the authors have tested the model on three state-of-the-art datasets, viz. CoAID, FakeHealth, and ReCOVery. With the algorithmic approach, CSM showed the highest accuracy of 87.30%, 89.30%, 85.26%, and 88.83% on CoAID, ReCOVery, FakeHealth (Story), and FakeHealth (Release) datasets, respectively. With a feature-based approach, the proposed CSM showed the highest accuracy of 85.93%, 87.97%, 83.92%, and 86.80%, respectively.
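For readers unfamiliar with the Jaccard baseline named in this abstract, the following is a minimal sketch of word-set Jaccard similarity. It is not the authors' CSM algorithm or their evaluation code; the function name `jaccard_similarity`, the whitespace tokenisation, and the example sentences are assumptions made here for illustration only.

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two texts.

    Illustrative only: whitespace tokenisation is an assumption,
    not the preprocessing used in the catalogued article.
    """
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    union = tokens_a | tokens_b
    if not union:
        # Convention chosen here: two empty texts score 0.0.
        return 0.0
    return len(tokens_a & tokens_b) / len(union)


# Hypothetical claim and reference statement.
claim = "vitamin c cures the common cold"
reference = "vitamin c does not cure the common cold"

# 5 shared words out of 9 distinct words overall.
score = jaccard_similarity(claim, reference)
print(f"{score:.3f}")

# Texts with no shared words score exactly 0.0, however related in meaning.
print(jaccard_similarity("flu jab", "influenza vaccine"))
```

In this sketch the contradictory pair still scores above 0.5 because most words overlap, while the two synonymous phrases score 0.0. That is consistent with the abstract's remark that similarity measures lose accuracy on sparse or zero data.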

Journal Article

Misinformation detection: datasets, models and performance

by Chung, Hsin-Hsuan , Chen, Jiangping in Accuracy , Algorithms , Codes

2025

Purpose: This paper aims to understand the characteristics of current misinformation detection studies, including the datasets used by researchers, the computational models or algorithms being developed or applied, and the performance of misinformation detection models or algorithms.
Design/methodology/approach: We first identified articles from the Scopus database with inclusion and exclusion criteria. Then a coding scheme was derived from the articles based on research questions. Next, datasets, models, and performance were coded. The paper concluded with answers to research questions and future research directions.
Findings: From 115 relevant articles published during 2019–2023 on misinformation detection, we found that most studies used previously existing datasets. Twitter (now X) has been the most widely used source for collecting social media misinformation data. The ten most frequently used datasets are identified. Most studies (96.1%) developed or applied machine learning, especially deep learning models. The most advanced current misinformation detection models could achieve pretty high performance. For example, among 104 studies reporting performance with accuracy, 44.2% achieved an accuracy of 0.95 or higher, and 24.0% achieved 0.90–0.94 on accuracy.
Research limitations/implications: Our study only reviewed English articles from 2019–2023 that are included in the Scopus database. Articles that are not included in the Scopus database are not reviewed.
Practical implications: The high performance of misinformation detection indicates that social media should be able to detect most misinformation if they are willing to do it. However, no system or algorithm could achieve 100% performance on misinformation detection. Due to the complexity of misinformation, users of social media still need to improve their capabilities of evaluating information on the Internet.
Social implications: This study provides evidence to policymakers that social media platforms have the capability of detecting most misinformation posted. These platforms are responsible for alerting to suspicious postings with misinformation.
Originality/value: This study identifies datasets, computer models, and performance of models from current misinformation detection research. The findings will help social media companies, computer scientists, and information system designers improve their misinformation detection systems. It will also help students in information science and computer science to study the latest models and algorithms. Information professionals may work with computer scientists to improve datasets used for misinformation detection.

Journal Article
