Preprint Article Version 1 This version is not peer-reviewed

Enhancing Fake News Detection with Word Embedding: A Machine Learning and Deep Learning Approach

Version 1 : Received: 29 July 2024 / Approved: 29 July 2024 / Online: 30 July 2024 (00:19:25 CEST)

How to cite: Al-Tarawneh, M. A. B.; Al-irr, O.; Al-Maaitah, K. S.; Kanj, H.; Aly, W. H. F. Enhancing Fake News Detection with Word Embedding: A Machine Learning and Deep Learning Approach. Preprints 2024, 2024072317. https://doi.org/10.20944/preprints202407.2317.v1

Abstract

The proliferation of fake news on social media platforms poses significant challenges to information authenticity and public trust. This study systematically evaluates how the choice of word embedding technique affects the performance of machine learning and deep learning models for fake news detection. It uses the TruthSeeker dataset, which spans over a decade of labeled news articles and social media posts. Three popular word embedding techniques are analyzed: TF-IDF, Word2Vec, and FastText. These embeddings were tested with multiple machine learning classifiers (logistic regression, naive Bayes, K-nearest neighbors, support vector machines, multilayer perceptrons, decision trees, and random forests) and with deep learning models, specifically convolutional neural networks (CNNs). Among the machine learning models, the support vector machine (SVM) with TF-IDF performed best, with an accuracy of 99.03%, precision of 98.92%, recall of 99.15%, and F1 score of 99.03%. The multilayer perceptron (MLP) also performed strongly with Word2Vec, reaching an accuracy of 95.24%, precision of 94.95%, recall of 95.50%, and F1 score of 95.07%. Among the deep learning models, CNNs combined with TF-IDF embeddings achieved an accuracy of 98.99%, precision of 98.56%, recall of 99.48%, and F1 score of 99.02%. These results highlight the critical role of word embeddings in the accuracy and reliability of misinformation detection models.
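As a rough illustration of two ingredients named in the abstract, the sketch below computes TF-IDF weights for a toy corpus and derives accuracy, precision, recall, and F1 from predicted labels. It is not the authors' implementation: the function names `tfidf_vectors` and `binary_metrics`, the whitespace tokenizer, the unsmoothed `tf * log(N / df)` weighting, and the toy data are all assumptions made for illustration. Libraries such as scikit-learn use a smoothed IDF variant by default, so their values would differ.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) using plain tf * log(N / df) weights.

    Hypothetical helper: whitespace tokenization and unsmoothed IDF are
    simplifying assumptions, not the configuration used in the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vocab = sorted(doc_freq)
    idf = {term: math.log(n_docs / doc_freq[term]) for term in vocab}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = len(tokens) or 1  # avoid division by zero on empty docs
        vectors.append([counts[term] / length * idf[term] for term in vocab])
    return vocab, vectors


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy data, invented for this example only.
    vocab, vectors = tfidf_vectors(["fake news spreads fast",
                                    "real news spreads slowly"])
    print(vocab)
    print(vectors[0])
    print(binary_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                         [1, 1, 0, 0, 0, 1, 1, 0]))
```

In this weighting, a term that appears in every document receives a weight of zero, which is why TF-IDF emphasizes terms that distinguish one document from another. The metrics function treats one label as the positive class; which class (fake or real) the paper treats as positive is not stated in the abstract.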

Keywords

Natural language processing; Machine learning; Deep learning; Fake news detection; Word embedding.

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
