Version 1
Received: 29 July 2024 / Approved: 29 July 2024 / Online: 30 July 2024 (00:19:25 CEST)
How to cite:
Al-Tarawneh, M. A. B.; Al-irr, O.; Al-Maaitah, K. S.; Kanj, H.; Aly, W. H. F. Enhancing Fake News Detection with Word Embedding: A Machine Learning and Deep Learning Approach. Preprints 2024, 2024072317. https://doi.org/10.20944/preprints202407.2317.v1
APA Style
Al-Tarawneh, M. A. B., Al-irr, O., Al-Maaitah, K. S., Kanj, H., & Aly, W. H. F. (2024). Enhancing Fake News Detection with Word Embedding: A Machine Learning and Deep Learning Approach. Preprints. https://doi.org/10.20944/preprints202407.2317.v1
Chicago/Turabian Style
Al-Tarawneh, M. A. B., Hassan Kanj and Wael Hosny Fouad Aly. 2024. "Enhancing Fake News Detection with Word Embedding: A Machine Learning and Deep Learning Approach." Preprints. https://doi.org/10.20944/preprints202407.2317.v1
Abstract
The proliferation of fake news on social media platforms poses significant challenges to information authenticity and public trust. This study systematically evaluates the impact of word embedding techniques on the performance of both machine learning and deep learning models for fake news detection. It uses the comprehensive TruthSeeker dataset, which spans over a decade of labeled news articles and social media posts. Three popular word embedding techniques are analyzed: TF-IDF, Word2Vec, and FastText. These embeddings are tested with multiple machine learning classifiers, including logistic regression, naive Bayes, K-nearest neighbors, support vector machines (SVMs), multilayer perceptrons (MLPs), decision trees, and random forests, as well as deep learning models, specifically convolutional neural networks (CNNs). Among the machine learning models, the SVM with TF-IDF achieved the best results, with an accuracy of 99.03%, precision of 98.92%, recall of 99.15%, and F1 score of 99.03%. The MLP also performed strongly with Word2Vec, reaching an accuracy of 95.24%, precision of 94.95%, recall of 95.50%, and F1 score of 95.07%. Among the deep learning models, CNNs combined with TF-IDF embeddings achieved an accuracy of 98.99%, precision of 98.56%, recall of 99.48%, and F1 score of 99.02%. These results highlight the critical role of word embeddings in enhancing model accuracy and reliability in misinformation detection.
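To make the quantities in the abstract concrete, the following is a minimal, standard-library sketch of TF-IDF weighting and of the four reported metrics (accuracy, precision, recall, F1). It is an illustration only, not the authors' code: the function names `tfidf_vectors` and `binary_metrics` are hypothetical, tokenization is plain whitespace splitting, and the smoothed IDF formula is one common variant that may differ from the one used in the study.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (tf * smoothed idf).

    Assumption: lowercase whitespace tokenization and a smoothed idf,
    log((1 + N) / (1 + df)) + 1; the study's exact variant is not stated.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vec = {}
        for term, count in counts.items():
            tf = count / len(tokens)  # relative term frequency
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy inputs, invented for illustration; not drawn from TruthSeeker.
    vectors = tfidf_vectors(["fake news spreads fast", "real news report"])
    print(vectors[0])
    print(binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

In this sketch a term that appears in every document, such as "news" in the toy input, receives a lower weight than a term unique to one document, which is the property that lets a downstream classifier focus on distinguishing vocabulary.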
Keywords
Natural language processing; Machine learning; Deep learning; Fake news detection; Word embedding.
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.