A Mathematical Comparison of Machine Learning and Deep Learning Models for Automated Fake News Detection
Abstract
Detecting fake news is a critical challenge in natural language processing (NLP), demanding solutions that balance accuracy, interpretability, and computational efficiency. In this study, we systematically evaluate the mathematical foundations and empirical performance of five representative models for automated fake news classification: three classical machine learning algorithms (Logistic Regression, Random Forest, and Light Gradient Boosting Machine) and two deep learning architectures, namely A Lite Bidirectional Encoder Representations from Transformers (ALBERT) and Gated Recurrent Units (GRU). Leveraging the large-scale WELFake dataset, we conduct rigorous experiments under both headline-only and headline-plus-content input scenarios, providing a comprehensive assessment of each model’s capability to capture linguistic, contextual, and semantic cues. We analyze each model’s optimization framework, decision boundaries, and feature-importance mechanisms, highlighting the mathematical trade-offs among representational capacity, generalization, and interpretability. Our results show that transformer-based models, particularly ALBERT, achieve state-of-the-art performance, especially when rich textual context is available, while classical ensemble models remain competitive for resource-constrained and interpretability-focused applications. This work advances the mathematical discourse on NLP by bridging theoretical model properties and practical deployment strategies for misinformation detection in high-dimensional, real-world text data.
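To make the classical-baseline side of the comparison concrete, the sketch below trains a bag-of-words logistic-regression classifier on a handful of invented headlines. This is not the paper’s code or the WELFake data; the corpus, labels, and hyperparameters are illustrative assumptions, and the same pipeline would apply to headline-plus-content inputs by concatenating the two fields before tokenization.

```python
import math

# Minimal sketch (assumed toy data, not the paper's implementation):
# bag-of-words features + logistic regression trained by stochastic
# gradient descent on the binary log-loss.

def tokenize(text):
    return text.lower().split()

def build_vocab(docs):
    vocab = {}
    for doc in docs:
        for tok in tokenize(doc):
            vocab.setdefault(tok, len(vocab))
    return vocab

def vectorize(doc, vocab):
    vec = [0.0] * len(vocab)
    for tok in tokenize(doc):
        if tok in vocab:
            vec[vocab[tok]] += 1.0  # raw term counts
    return vec

def train_logreg(X, y, lr=0.5, epochs=200):
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log-loss w.r.t. z
            b -= lr * g
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
    return w, b

def predict(x, w, b):
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1 if z >= 0 else 0

# Hypothetical mini-corpus: label 1 = fake, 0 = real.
headlines = ["shocking miracle cure found", "senate passes budget bill",
             "aliens secretly control banks", "court upholds local ruling"]
labels = [1, 0, 1, 0]

vocab = build_vocab(headlines)
X = [vectorize(h, vocab) for h in headlines]
w, b = train_logreg(X, labels)
preds = [predict(x, w, b) for x in X]
print(preds)  # predictions on the (separable) training set
```

Because each per-token weight in `w` is directly inspectable, this kind of linear baseline illustrates the interpretability advantage the abstract attributes to classical models, in contrast to the distributed representations learned by ALBERT or a GRU.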