Literature Review on Fake News Detection Using Machine Learning and Deep Learning Techniques
Abstract
The unprecedented proliferation of digital media and social networking platforms has amplified the spread of fake news, posing serious threats to social stability, public trust, democratic integrity, and public health. This paper presents a comprehensive literature review of fake news detection methodologies, tracing the evolution from early rule-based and traditional machine learning approaches to modern deep learning architectures and transformer-based pre-trained language models. We systematically examine foundational frameworks for defining and categorizing misinformation, the application of feature engineering techniques including Bag-of-Words and TF-IDF representations, and the progression through classical classifiers such as Naive Bayes, Support Vector Machines, and Random Forests. We then review deep learning advances including Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, Bidirectional LSTMs with attention mechanisms, and state-of-the-art transformer models such as BERT, RoBERTa, and ALBERT. The review further examines word embedding techniques including Word2Vec and GloVe, benchmark datasets including LIAR, ISOT, FakeNewsNet, and the Kaggle Fake News Dataset, and standard evaluation frameworks. Research gaps in interpretability, multilingual detection, adversarial robustness, and real-time deployment are synthesized and discussed. The review concludes with a comparative analysis of model performance across approaches and a structured agenda for future research.
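As a rough illustration of the Bag-of-Words and TF-IDF representations named above, the following is a minimal sketch rather than the pipeline used in any reviewed study. It assumes a naive whitespace tokenizer and the unsmoothed textbook weighting tf × log(N / df). The function names and toy headlines are hypothetical and invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return one term-count Counter per document."""
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = count of the term in the document / document length
    idf = log(N / df), where df is the number of documents containing the term
    """
    counts = bag_of_words(docs)
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for c in counts:
        df.update(c.keys())

    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append(
            {t: (n / total) * math.log(n_docs / df[t]) for t, n in c.items()}
        )
    return weights


# Toy headlines, invented for illustration only.
docs = [
    "shocking miracle cure found",
    "council approves budget",
    "miracle budget cure",
]
vectors = tf_idf(docs)
```

In this sketch a term that appears in only one headline, such as "shocking", receives a higher weight than one shared across headlines, such as "miracle". Library implementations typically add smoothing and vector normalization, so their values will differ from this unsmoothed version.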
Data Availability Statement
The dataset used in this study is publicly available from the Kaggle Fake News Dataset repository. All implementation details necessary to reproduce the results are provided within the manuscript.
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Interdisciplinary Journal of AI, Machine Learning & Data Science (IJAIMLDS) are licensed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0).
This license allows others to share, copy, distribute, and adapt the work, provided that proper credit is given to the original author(s) and the source.
Authors retain copyright and grant Interdisciplinary Journal of AI, Machine Learning & Data Science (IJAIMLDS) the right of first publication.