Machine Learning and Deep Learning-Based Intrusion Detection Systems: A Comprehensive Review of Datasets, Algorithms, Challenges, Explainability, and Future Research Directions
Abstract
Cloud computing, the Internet of Things, cyber-physical systems, and communications networks are vital, fast-growing technologies, and their expansion has increased both the volume and the sophistication of cyberattacks. Traditional security mechanisms are no longer adequate for detecting modern threats in these rapidly evolving environments. Machine Learning (ML) and Deep Learning (DL) have emerged as promising approaches for building Intrusion Detection Systems (IDSs) that can detect complex and previously unseen attacks. This review follows a systematic methodology guided by PRISMA and presents a comprehensive analysis of ML- and DL-based IDS research published from 2020 to 2026. Over 500 studies were reviewed, of which 120 high-quality publications were selected for in-depth analysis. The review covers popular benchmark datasets, including NSL-KDD, UNSW-NB15, CICIDS2017, CICDDoS2019, Bot-IoT, IoTID20, and TON_IoT, and analyzes the performance of popular ML algorithms, including SVM, KNN, Random Forest, and XGBoost, and of DL architectures, including CNN, RNN, LSTM, GRU, Autoencoders, Transformers, and Graph Neural Networks. Emerging research areas covered include Explainable AI (XAI), Federated Learning, Edge Intelligence, Adversarial Learning, and Self-Supervised Learning. The results show that DL models generally achieve higher detection accuracy, while ML models offer better interpretability and computational efficiency. The main challenges identified are dataset imbalance, concept drift, adversarial attacks, privacy concerns, and real-time deployment constraints. The review concludes by outlining research opportunities for designing scalable, explainable, and trustworthy IDS frameworks for future cybersecurity environments.
This work is licensed under a Creative Commons Attribution 4.0 International License.
All articles published in Interdisciplinary Journal of AI, Machine Learning & Data Science (IJAIMLDS) are licensed under the terms of the Creative Commons Attribution 4.0 International License (CC BY 4.0).
This license allows others to share, copy, distribute, and adapt the work, provided that proper credit is given to the original author(s) and the source.
Authors retain copyright and grant Interdisciplinary Journal of AI, Machine Learning & Data Science (IJAIMLDS) the right of first publication.