Reducing False Negative Intrusions Rates of Ensemble Machine Learning Model based on Imbalanced Multiclass Datasets
Salim Q. Mohammed, Department of Communication Engineering, Technical College of Engineering, Sulaimani Polytechnic University, salim.muhammed@spu.edu.iq, ORCID: 0000-0003-3986-6701
Mohammed A. ElSheikh Hussein, Department of Electrical Engineering, Faculty of Engineering Sciences, Sulaimani University, mohammed.hussein@spu.edu.iq, ORCID: 0000-0002-0423-7860
Keywords: False Negative (FN), Intrusion Detection Systems (IDS), Supervised Machine Learning, Multiclass Classification, Random Forest Classifier (RFC).
Abstract
Despite sustained efforts to improve the efficiency of intrusion detection systems (IDS) based on machine learning algorithms, these systems still leave room for improvement. Among the possible prediction outcomes, the false negative (FN), in which a classifier labels an attack as normal, is a top priority. FN outcomes are a particular concern in multiclass classification, where minority classes have fewer instances in imbalanced datasets. In this work, three well-known imbalanced multiclass datasets are used with ensemble machine learning classifiers. The datasets, KDD99, UNSW_NB15, and CICIDS2017, are balanced using different combinations of oversampling and undersampling techniques to reduce false negative rates. Suitable performance metrics are used to evaluate the results, and the Random Forest classifier yields significant improvements on all three datasets. The achieved accuracies are 99.9852% for KDD99, 83.5451% for UNSW_NB15, and 99.8613% for CICIDS2017. The results on these datasets are compared with state-of-the-art related work and show a clear improvement in false negative rates.
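To make the central metric concrete, the following is a minimal Python sketch of how a false negative rate could be computed for a multiclass intrusion classifier. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `missed_attack_rate` and `per_class_fnr`, the label string `"normal"`, and the toy label lists are all hypothetical. The first function follows the abstract's definition (attacks predicted as normal); the second reports a one-vs-rest rate per class, which is one common way to expose weak minority classes.

```python
from collections import Counter


def missed_attack_rate(y_true, y_pred, normal_label="normal"):
    """Fraction of true attack samples that the classifier labeled as normal."""
    attacks = [(t, p) for t, p in zip(y_true, y_pred) if t != normal_label]
    if not attacks:
        return 0.0
    missed = sum(1 for _, p in attacks if p == normal_label)
    return missed / len(attacks)


def per_class_fnr(y_true, y_pred):
    """One-vs-rest false negative rate (1 - recall) for each true class."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: 1 - hits[label] / count for label, count in support.items()}


# Hypothetical toy labels, chosen only for illustration (not taken from the paper).
y_true = ["normal", "normal", "dos", "dos", "dos", "dos", "u2r", "u2r"]
y_pred = ["normal", "dos", "dos", "dos", "dos", "normal", "normal", "u2r"]

overall = missed_attack_rate(y_true, y_pred)
by_class = per_class_fnr(y_true, y_pred)
```

In this toy example two of the six attack samples are predicted as normal, so the missed-attack rate is one third even though six of the eight predictions are correct overall. This is why accuracy alone can hide the failure mode that the abstract targets.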