Feature Selection For Machine Learning-Based Early Detection of Distributed Cyber Attacks

Cited by: 19
Authors
Feng, Yaokai [1]
Akiyama, Hitoshi [2]
Lu, Liang [2, 4]
Sakurai, Kouichi [3]
Affiliations
[1] Kyushu Univ, Fac Adv Informat Technol, Fukuoka, Fukuoka, Japan
[2] Kyushu Univ, Dept Informat, Fukuoka, Fukuoka, Japan
[3] Kyushu Univ, Fac Informat, Fukuoka, Fukuoka, Japan
[4] Fujitsu Co Ltd, Fukuoka, Fukuoka, Japan
Funding
Japan Science and Technology Agency (JST);
Keywords
distributed cyber attacks; DDoS attacks; machine learning; feature selection; early detection; CLASSIFICATION;
DOI
10.1109/DASC/PiCom/DataCom/CyberSciTec.2018.00040
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
It is well known that distributed cyber attacks launched simultaneously from many hosts have caused the most serious problems in recent years, including privacy leakage and denial of service. Detecting these attacks at an early stage has therefore become an important and urgent topic in the cyber security community. To this end, recognizing C&C (Command & Control) communication between compromised bots and the C&C server is crucially important, because C&C communication takes place in the preparation phase of distributed attacks. Although signature-based attack detection has long been in practical use, it is well known that it cannot deal efficiently with new kinds of attacks. In recent years, ML (machine learning)-based detection methods have been studied widely, and in those methods feature selection is very important to detection performance. We previously used up to 55 features to pick out C&C traffic in order to achieve early detection of DDoS attacks. In this work, we try to answer the question "Are all of those features really necessary?" We mainly investigate how detection performance changes as features are removed, starting from those with the lowest importance, and we try to clarify which features deserve attention for early detection of distributed attacks. We use honeypot data collected from 2008 to 2013. SVM (Support Vector Machine) and PCA (Principal Component Analysis) are used for feature selection, and SVM and RF (Random Forest) are used to build the classifier. We find that detection performance generally improves as more features are used. However, once the number of features reaches around 40, detection performance does not change much even if more features are used. We also verify that, in some specific cases, more features do not always mean better detection performance. We also discuss the 10 important features that have the biggest influence on classification.
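The elimination procedure described in the abstract (rank features by importance, remove them starting from the least important, and track detection performance) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses synthetic data rather than honeypot traffic, a simple class-mean-difference score as a stand-in for SVM- or PCA-based importance, and a nearest-centroid classifier as a stand-in for SVM and RF. All function names (`make_toy_data`, `importance_scores`, `centroid_accuracy`, `elimination_curve`) are hypothetical and chosen only for this illustration.

```python
import random
import statistics


def make_toy_data(n_per_class=200, n_informative=3, n_noise=7, seed=0):
    """Synthetic two-class data (assumption: not the paper's honeypot data).

    The first n_informative features are shifted by class; the rest are noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
            row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(row)
            y.append(label)
    return X, y


def importance_scores(X, y):
    """Stand-in importance: |difference of class means| / overall std."""
    scores = []
    for j in range(len(X[0])):
        col0 = [row[j] for row, lab in zip(X, y) if lab == 0]
        col1 = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = statistics.pstdev(col0 + col1) or 1.0
        diff = abs(statistics.fmean(col1) - statistics.fmean(col0))
        scores.append(diff / spread)
    return scores


def centroid_accuracy(X_train, y_train, X_test, y_test, keep):
    """Accuracy of a nearest-centroid classifier using only features in keep."""
    centroids = {}
    for label in (0, 1):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [statistics.fmean([r[j] for r in rows]) for j in keep]
    correct = 0
    for row, lab in zip(X_test, y_test):
        dist = {
            c: sum((row[j] - m) ** 2 for j, m in zip(keep, centroids[c]))
            for c in centroids
        }
        correct += min(dist, key=dist.get) == lab
    return correct / len(y_test)


def elimination_curve(X, y, seed=0):
    """Drop features from least to most important.

    Returns a list of (number_of_features_kept, test_accuracy) pairs.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    split = len(idx) // 2
    train, test = idx[:split], idx[split:]
    Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
    Xte, yte = [X[i] for i in test], [y[i] for i in test]
    # Rank on the training half only, most important first.
    scores = importance_scores(Xtr, ytr)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    curve = []
    for k in range(len(ranked), 0, -1):
        keep = ranked[:k]
        curve.append((k, centroid_accuracy(Xtr, ytr, Xte, yte, keep)))
    return curve


if __name__ == "__main__":
    X, y = make_toy_data()
    for k, acc in elimination_curve(X, y):
        print(f"{k:2d} features -> accuracy {acc:.3f}")
```

On the toy data, accuracy would be expected to stay roughly flat while noise features are removed and to fall only once informative features start to go, which mirrors the plateau the abstract reports after about 40 of the 55 features.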
Pages: 173-180
Number of pages: 8