Detection of phishing websites using an efficient feature-based machine learning framework

Cited by: 117
Authors
Rao, Routhu Srinivasa [1]
Pais, Alwyn Roshan [1]
Affiliations
[1] Natl Inst Technol Karnataka, Informat Secur Res Lab, Surathkal, India
Source
NEURAL COMPUTING & APPLICATIONS | 2019 / Vol. 31 / Issue 08
Keywords
Cyber-attack; Phishing; Anti-phishing; Heuristic technique; Machine learning algorithms; Random Forest; Oblique Random Forest; CLASSIFICATION; ENSEMBLE; MODEL;
DOI
10.1007/s00521-017-3305-0
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Phishing is a cyber-attack that targets naive online users, tricking them into revealing sensitive information such as usernames, passwords, social security numbers, or credit card numbers. Attackers fool Internet users by disguising a webpage as a trustworthy or legitimate page in order to retrieve personal information. Many anti-phishing solutions have been proposed to date, including blacklist- or whitelist-based, heuristic, and visual similarity-based methods, but online users are still being tricked into revealing sensitive information on phishing websites. In this paper, we propose a novel classification model based on heuristic features extracted from the URL, source code, and third-party services to overcome the disadvantages of existing anti-phishing techniques. Our model was evaluated using eight different machine learning algorithms, of which the Random Forest (RF) algorithm performed best, with an accuracy of 99.31%. The experiments were repeated with different (orthogonal and oblique) random forest classifiers to find the best classifier for phishing website detection. Principal component analysis Random Forest (PCA-RF) performed best among all oblique Random Forests (oRFs), with an accuracy of 99.55%. We also tested our model with and without third-party-based features to determine the effectiveness of third-party services in the classification of suspicious websites. Finally, we compared our results with the baseline models (CANTINA and CANTINA+). Our proposed technique outperformed these methods and also detected zero-day phishing attacks.
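The abstract mentions heuristic features extracted from the URL. As a minimal sketch of what such features could look like, the snippet below computes a handful of common URL heuristics with the Python standard library. The function name `url_features` and the specific features chosen are assumptions made for illustration; they are not the feature set used in the paper, which this record does not list.

```python
import ipaddress
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Compute a few illustrative URL heuristics.

    Hypothetical feature set for illustration only; the paper's actual
    features are not given in this record.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a common phishing signal.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for example in ("https://www.example.com/", "http://192.168.0.1/login@secure"):
        print(example, url_features(example))
```

A feature dictionary of this kind could then be turned into a numeric vector and passed to a classifier such as a Random Forest, alongside features derived from the page source and from third-party services.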
Pages: 3851 - 3873
Number of pages: 23