Automatic detection of Long Method and God Class code smells through neural source code embeddings

Cited by: 29
Authors
Kovacevic, Aleksandar [1]
Slivka, Jelena [1]
Vidakovic, Dragan [1]
Grujic, Katarina-Glorija [1]
Luburic, Nikola [1]
Prokic, Simona [1]
Sladic, Goran [1]
Affiliations
[1] Univ Novi Sad, Fac Tech Sci, Trg Dositeja Obradovica 6, Novi Sad 21000, Serbia
Keywords
Code smell detection; Neural source code embeddings; Code metrics; Machine learning; Software engineering; IMPACT;
DOI
10.1016/j.eswa.2022.117607
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Code smells are structures in code that often harm its quality. Manually detecting code smells is challenging, so researchers have proposed many automatic detectors. Traditional code smell detectors employ metric-based heuristics, but researchers have recently adopted a Machine-Learning (ML) based approach. This paper compares the performance of multiple ML-based code smell detection models against multiple metric-based heuristics for detecting the God Class and Long Method code smells. We assess the effectiveness of different source code representations for ML: we evaluate traditionally used code metrics against code embeddings (code2vec, code2seq, and CuBERT). To the best of our knowledge, this study is the first to evaluate the effectiveness of pre-trained neural source code embeddings for code smell detection. This approach helped us leverage the power of transfer learning: our study is the first to explore whether the knowledge mined from code understanding models can be transferred to code smell detection. A secondary contribution of our research is the systematic evaluation of the effectiveness of code smell detection approaches on the same large-scale, manually labeled MLCQ dataset. Almost every study that proposes a detection approach tests it on a dataset unique to that study. Consequently, we cannot directly compare the reported performances to derive the best-performing approach.
Pages: 18