Automatic detection of Long Method and God Class code smells through neural source code embeddings

Cited by: 29
Authors
Kovacevic, Aleksandar [1]
Slivka, Jelena [1]
Vidakovic, Dragan [1]
Grujic, Katarina-Glorija [1]
Luburic, Nikola [1]
Prokic, Simona [1]
Sladic, Goran [1]
Affiliations
[1] Univ Novi Sad, Fac Tech Sci, Trg Dositeja Obradovica 6, Novi Sad 21000, Serbia
Keywords
Code smell detection; Neural source code embeddings; Code metrics; Machine learning; Software engineering; IMPACT;
DOI
10.1016/j.eswa.2022.117607
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Code smells are structures in code that often harm its quality. Manually detecting code smells is challenging, so researchers have proposed many automatic detectors. Traditional code smell detectors employ metric-based heuristics, but researchers have recently adopted Machine Learning (ML) based approaches. This paper compares the performance of multiple ML-based code smell detection models against multiple metric-based heuristics for detecting the God Class and Long Method code smells. We assess the effectiveness of different source code representations for ML by comparing traditionally used code metrics against code embeddings (code2vec, code2seq, and CuBERT). To the best of our knowledge, this study is the first to evaluate the effectiveness of pre-trained neural source code embeddings for code smell detection. This approach lets us leverage transfer learning: our study is the first to explore whether the knowledge mined from code understanding models can be transferred to code smell detection. A secondary contribution of our research is the systematic evaluation of code smell detection approaches on the same large-scale, manually labeled MLCQ dataset. Almost every study that proposes a detection approach tests it on a dataset unique to that study. Consequently, the reported performances cannot be compared directly to identify the best-performing approach.
Pages: 18