Self-admitted technical debt in R: detection and causes

Authors
Rishab Sharma
Ramin Shahbazi
Fatemeh H. Fard
Zadia Codabux
Melina Vidoni
Affiliations
[1] University of British Columbia, Department of Computer Science
[2] University of Saskatchewan, Department of Computer Science
[3] Australian National University, CECS School of Computing
Source
Automated Software Engineering, 2022, 29 (02)
Keywords
Self-admitted technical debt; R packages; Machine learning; Deep learning; Deep neural pre-trained language models
DOI
Not available
Abstract
Self-Admitted Technical Debt (SATD) has primarily been studied in Object-Oriented (OO) languages and in traditionally commercial software. However, scientific software written in dynamically typed languages such as R differs in paradigm, and the semantics of its source code comments are different (i.e., more aligned with algorithms and statistics than in traditional software). Additionally, many Software Engineering topics are understudied in scientific software development, and SATD detection remains a challenge in this domain. This gap adds complexity because prior work determined that SATD in scientific software does not match many of the keywords identified for OO SATD, possibly hindering its automated detection. Therefore, we investigated how classification models (traditional machine learning, deep neural networks, and deep neural Pre-Trained Language Models (PTMs)) automatically detect SATD in R packages. This study evaluates the capabilities of these models to classify different TD types in this domain and manually analyzes the causes of each type in a representative sample. Our results show that PTMs (i.e., RoBERTa) outperform the other models and work well even when few comments are labelled as a particular SATD type. We also found that some SATD types are more challenging to detect. We manually identified sixteen causes, including eight new causes detected by our study. The most common cause was failure to remember, in agreement with previous studies. These findings will help R package authors automatically identify SATD in their source code and improve their code quality. In the future, scientific communities such as rOpenSci could also develop checklists for R developers to guarantee higher-quality packages before submission.
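To make the detection task in the abstract concrete, the following is a minimal sketch of comment-level SATD classification. It is not the authors' pipeline: the abstract reports traditional machine learning, deep neural networks, and PTMs such as RoBERTa, whereas this toy uses a hand-written bag-of-words Naive Bayes classifier. The training comments, the labels `satd` and `not_satd`, the helper names, and the R snippet are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labelled comments; a real study would use a manually annotated corpus.
TRAIN_COMMENTS = [
    ("TODO: vectorise this loop, it is too slow", "satd"),
    ("FIXME this is a hack until the upstream bug is fixed", "satd"),
    ("temporary workaround, remove once tests pass", "satd"),
    ("not sure this handles NA values correctly", "satd"),
    ("compute the posterior mean of the coefficients", "not_satd"),
    ("fit a linear model to the residuals", "not_satd"),
    ("return the p value of the test", "not_satd"),
    ("standardise the design matrix columns", "not_satd"),
]


def extract_r_comments(source):
    """Return '#' comment texts from R source (naive: ignores '#' inside strings)."""
    comments = []
    for line in source.splitlines():
        _, sep, text = line.partition("#")
        if sep:
            # Drop extra '#' markers and the roxygen "'" prefix, then trim whitespace.
            cleaned = text.lstrip("#'").strip()
            if cleaned:
                comments.append(cleaned)
    return comments


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


texts, labels = zip(*TRAIN_COMMENTS)
model = NaiveBayes().fit(texts, labels)

# Hypothetical R snippet used only to exercise the sketch.
r_source = """
# fit a linear model to the data
fit <- lm(y ~ x, data = df)
# TODO: this is a hack, remove once upstream is fixed
"""
for comment in extract_r_comments(r_source):
    print(model.predict(comment), "|", comment)
```

A word-count model like this depends heavily on surface keywords, which is the limitation the abstract raises for R: comments in scientific code may not use the keywords known from OO projects, which is one reason the study compares it against contextual models.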
Related papers
  • [1] Self-admitted technical debt in R: detection and causes
    Sharma, Rishab
    Shahbazi, Ramin
    Fard, Fatemeh H.
    Codabux, Zadia
    Vidoni, Melina
    AUTOMATED SOFTWARE ENGINEERING, 2022, 29 (02)
  • [2] A survey of self-admitted technical debt
    Sierra, Giancarlo
    Shihab, Emad
    Kamei, Yasutaka
    JOURNAL OF SYSTEMS AND SOFTWARE, 2019, 152: 70-82
  • [3] Self-Admitted Technical Debt in R Packages: An Exploratory Study
    Vidoni, Melina
    2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021), 2021: 179-189
  • [4] Data Balancing Improves Self-Admitted Technical Debt Detection
    Sridharan, Murali
    Mantyla, Mika
    Rantala, Leevi
    Claes, Maelick
    2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021), 2021: 358-368
  • [5] On the documentation of self-admitted technical debt in issues
    Xavier, Laerte
    Montandon, Joao Eduardo
    Ferreira, Fabio
    Brito, Rodrigo
    Valente, Marco Tulio
    EMPIRICAL SOFTWARE ENGINEERING, 2022, 27 (07)
  • [6] An Exploratory Study on Self-Admitted Technical Debt
    Potdar, Aniket
    Shihab, Emad
    2014 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME), 2014: 91-100
  • [7] On the documentation of self-admitted technical debt in issues
    Laerte Xavier
    João Eduardo Montandon
    Fabio Ferreira
    Rodrigo Brito
    Marco Tulio Valente
    Empirical Software Engineering, 2022, 27
  • [8] Self-Admitted Technical Debt in Commit Messages: Comparing Java, Python, and R
    Codabux, Zadia
    Vidoni, Melina
    Fard, Fatemeh H.
    SSRN, 2022
  • [9] On the Relationship between Self-Admitted Technical Debt Removals and Technical Debt Measures
    Aversano, Lerina
    Iammarino, Martina
    Carapella, Mimmo
    Del Vecchio, Andrea
    Nardi, Laura
    ALGORITHMS, 2020, 13 (07)
  • [10] An empirical study on self-admitted technical debt in Dockerfiles
    Azuma, Hideaki
    Matsumoto, Shinsuke
    Kamei, Yasutaka
    Kusumoto, Shinji
    EMPIRICAL SOFTWARE ENGINEERING, 2022, 27 (02)