On the relationship between bug reports and queries for text retrieval-based bug localization

Authors
Chris Mills
Esteban Parra
Jevgenija Pantiuchina
Gabriele Bavota
Sonia Haiduc
Affiliations
[1] Florida State University
[2] Università della Svizzera italiana
Keywords
Bug localization; Query formulation; Text retrieval
Abstract
As societal dependence on software continues to grow, bugs are becoming increasingly costly in terms of financial resources as well as human safety. Bug localization is the process by which a developer identifies buggy code that needs to be fixed to make a system safer and more reliable. Unfortunately, manually attempting to locate bugs solely from the information in a bug report requires advanced knowledge of how a system is constructed and the way its constituent pieces interact. Therefore, previous work has investigated numerous techniques for reducing the human effort spent in bug localization. One of the most common approaches is Text Retrieval (TR) in which a system’s source code is indexed into a search space that is then queried for code relevant to a given bug report. In the last decade, dozens of papers have proposed improvements to bug localization using TR with largely positive results. However, several other studies have called the technique into question. According to these studies, evaluations of TR-based approaches often lack sufficient controls on biases that artificially inflate the results, namely: misclassified bugs, tangled commits, and localization hints. Here we argue that contemporary evaluations of TR approaches also include a negative bias that outweighs the previously identified positive biases: while TR approaches expect a natural language query, most evaluations simply formulate this query as the full text of a bug report. In this study we show that highly performing queries can be extracted from the bug report text, in order to make TR effective even without the aforementioned positive biases. Further, we analyze the provenance of terms in these highly performing queries to drive future work in automatic query extraction from bug reports.
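The TR workflow summarized in the abstract (index the source code, then query the index with text taken from a bug report) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not name a retrieval model, so TF-IDF weighting with cosine similarity is assumed here, and the file names, source snippets, and both queries are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Split camelCase identifiers, lowercase, and keep alphabetic tokens only.
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())

def build_index(corpus):
    # corpus maps a file name to its source text; returns TF-IDF vectors and IDF weights.
    docs = {name: Counter(tokenize(src)) for name, src in corpus.items()}
    n = len(docs)
    df = Counter(term for tf in docs.values() for term in tf)
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {name: {t: c * idf[t] for t, c in tf.items()}
               for name, tf in docs.items()}
    return vectors, idf

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, vectors, idf):
    # Terms that never occur in the indexed code are ignored.
    counts = Counter(tokenize(query))
    q = {t: c * idf[t] for t, c in counts.items() if t in idf}
    scored = [(cosine(q, vec), name) for name, vec in vectors.items()]
    # Highest score first; ties are broken alphabetically by file name.
    return [name for _, name in sorted(scored, key=lambda p: (-p[0], p[1]))]

# Hypothetical three-file system.
corpus = {
    "LoginController.java":
        "class LoginController { void handleLogin(User user) { session.create(user); } }",
    "CartService.java":
        "class CartService { void addItem(Cart cart, Item item) { cart.add(item); } }",
    "SessionStore.java":
        "class SessionStore { void expireSession(Session session) "
        "{ session.invalidate(); timeout.reset(); } }",
}
vectors, idf = build_index(corpus)

# Hypothetical bug report text, and a shorter query built from a few of its terms.
full_report = ("After I login and add an item to the cart, the user is logged out. "
               "I think the session hits a timeout and then it will expire.")
extracted_query = "session expire timeout"

print(rank(full_report, vectors, idf))
print(rank(extracted_query, vectors, idf))
```

Comparing the two printed rankings shows how query formulation alone can change which files surface first on the same index. The toy corpus says nothing about effect sizes on real systems.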
Pages: 3086-3127
Page count: 41