The Impact of Classifier Configuration and Classifier Combination on Bug Localization

Cited by: 74
Authors
Thomas, Stephen W. [1]
Nagappan, Meiyappan [1]
Blostein, Dorothea [1]
Hassan, Ahmed E. [1]
Affiliations
[1] Queens Univ, Sch Comp, Kingston, ON K7K 2N8, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Software maintenance; bug localization; information retrieval; VSM; LSI; LDA; classifier combination; LOCATION;
DOI
10.1109/TSE.2013.27
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Bug localization is the task of determining which source code entities are relevant to a bug report. Manual bug localization is labor-intensive since developers must consider thousands of source code entities. Current research builds bug localization classifiers, based on information retrieval models, to locate entities that are textually similar to the bug report. Current research, however, does not consider the effect of classifier configuration, i.e., all the parameter values that specify the behavior of a classifier. As such, the effect of each parameter or which parameter values lead to the best performance is unknown. In this paper, we empirically investigate the effectiveness of a large space of classifier configurations, 3,172 in total. Further, we introduce a framework for combining the results of multiple classifier configurations since classifier combination has shown promise in other domains. Through a detailed case study on over 8,000 bug reports from three large-scale projects, we make two main contributions. First, we show that the parameters of a classifier have a significant impact on its performance. Second, we show that combining multiple classifiers, whether those classifiers are hand-picked or randomly chosen relative to intelligently defined subspaces of classifiers, improves the performance of even the best individual classifiers.
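As an illustration of what combining the results of multiple classifier configurations can look like, the following is a minimal sketch that merges several ranked lists of source code entities with a Borda count. The abstract does not state which combination rule the framework uses, so the Borda count here is an assumption, and the function name `borda_combine` and the file names are hypothetical.

```python
from collections import defaultdict


def borda_combine(rankings):
    """Merge ranked entity lists (best first) using a Borda count.

    Assumption: the Borda count is used for illustration only; the
    abstract does not name the combination rule of the framework.
    """
    # Collect every entity that appears in at least one ranking.
    entities = set()
    for ranking in rankings:
        entities.update(ranking)
    n = len(entities)

    # An entity at 0-based position p in a ranking earns n - p points;
    # an entity absent from a ranking earns nothing from it.
    scores = defaultdict(int)
    for ranking in rankings:
        for position, entity in enumerate(ranking):
            scores[entity] += n - position

    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(entities, key=lambda e: (-scores[e], e))


# Hypothetical rankings from three classifier configurations.
vsm = ["Parser.java", "Lexer.java", "Token.java"]
lsi = ["Lexer.java", "Parser.java", "Ast.java"]
lda = ["Lexer.java", "Token.java", "Parser.java"]

combined = borda_combine([vsm, lsi, lda])
print(combined)
```

In this sketch an entity that several configurations rank highly rises to the top of the combined list even when no single configuration ranks it first everywhere.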
Pages: 1427-1443
Page count: 17