The Impact of Classifier Configuration and Classifier Combination on Bug Localization

Cited by: 74

Authors:

Thomas, Stephen W. [1]

Nagappan, Meiyappan [1]

Blostein, Dorothea [1]

Hassan, Ahmed E. [1]

Affiliations:

[1] Queens Univ, Sch Comp, Kingston, ON K7K 2N8, Canada

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2013 / Vol. 39 / No. 10

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

Software maintenance; bug localization; information retrieval; VSM; LSI; LDA; classifier combination; LOCATION;

DOI:

10.1109/TSE.2013.27

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Bug localization is the task of determining which source code entities are relevant to a bug report. Manual bug localization is labor intensive since developers must consider thousands of source code entities. Current research builds bug localization classifiers, based on information retrieval models, to locate entities that are textually similar to the bug report. Current research, however, does not consider the effect of classifier configuration, i.e., all the parameter values that specify the behavior of a classifier. As such, the effect of each parameter or which parameter values lead to the best performance is unknown. In this paper, we empirically investigate the effectiveness of a large space of classifier configurations, 3,172 in total. Further, we introduce a framework for combining the results of multiple classifier configurations since classifier combination has shown promise in other domains. Through a detailed case study on over 8,000 bug reports from three large-scale projects, we make two main contributions. First, we show that the parameters of a classifier have a significant impact on its performance. Second, we show that combining multiple classifiers (whether those classifiers are hand-picked or randomly chosen relative to intelligently defined subspaces of classifiers) improves the performance of even the best individual classifiers.

Citation

Pages: 1427-1443

Page count: 17
