Statistical significance testing - a panacea for software technology experiments?

Cited by: 24

Authors:

Miller, J [1]

Affiliations:

[1] Univ Alberta, Dept Elect & Comp Engn, STEAM Res Ctr, Edmonton, AB T6H 5M3, Canada

Source:

JOURNAL OF SYSTEMS AND SOFTWARE | 2004 / Vol. 73 / Issue 02

Keywords:

empirical; hypothesis; replication;

DOI:

10.1016/j.jss.2003.12.019

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Empirical software engineering has a long history of utilizing statistical significance testing, and in many ways, it has become the backbone of the topic. What is less obvious is how much consideration has been given to its adoption. Statistical significance testing was initially designed for testing hypotheses in a very different area, and hence the question must be asked: does it transfer into empirical software engineering research? This paper attempts to address this question. The paper finds that this transference is far from straightforward, resulting in several problems in its deployment within the area. Principally problems exist in: formulating hypotheses, the calculation of the probability values and its associated cut-off value, and the construction of the sample and its distribution. Hence, the paper concludes that the topic should explore other avenues of analysis, in an attempt to establish which analysis approaches are preferable under which conditions, when conducting empirical software engineering studies. (C) 2003 Elsevier Inc. All rights reserved.

Pages: 183-192

Number of pages: 10
