AI as a Sport: On the Competitive Epistemologies of Benchmarking

Cited by: 2

Authors:

Orr, Will [1]

Kang, Edward B. [2]

Affiliations:

[1] Univ Southern Calif, Los Angeles, CA 90007 USA

[2] NYU, New York, NY USA

Source:

PROCEEDINGS OF THE 2024 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, ACM FACCT 2024 | 2024

Keywords:

Machine learning benchmarks; Machine learning competitions; History of benchmarking; Benchmarking for generative AI; Benchmark datasets

DOI:

10.1145/3630106.3659012

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Artificial Intelligence (AI) systems are evaluated using competitive methods that rely on benchmark datasets to determine performance. These benchmark datasets, however, are often constructed through arbitrary processes that fall short in encapsulating the depth and breadth of the tasks they are intended to measure. In this paper, we interrogate the naturalization of benchmark datasets as veracious metrics by examining the historical development of benchmarking as an epistemic practice in AI research. Specifically, we highlight three key case studies that were crucial in establishing the existing reliance on benchmark datasets for evaluating the capabilities of AI systems: (1) the sharing of Highleyman's OCR dataset in the 1960s, which solidified a community of knowledge production around a shared benchmark dataset; (2) the Common Task Framework (CTF) of the 1980s, a state-led project to standardize benchmark datasets as legitimate indicators of technical progress; and (3) the Netflix Prize, which further solidified benchmarking as a competitive goal within the ML research community. This genealogy highlights how contemporary dynamics and limitations of benchmarking developed from a longer history of collaboration, standardization, and competition. We end with reflections on how this history informs our understanding of benchmarking in the current era of generative artificial intelligence.

Pages: 1875-1884

Page count: 10
