HINT: Integration Testing for AI-based features with Humans in the Loop

Cited by: 8
Authors
Chen, Quanze [1]
Schnabel, Tobias [2]
Nushi, Besmira [2]
Amershi, Saleema [2]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
[2] Microsoft Res, Redmond, WA USA
Keywords
Human-AI interaction; prototyping; testing; crowdsourcing; automation; trust
DOI
10.1145/3490099.3511141
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The dynamic nature of AI technologies makes testing human-AI interaction and collaboration challenging, especially before such features are deployed in the wild. This presents a challenge for designers and AI practitioners, as early feedback for iteration is often unavailable in the development phase. In this paper, we take inspiration from integration testing concepts in software development and present HINT (Human-AI INtegration Testing), a crowd-based framework for testing AI-based experiences integrated with a humans-in-the-loop workflow. HINT supports early testing of AI-based features within the context of realistic user tasks and makes use of successive sessions to simulate AI experiences that evolve over time. Finally, it provides practitioners with reports to evaluate and compare aspects of these experiences. Through a crowd-based study, we demonstrate the need for over-time testing, where user behaviors evolve as they interact with an AI system. We also show that HINT is able to capture and reveal these distinct user behavior patterns across a variety of common AI performance modalities using two AI-based feature prototypes. We further evaluated HINT's potential to support practitioners' evaluation of human-AI interaction experiences pre-deployment through semi-structured interviews with 13 practitioners.
Pages: 549-565
Page count: 17