A Testing Framework for AI Linguistic Systems (testFAILS)

Cited by: 0
Authors
Kumar, Y. [1 ]
Morreale, P. [1 ]
Sorial, P. [1 ]
Delgado, J. [1 ]
Li, J. Jenny [1 ]
Martins, P. [1 ]
Affiliations
[1] Kean Univ, Dept Comp Sci & Technol, Union, NJ 07083 USA
Keywords
Chatbots; Validation of Chatbots; Bot Technologies; AI Linguistic Systems Testing Framework (testFAILS); AIDoctor
DOI
10.1109/AITest58265.2023.00017
CLC Classification Code
TP18 [Theory of artificial intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
This paper introduces testFAILS, an innovative testing framework designed for the rigorous evaluation of AI Linguistic Systems, with a particular emphasis on various iterations of ChatGPT. Leveraging orthogonal array coverage, this framework provides a robust mechanism for assessing AI systems, addressing the critical question, "How should we evaluate AI?" While the Turing test has traditionally been the benchmark for AI evaluation, we argue that current publicly available chatbots, despite their rapid advancements, have yet to meet this standard. However, the pace of progress suggests that achieving Turing test-level performance may be imminent. In the interim, the need for effective AI evaluation and testing methodologies remains paramount. Our research, which is ongoing, has already validated several versions of ChatGPT, and we are currently conducting comprehensive testing on the latest models, including ChatGPT-4, Bard, Bing Bot, and the LLaMA model. The testFAILS framework is designed to be adaptable, ready to evaluate new bot versions as they are released. Additionally, we have tested available chatbot APIs and developed our own application, AIDoctor, utilizing the ChatGPT-4 model and Microsoft Azure AI technologies.
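The abstract attributes the framework's coverage to orthogonal arrays, a combinatorial-testing device that exercises every pair of factor levels with far fewer runs than exhaustive enumeration. The Python sketch below illustrates that idea with a standard L9(3^4) array; the factor names and levels are hypothetical stand-ins, since the paper does not publish its actual test dimensions, and this is not a reproduction of the testFAILS design.

    # Sketch of orthogonal-array test selection in the spirit of testFAILS.
    # The four factors and their levels are hypothetical placeholders; an
    # L9(3^4) array trims 81 possible prompt configurations to 9 runs while
    # still covering every pair of factor levels at least once.

    from itertools import combinations

    # Hypothetical chatbot-testing factors, three levels each.
    FACTORS = {
        "language":     ["English", "Spanish", "Mandarin"],
        "domain":       ["medical triage", "coding help", "casual chat"],
        "prompt_style": ["direct question", "multi-turn follow-up", "ambiguous request"],
        "output_check": ["factual accuracy", "safety/refusal", "consistency across runs"],
    }

    # Standard Taguchi L9(3^4) orthogonal array: 9 runs, 4 columns, 3 levels.
    # Every pair of columns contains each of the 9 level pairs exactly once.
    L9 = [
        (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
        (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
        (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
    ]

    def build_test_plan():
        """Map each L9 row onto concrete factor levels."""
        names = list(FACTORS)
        return [
            {names[col]: FACTORS[names[col]][level] for col, level in enumerate(row)}
            for row in L9
        ]

    def verify_pairwise_coverage():
        """Confirm every pair of levels appears for every pair of factors."""
        for c1, c2 in combinations(range(4), 2):
            seen = {(row[c1], row[c2]) for row in L9}
            assert len(seen) == 9, f"columns {c1},{c2} miss some level pairs"

    if __name__ == "__main__":
        verify_pairwise_coverage()
        for i, case in enumerate(build_test_plan(), start=1):
            print(f"Test {i}: {case}")

Running the script prints nine prompt configurations instead of the 81 exhaustive combinations, and the assertion confirms that every pairwise combination of factor levels is still exercised, which is the coverage property orthogonal-array testing relies on.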
Pages: 51 - 54
Page count: 4