ChatAssert: LLM-Based Test Oracle Generation With External Tools Assistance

Cited by: 0

Authors:

Hayet, Ishrak [1]

Scott, Adam [1]

d'Amorim, Marcelo [1]

Affiliations:

[1] North Carolina State Univ, Raleigh, NC 27695 USA

Source:

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING | 2025 / Vol. 51 / Issue 01

Funding:

U.S. National Science Foundation;

Keywords:

Chatbots; Codes; Measurement; Prompt engineering; Maintenance engineering; Large language models; Accuracy; Static analysis; Standards; Semantics; Test oracle generation; large language models (LLMs); tool-augmented LLMs; prompt engineering framework;

DOI:

10.1109/TSE.2024.3519159

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Test oracle generation is an important and challenging problem. Neural-based solutions have recently been proposed for oracle generation, but they are still inaccurate. For example, the accuracy of the state-of-the-art technique teco is only 27.5% on its dataset of 3,540 test cases. We propose ChatAssert, a prompt engineering framework designed for oracle generation that uses dynamic and static information to iteratively refine prompts for querying large language models (LLMs). ChatAssert uses code summaries and examples to assist an LLM in generating candidate test oracles, uses a lightweight static analysis to assist the LLM in repairing generated oracles that fail to compile, and uses dynamic information obtained from test runs to help the LLM repair oracles that compile but do not pass. Experimental results using an independent, publicly available dataset show that ChatAssert improves on the state-of-the-art technique, teco, on key evaluation metrics. For example, it improves Acc@1 by 15%. Overall, the results provide initial yet strong evidence that using external tools in the formulation of prompts is an important aid in LLM-based oracle generation.
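
The generate, compile-check, run, and repair cycle described in the abstract can be pictured with a small sketch. This is not the authors' implementation: the function names (`refine_oracle`, `ask_llm`, `compile_check`, `run_test`), the `Feedback` type, the prompt wording, and the round limit are all hypothetical, and the LLM, compiler, and test runner are replaced by toy stand-ins so the loop can run on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Feedback:
    """Outcome of one external-tool check (hypothetical type)."""
    ok: bool
    message: str = ""


def refine_oracle(
    ask_llm: Callable[[str], str],
    compile_check: Callable[[str], Feedback],
    run_test: Callable[[str], Feedback],
    base_prompt: str,
    max_rounds: int = 3,  # assumed budget, not taken from the paper
) -> Optional[str]:
    """Ask for an oracle, then feed tool output back until it passes."""
    prompt = base_prompt
    for _ in range(max_rounds):
        candidate = ask_llm(prompt)

        # Static feedback: repair candidates that do not compile.
        compiled = compile_check(candidate)
        if not compiled.ok:
            prompt = (f"{base_prompt}\nPrevious attempt: {candidate}"
                      f"\nCompile error: {compiled.message}")
            continue

        # Dynamic feedback: repair candidates that compile but fail.
        result = run_test(candidate)
        if result.ok:
            return candidate
        prompt = (f"{base_prompt}\nPrevious attempt: {candidate}"
                  f"\nTest failure: {result.message}")
    return None  # no passing oracle within the budget


def demo() -> Tuple[Optional[str], List[str]]:
    """Run the loop with scripted stand-ins for the LLM and the tools."""
    answers = iter([
        "assertEquals(3, calc.add(1, 2)",    # missing ')': does not compile
        "assertEquals(4, calc.add(1, 2));",  # compiles but fails
        "assertEquals(3, calc.add(1, 2));",  # compiles and passes
    ])
    prompts: List[str] = []

    def ask_llm(prompt: str) -> str:
        prompts.append(prompt)
        return next(answers)

    def compile_check(candidate: str) -> Feedback:
        # Toy check: balanced parentheses stand in for a real compiler.
        balanced = candidate.count("(") == candidate.count(")")
        return Feedback(balanced, "" if balanced else "unbalanced parentheses")

    def run_test(candidate: str) -> Feedback:
        # Toy check: only the assertion expecting 3 passes.
        passed = candidate.startswith("assertEquals(3,")
        return Feedback(passed, "" if passed else "expected:<4> but was:<3>")

    oracle = refine_oracle(ask_llm, compile_check, run_test,
                           "Write an assertion for calc.add(1, 2).")
    return oracle, prompts
```

Calling `demo()` walks through one compile repair and one test-failure repair before a candidate passes. In a real setting the three callables would wrap an LLM client, a compiler or static analyzer, and a test runner.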


Pages: 305 - 319

Page count: 15

37 entries in total

[21] Evaluating LLM-based generative AI tools in emergency triage: A comparative study of ChatGPT Plus, Copilot Pro, and triage nurses
Arslan, B.
Nuhoglu, C.
Satici, M. O.
Altinbilek, E.
AMERICAN JOURNAL OF EMERGENCY MEDICINE, 2025, 89 : 174 - 181
[22] GenG: An LLM-based Generic Time Series Data Generation Approach for Edge Intelligence via Cross-domain Collaboration
Zhou, Xiaomao
Jia, Qingmin
Hu, Yujiao
Xie, Renchao
Huang, Tao
Yu, E. Richard
IEEE INFOCOM 2024-IEEE CONFERENCE ON COMPUTER COMMUNICATIONS WORKSHOPS, INFOCOM WKSHPS 2024, 2024,
[23] An Innovative Solution to Design Problems: Applying the Chain-of-Thought Technique to Integrate LLM-Based Agents With Concept Generation Methods
Ge, Shijun
Sun, Yuanbo
Cui, Yin
Wei, Dapeng
IEEE ACCESS, 2025, 13 : 10499 - 10512
[24] EFSM-Based Test Case Generation: Sequence, Data, and Oracle
Yang, Rui
Chen, Zhenyu
Zhang, Zhiyi
Xu, Baowen
INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2015, 25 (04) : 633 - 667
[25] Generation Method for Test Oracle Based on Sensitive Variables and Linear Perceptron
Ma C.-Y.
Li S.-R.
Wang H.-C.
Zhang L.
Zhang T.
Ruan Jian Xue Bao/Journal of Software, 2019, 30 (05) : 1450 - 1463
[26] Comparative Analysis of M4CXR, an LLM-Based Chest X-Ray Report Generation Model, and ChatGPT in Radiological Interpretation
Lee, Ro Woon
Lee, Kyu Hong
Yun, Jae Sung
Kim, Myung Sub
Choi, Hyun Seok
JOURNAL OF CLINICAL MEDICINE, 2024, 13 (23)
[27] Automatic Test Case and Test Oracle Generation Based on Functional Scenarios in Formal Specifications for Conformance Testing
Liu, Shaoying
Nakajima, Shin
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2022, 48 (02) : 691 - 712
[28] Test Oracle Generation Based on BPNN by Using the Values of Variables at Different Breakpoints for Programs
Ma, Chunyan
Liu, Shaoying
Fu, Jinglan
Zhang, Tao
INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2021, 31 (10) : 1469 - 1494
[29] Fault Localization and Test Oracle Generation Based on the Mutual Pattern of Discrete Path Variables
Chen, Jing
Ma, Chunyan
Chang, Zheng
2021 21ST INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY COMPANION (QRS-C 2021), 2021 : 326 - 332
[30] Model-Based Test Oracle Generation for Automated Unit Testing of Agent Systems
Padgham, Lin
Zhang, Zhiyong
Thangarajah, John
Miller, Tim
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2013, 39 (09) : 1230 - 1244
