ChatAssert: LLM-Based Test Oracle Generation With External Tools Assistance

Authors
Hayet, Ishrak [1]
Scott, Adam [1]
d'Amorim, Marcelo [1]
Affiliations
[1] North Carolina State Univ, Raleigh, NC 27695 USA
Funding
U.S. National Science Foundation
Keywords
Chatbots; Codes; Measurement; Prompt engineering; Maintenance engineering; Large language models; Accuracy; Static analysis; Standards; Semantics; Test oracle generation; large language models (LLMs); tool-augmented LLMs; prompt engineering framework
DOI
10.1109/TSE.2024.3519159
Chinese Library Classification (CLC) number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
Test oracle generation is an important and challenging problem. Neural solutions have recently been proposed for oracle generation, but they remain inaccurate. For example, the state-of-the-art technique TeCo achieves only 27.5% accuracy on its own dataset, which includes 3,540 test cases. We propose ChatAssert, a prompt engineering framework for oracle generation that uses dynamic and static information to iteratively refine the prompts used to query large language models (LLMs). ChatAssert works in three ways: it uses code summaries and examples to help an LLM generate candidate test oracles, it uses a lightweight static analysis to help the LLM repair generated oracles that fail to compile, and it uses dynamic information obtained from test runs to help the LLM repair oracles that compile but do not pass. Experimental results on an independent, publicly available dataset show that ChatAssert improves on TeCo on key evaluation metrics. For example, it improves Acc@1 by 15%. Overall, the results provide initial yet strong evidence that using external tools to formulate prompts is an important aid in LLM-based oracle generation.
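The abstract describes an iterative loop: generate a candidate oracle, then feed compilation or test-run feedback back into the prompt. The Python sketch below illustrates that general idea only; it is not the authors' implementation. Every name in it (`Feedback`, `refine_oracle`, `fake_llm`, `fake_check`) is hypothetical, the LLM and the build/test tooling are replaced by toy stand-ins, and the prompt wording and the round limit are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Feedback:
    """Outcome of checking one candidate oracle (hypothetical structure)."""
    compiles: bool
    passes: bool
    message: str = ""


def refine_oracle(
    base_prompt: str,
    query_llm: Callable[[str], str],
    check: Callable[[str], Feedback],
    max_rounds: int = 3,
) -> Optional[str]:
    """Ask for an oracle, then re-prompt with tool feedback until it passes."""
    prompt = base_prompt
    for _ in range(max_rounds):
        candidate = query_llm(prompt)
        feedback = check(candidate)
        if feedback.compiles and feedback.passes:
            return candidate
        # Static feedback for compile errors, dynamic feedback for failing runs.
        kind = "test failure" if feedback.compiles else "compilation error"
        prompt = (
            f"{base_prompt}\n"
            f"Previous attempt: {candidate}\n"
            f"{kind}: {feedback.message}"
        )
    return None  # no passing oracle within the round limit


# Toy stand-ins so the sketch runs without a real LLM or a Java toolchain.
def fake_llm(prompt: str) -> str:
    if "test failure" in prompt:
        return "assertEquals(3, sum)"
    if "compilation error" in prompt:
        return "assertEquals(4, sum)"
    return "assertEquals(3, total)"


def fake_check(candidate: str) -> Feedback:
    if "total" in candidate:
        return Feedback(False, False, "cannot find symbol: total")
    if "4" in candidate:
        return Feedback(True, False, "expected 4 but was 3")
    return Feedback(True, True)


if __name__ == "__main__":
    print(refine_oracle("Write an assertion.", fake_llm, fake_check))
```

In this toy run the first candidate fails to compile, the second compiles but fails, and the third passes, so the loop needs all three rounds. Passing the LLM and the checker in as callables keeps the loop independent of any particular model API or build tool.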
Pages: 305-319
Page count: 15