Trusta: Reasoning about assurance cases with formal methods and large language models

Cited by: 0
Authors
Chen, Zezhong [1]
Deng, Yuxin [1]
Du, Wenjie [2]
Affiliations
[1] East China Normal Univ, Shanghai Key Lab Trustworthy Comp, Shanghai 200062, Peoples R China
[2] Shanghai Normal Univ, Shanghai 200233, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Assurance cases; Trustworthiness derivation trees; Large language models; Formal methods; Constraint solving
DOI
10.1016/j.scico.2025.103288
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Assurance cases can be used to argue for the safety of products in safety engineering. In safety-critical areas, the construction of assurance cases is indispensable. We introduce the Trustworthiness Derivation Tree Analyzer (Trusta), a tool designed to enhance the development and evaluation of assurance cases by integrating formal methods and large language models (LLMs). The tool incorporates a Prolog interpreter and solvers like Z3 and MONA to handle various constraint types, enhancing the precision and efficiency of assurance case assessment. Beyond traditional formal methods, Trusta harnesses the power of LLMs including ChatGPT-3.5, ChatGPT-4, and PaLM 2, assisting humans in the development of assurance cases and the writing of formal constraints. Our evaluation, through qualitative and quantitative analyses, shows Trusta's impact on improving assurance case quality and efficiency. Trusta enables junior engineers to reach the skill level of experienced safety experts, narrowing the expertise gap and greatly benefiting those with limited experience. Case studies, including automated guided vehicles (AGVs), demonstrate Trusta's effectiveness in identifying subtle issues and improving the overall trustworthiness of complex systems.
Pages: 32
Related papers
50 items in total
  • [41] Exploring the Capacity of Pretrained Language Models for Reasoning about Actions and Change
    He, Weinan
    Huang, Canming
    Xiao, Zhanhao
    Liu, Yongmei
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 4629 - 4643
  • [42] From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models
    Englhardt, Zachary
    Ma, Chengqian
    Morris, Margaret E.
    Chang, Chun-Cheng
    Xu, Xuhai Orson
    Qin, Lianhui
    McDuff, Daniel
    Liu, Xin
    Patel, Shwetak
    Iyer, Vikram
    PROCEEDINGS OF THE ACM ON INTERACTIVE MOBILE WEARABLE AND UBIQUITOUS TECHNOLOGIES-IMWUT, 2024, 8 (02):
  • [43] Towards Benchmarking and Improving the Temporal Reasoning Capability of Large Language Models
    Tan, Qingyu
    Ng, Hwee Tou
    Bing, Lidong
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 14820 - 14835
  • [44] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
    Wei, Jason
    Wang, Xuezhi
    Schuurmans, Dale
    Bosma, Maarten
    Ichter, Brian
    Xia, Fei
    Chi, Ed H.
    Le, Quoc V.
    Zhou, Denny
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [45] An Evaluation of Reasoning Capabilities of Large Language Models in Financial Sentiment Analysis
    Du, Kelvin
    Xing, Frank
    Mao, Rui
    Cambria, Erik
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 189 - 194
  • [46] Large Language Models lack essential metacognition for reliable medical reasoning
    Griot, Maxime
    Hemptinne, Coralie
    Vanderdonckt, Jean
    Yuksel, Demet
    NATURE COMMUNICATIONS, 2025, 16 (01)
  • [47] TIMEBENCH: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models
    Chu, Zheng
    Chen, Jingchang
    Chen, Qianglong
    Yu, Weijiang
    Wang, Haotian
    Liu, Ming
    Qin, Bing
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 1204 - 1228
  • [48] Reasoning in Large Language Models Through Symbolic Math Word Problems
    Gaur, Vedant
    Saunshi, Nikunj
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5889 - 5903
  • [49] ThinkSum: Probabilistic reasoning over sets using large language models
    Ozturkler, Batu
    Malkin, Nikolay
    Wang, Zhen
    Jojic, Nebojsa
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 1216 - 1239
  • [50] VISA: Reasoning Video Object Segmentation via Large Language Models
    Yan, Cilin
    Wang, Haochen
    Yan, Shilin
    Jiang, Xiaolong
    Hu, Yao
    Kang, Guoliang
    Xie, Weidi
    Gavves, Efstratios
    COMPUTER VISION - ECCV 2024, PT XV, 2025, 15073 : 98 - 115