Trusta: Reasoning about assurance cases with formal methods and large language models

Cited by: 0

Authors:

Chen, Zezhong [1]

Deng, Yuxin [1]

Du, Wenjie [2]

Affiliations:

[1] East China Normal Univ, Shanghai Key Lab Trustworthy Comp, Shanghai 200062, Peoples R China

[2] Shanghai Normal Univ, Shanghai 200233, Peoples R China

Source:

SCIENCE OF COMPUTER PROGRAMMING | 2025 / Vol. 244

Funding:

National Natural Science Foundation of China;

Keywords:

Assurance cases; Trustworthiness derivation trees; Large language models; Formal methods; Constraint solving;

DOI:

10.1016/j.scico.2025.103288

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Subject classification code:

081202 ; 0835 ;

Abstract:

Assurance cases can be used to argue for the safety of products in safety engineering. In safety-critical areas, the construction of assurance cases is indispensable. We introduce the Trustworthiness Derivation Tree Analyzer (Trusta), a tool designed to enhance the development and evaluation of assurance cases by integrating formal methods and large language models (LLMs). The tool incorporates a Prolog interpreter and solvers like Z3 and MONA to handle various constraint types, enhancing the precision and efficiency of assurance case assessment. Beyond traditional formal methods, Trusta harnesses the power of LLMs including ChatGPT-3.5, ChatGPT-4, and PaLM 2, assisting humans in the development of assurance cases and the writing of formal constraints. Our evaluation, through qualitative and quantitative analyses, shows Trusta's impact on improving assurance case quality and efficiency. Trusta enables junior engineers to reach the skill level of experienced safety experts, narrowing the expertise gap and greatly benefiting those with limited experience. Case studies, including automated guided vehicles (AGVs), demonstrate Trusta's effectiveness in identifying subtle issues and improving the overall trustworthiness of complex systems.


Pages: 32
