Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs

Cited by: 0
Authors
Wang, Siyuan [1]
Wei, Zhongyu [1]
Choi, Yejin [2, 4]
Ren, Xiang [3, 4]
Affiliations
[1] Fudan Univ, Shanghai, Peoples R China
[2] Univ Washington, Seattle, WA USA
[3] Univ Southern Calif, Los Angeles, CA USA
[4] Allen Inst Artificial Intelligence, Seattle, WA USA
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large language models (LLMs) have achieved impressive human-like performance across various reasoning tasks. However, their mastery of underlying inferential rules still falls short of human capabilities. To investigate this, we propose a logic scaffolding inferential rule generation framework to construct an inferential rule base, ULogic, comprising both primitive and compositional rules across five domains. Our analysis of GPT-series models over a rule subset reveals significant gaps in LLMs' logic understanding compared to human performance, especially on compositional and structurally complex rules, with certain bias patterns. We further distill these rules into a smaller-scale inference engine for flexible rule generation and for enhancing downstream reasoning. Through a multi-judger evaluation, our inference engine proves effective in generating accurate, complex and abstract conclusions and premises, and in improving various commonsense reasoning tasks. Overall, our work sheds light on LLMs' limitations in grasping inferential rules and suggests ways to enhance their logical reasoning abilities.
Pages: 7523-7543
Number of pages: 21