A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis

Cited by: 0
Authors
Stolfo, Alessandro [1]
Belinkov, Yonatan [2]
Sachan, Mrinmaya [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Technion IIT, Haifa, Israel
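Funding
Israel Science Foundation; Swiss National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract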
Mathematical reasoning in large language models (LMs) has garnered significant attention in recent work, but there is a limited understanding of how these models process and store information related to arithmetic tasks within their architecture. In order to improve our understanding of this aspect of language models, we present a mechanistic interpretation of Transformer-based LMs on arithmetic questions using a causal mediation analysis framework. By intervening on the activations of specific model components and measuring the resulting changes in predicted probabilities, we identify the subset of parameters responsible for specific predictions. This provides insights into how information related to arithmetic is processed by LMs. Our experimental results indicate that LMs process the input by transmitting the information relevant to the query from mid-sequence early layers to the final token using the attention mechanism. Then, this information is processed by a set of MLP modules, which generate result-related information that is incorporated into the residual stream. To assess the specificity of the observed activation dynamics, we compare the effects of different model components on arithmetic queries with other tasks, including number retrieval from prompts and factual knowledge questions.
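The intervention described in the abstract (replace one component's activation, then measure how the predicted probabilities change) can be sketched on a toy network. This is a minimal illustration under stated assumptions, not the authors' implementation: `ToyModel`, `indirect_effects`, the random weights, and the effect measure (the drop in probability of the originally predicted answer) are all hypothetical, and the toy network stands in for a Transformer LM.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class ToyModel:
    """Hypothetical stand-in for an LM: a residual stream updated by small MLP blocks."""

    def __init__(self, d=8, n_out=5, n_layers=2, seed=0):
        rng = np.random.default_rng(seed)
        self.W = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(n_layers)]
        self.W_out = rng.normal(size=(d, n_out))

    def forward(self, x, patch=None):
        """Return (probabilities, cache of each block's output).

        patch: optional dict {layer_index: activation} that overrides the
        output of the given block before it is added to the residual stream.
        """
        h = x.copy()
        cache = {}
        for i, W in enumerate(self.W):
            out = np.tanh(h @ W)
            if patch is not None and i in patch:
                out = patch[i]  # the intervention: swap in a stored activation
            cache[i] = out
            h = h + out  # residual update
        return softmax(h @ self.W_out), cache


def indirect_effects(model, x_clean, x_alt):
    """For each block, patch in its activation from the x_alt run while the
    input stays x_clean, and record the drop in probability of the answer
    the model originally predicted for x_clean (an assumed effect measure)."""
    p_clean, _ = model.forward(x_clean)
    _, alt_cache = model.forward(x_alt)
    answer = int(p_clean.argmax())
    effects = {}
    for layer, act in alt_cache.items():
        p_patched, _ = model.forward(x_clean, patch={layer: act})
        effects[layer] = float(p_clean[answer] - p_patched[answer])
    return answer, effects


rng = np.random.default_rng(1)
x_clean, x_alt = rng.normal(size=8), rng.normal(size=8)
model = ToyModel()
answer, effects = indirect_effects(model, x_clean, x_alt)
print(answer, effects)
```

Blocks with a larger effect value are the ones whose activations carry more of the information behind the original prediction in this toy setting. On a real LM the same pattern is usually implemented with framework hooks rather than a `patch` argument.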
Pages: 7035-7052
Page count: 18