Linking Confidence Biases to Reinforcement-Learning Processes

Cited by: 7

Authors:

Salem-Garcia, Nahuel [1,2]

Palminteri, Stefano [3,4]

Lebreton, Mael [1,2,5,6]

Affiliations:

[1] Univ Geneva, Swiss Ctr Affect Sci, Geneva, Switzerland

[2] Univ Geneva, Fac Psychol & Educ Sci, Geneva, Switzerland

[3] PSL Res Univ, Ecole Normale Super, Dept Etud Cognit, Paris, France

[4] INSERM, Lab Neurosci Cognit & Computat, Paris, France

[5] Paris Sch Econ, Paris, France

[6] Paris Sch Econ, 48 Blvd Jourdan, F-75014 Paris, France

Source:

PSYCHOLOGICAL REVIEW | 2023 / Vol. 130 / Issue 04

Funding:

European Research Council; Swiss National Science Foundation

Keywords:

reinforcement-learning; confidence; computational modeling; learning biases; confidence biases; DECISION-MAKING; NEURAL BASIS; GOOD FIT; OVERCONFIDENCE; MODEL; PROBABILITY; PERFORMANCE; COMPUTATION; CHOICE; MEMORY;

DOI:

10.1037/rev0000424

Chinese Library Classification:

B84 [Psychology]

Discipline classification codes:

04 ; 0402 ;

Abstract:

We systematically misjudge our own performance in simple economic tasks. First, we generally overestimate our ability to make correct choices, a bias called overconfidence. Second, we are more confident in our choices when we seek gains than when we try to avoid losses, a bias we refer to as the valence-induced confidence bias. Strikingly, these two biases are also present in reinforcement-learning (RL) contexts, despite the fact that outcomes are provided trial-by-trial and could, in principle, be used to recalibrate confidence judgments online. How confidence biases emerge and are maintained in reinforcement-learning contexts is thus puzzling and still unaccounted for. To explain this paradox, we propose that confidence biases stem from learning biases, and test this hypothesis using data from multiple experiments, where we concomitantly assessed instrumental choices and confidence judgments, during learning and transfer phases. Our results first show that participants' choices in both tasks are best accounted for by a reinforcement-learning model featuring context-dependent learning and confirmatory updating. We then demonstrate that the complex, biased pattern of confidence judgments elicited during both tasks can be explained by an overweighting of the learned value of the chosen option in the computation of confidence judgments. We finally show that, consequently, the individual learning model parameters responsible for the learning biases (confirmatory updating and outcome context-dependency) are predictive of the individual metacognitive biases. We conclude by suggesting that the metacognitive biases originate from fundamentally biased learning computations.
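
The two mechanisms named in the abstract, confirmatory updating and an overweighting of the chosen option's value in confidence, can be illustrated with a minimal Python sketch. It is not the authors' model: the function names, parameter values, the complete-feedback setting, and the logistic form of the confidence readout are all assumptions made for illustration, and context-dependent learning is omitted.

```python
import math


def confirmatory_update(q_chosen, q_unchosen, r_chosen, r_unchosen,
                        alpha_conf=0.3, alpha_disc=0.1):
    """One trial of value updating with a confirmation asymmetry.

    Hypothetical parameters: alpha_conf is the learning rate for outcomes
    that confirm the choice, alpha_disc for outcomes that disconfirm it.
    Assumes feedback is shown for both options (an assumption of this sketch).
    """
    pe_chosen = r_chosen - q_chosen        # prediction error, chosen option
    pe_unchosen = r_unchosen - q_unchosen  # prediction error, unchosen option

    # Chosen option: a positive prediction error confirms the choice.
    a_c = alpha_conf if pe_chosen > 0 else alpha_disc
    # Unchosen option: a negative prediction error confirms the choice.
    a_u = alpha_conf if pe_unchosen < 0 else alpha_disc

    return q_chosen + a_c * pe_chosen, q_unchosen + a_u * pe_unchosen


def confidence(q_chosen, q_unchosen, bias=0.0, w_diff=1.0, w_chosen=1.0):
    """Illustrative confidence readout (assumed logistic form).

    w_diff weights the value difference between options; w_chosen adds extra
    weight on the chosen option's value, so higher learned values (e.g. in
    gain contexts) raise confidence even when the difference is unchanged.
    """
    x = bias + w_diff * (q_chosen - q_unchosen) + w_chosen * q_chosen
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    q_c, q_u = confirmatory_update(0.0, 0.0, r_chosen=1.0, r_unchosen=1.0)
    print(q_c, q_u)
    print(confidence(q_c, q_u), confidence(q_c, q_u, w_chosen=0.0))
```

In this sketch, identical outcomes for both options move the chosen value further than the unchosen one, and setting `w_chosen` above zero raises confidence without any change in the value difference, which is one way a learning asymmetry could carry over into a confidence bias.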


Pages: 1017-1043

Page count: 27
