Modeling Biological Agents Beyond the Reinforcement-Learning Paradigm

Cited by: 5
Authors
Georgeon, Olivier L. [1 ]
Casado, Remi C. [1 ]
Matignon, Laetitia A. [1 ]
Affiliation
[1] Univ Lyon 1, LIRIS, UMR5205, F-69622 Villeurbanne, France
DOI
10.1016/j.procs.2015.12.179
Chinese Library Classification (CLC)
TP3 [Computing technology; computer technology]
Subject Classification Code
0812
Abstract
It is widely acknowledged that biological beings (animals) are not Markov: modelers generally do not model them as agents that receive a complete representation of their environment's state as input (except perhaps in simple controlled tasks). In this paper, we claim that biological beings generally cannot recognize rewarding Markov states of their environment either. Therefore, we model them as agents trying to perform rewarding interactions with their environment (interaction-driven tasks), rather than as agents trying to reach rewarding states (state-driven tasks). We review two interaction-driven tasks, the AB and AABB tasks, and implement a non-Markov Reinforcement-Learning (RL) algorithm based upon historical sequences and Q-learning. Results show that this RL algorithm takes significantly longer to learn these tasks than a constructivist algorithm previously implemented by Georgeon, Ritter, & Haynes (2009). This is because the constructivist algorithm directly learns and repeats hierarchical sequences of interactions, whereas the RL algorithm spends time learning Q-values. Along with theoretical arguments, these results support the constructivist paradigm for modeling biological agents.
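
The non-Markov RL baseline described in the abstract can be pictured as ordinary Q-learning whose "state" is the agent's recent interaction history rather than an observable environment state. The following minimal Python sketch illustrates that idea on a toy AABB-style interaction-driven task; the task definition, the history length, and the hyper-parameters are illustrative assumptions, not the paper's actual implementation.

    import random
    from collections import defaultdict

    # Minimal sketch of history-based (non-Markov) Q-learning on a toy
    # AABB-like interaction-driven task. Task, history length and
    # hyper-parameters are assumptions made for illustration only.

    ACTIONS = ["a", "b"]
    PATTERN = ["a", "a", "b", "b"]   # environment silently cycles a, a, b, b
    HISTORY_LEN = 3                  # past interactions kept as the "state"
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

    q = defaultdict(float)           # Q-values indexed by (history, action)
    history = []                     # past interactions: (action, succeeded?)
    env_pos = 0                      # hidden environment state, never observed

    for step in range(20000):
        state = tuple(history[-HISTORY_LEN:])
        # epsilon-greedy choice between the two experiments
        if random.random() < EPSILON:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])
        # the interaction succeeds only if it matches the hidden pattern position
        r = 1.0 if action == PATTERN[env_pos] else -1.0
        env_pos = (env_pos + 1) % len(PATTERN)
        next_history = (history + [(action, r > 0)])[-HISTORY_LEN:]
        next_state = tuple(next_history)
        # standard Q-learning update applied to history-based states
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += ALPHA * (r + GAMMA * best_next - q[(state, action)])
        history = next_history

In this sketch the environment's position in the a, a, b, b cycle is never exposed; the agent can only infer it from its last few interactions, which is what makes the problem non-Markov with respect to the raw observation and why Q-values must be learned over histories.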
Pages: 17-22
Number of pages: 6