Reward Reports for Reinforcement Learning

Cited by: 8
Authors
Gilbert, Thomas Krendl [1]
Lambert, Nathan [2]
Dean, Sarah [3]
Zick, Tom [4]
Snoswell, Aaron [5]
Mehta, Soham [6]
Affiliations
[1] Cornell Tech, Digital Life Initiative, New York, NY 10044, USA
[2] HuggingFace, Berkeley, CA, USA
[3] Cornell University, Ithaca, NY, USA
[4] Harvard Law School, Boston, MA, USA
[5] Queensland University of Technology, Centre for Automated Decision-Making and Society, Brisbane, QLD, Australia
[6] Columbia University, New York, NY, USA
Keywords
Reward function; reporting; documentation; disaggregated evaluation; ethical considerations; MODEL; GO
DOI
10.1145/3600211.3604698
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Building systems that are good for society in the face of complex societal effects requires a dynamic approach. Recent approaches to machine learning (ML) documentation have demonstrated the promise of discursive frameworks for deliberation about these complexities. However, these developments have been grounded in a static ML paradigm, leaving the role of feedback and post-deployment performance unexamined. Meanwhile, recent work in reinforcement learning has shown that the effects of feedback and optimization objectives on system behavior can be wide-ranging and unpredictable. In this paper, we sketch a framework for documenting deployed and iteratively updated learning systems, which we call Reward Reports. Taking inspiration from technical concepts in reinforcement learning, we outline Reward Reports as living documents that track updates to the design choices and assumptions behind what a particular automated system is optimizing for. They are intended to track dynamic phenomena arising from system deployment, rather than merely static properties of models or data. After presenting the elements of a Reward Report, we discuss a concrete example: Meta's BlenderBot 3 chatbot. Several further examples, covering game playing (DeepMind's MuZero), content recommendation (MovieLens), and traffic control (Project Flow), are included in the appendix.
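A minimal sketch, in Python, of how a Reward Report might be represented as a living data structure with a changelog. The class and field names here (RewardReport, optimization_objective, feedback_channels, record_update, and the example values) are illustrative assumptions, not the paper's actual Reward Report template.

from dataclasses import dataclass, field
from datetime import date

@dataclass
class ChangelogEntry:
    # One dated update to the deployed system's optimization setup.
    when: date
    change: str            # what was modified, e.g. a reward-term reweighting
    rationale: str         # why the designers made the change
    observed_effects: str  # post-deployment behavior attributed to the change

@dataclass
class RewardReport:
    # Living document: what a deployed learning system is optimizing for.
    system_name: str
    optimization_objective: str    # the reward / objective being maximized
    design_assumptions: list[str]  # e.g. assumed user model, environment dynamics
    feedback_channels: list[str]   # data streams that update the system post-deployment
    evaluation_plan: str           # how post-deployment performance is monitored
    changelog: list[ChangelogEntry] = field(default_factory=list)

    def record_update(self, entry: ChangelogEntry) -> None:
        # Append a dated entry so the report tracks the deployment over time.
        self.changelog.append(entry)

# Hypothetical usage, loosely modeled on the BlenderBot 3 example in the paper:
report = RewardReport(
    system_name="BlenderBot 3",
    optimization_objective="dialogue-quality signals derived from user feedback",
    design_assumptions=["users give good-faith feedback on responses"],
    feedback_channels=["in-app ratings of individual responses"],
    evaluation_plan="disaggregated evaluation across user groups over time",
)
report.record_update(ChangelogEntry(
    when=date(2022, 8, 5),  # illustrative date
    change="public deployment opened",
    rationale="collect in-the-wild conversational feedback",
    observed_effects="to be filled in as monitoring data arrives",
))

The changelog is the point of the sketch: rather than a one-off datasheet, the report accumulates dated entries as the deployed system and its objective evolve.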
Pages: 84-130
Page count: 47