Learning to Model Opponent Learning (Student Abstract)

Authors
Davies, Ian [1]
Tian, Zheng [1]
Wang, Jun [1]
Affiliations
[1] UCL, Gower St, London WC1E 6BT, England
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-Agent Reinforcement Learning (MARL) considers settings in which a set of coexisting agents interact with one another and their environment. The adaptation and learning of other agents induce non-stationarity in the environment dynamics. This poses a great challenge for value-function-based algorithms, whose convergence usually relies on the assumption of a stationary environment. Policy search algorithms also struggle in multi-agent settings, because the partial observability that results from not knowing an opponent's actions introduces high variance into policy training. Modelling an agent's opponent(s) is often pursued as a means of resolving the issues arising from the coexistence of learning opponents. An opponent model provides an agent with some ability to reason about other agents to aid its own decision making. Most prior works learn an opponent model by assuming the opponent is employing a stationary policy or switching between a set of stationary policies. Such an approach can reduce the variance of training signals for policy search algorithms. However, in the multi-agent setting, agents have an incentive to continually adapt and learn, so assumptions of opponent stationarity are unrealistic. In this work, we develop a novel approach to modelling an opponent's learning dynamics, which we term Learning to Model Opponent Learning (LeMOL). We show that our structured opponent model is more accurate and stable than naive behaviour cloning baselines. We further show that opponent modelling can improve the performance of algorithmic agents in multi-agent settings.
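The point about opponent stationarity can be illustrated with a small sketch. This is not the LeMOL method, which the abstract does not specify; it is a toy comparison under assumed conditions. The drifting opponent (`opponent_prob`), the two-action setting, the `decay` parameter, and the function names are all hypothetical. A count-based model that treats the opponent as stationary is compared with a recency-weighted estimate when the opponent's policy changes over time.

```python
import random


def opponent_prob(t, horizon):
    """Hypothetical learning opponent: P(action = 1) drifts linearly from 0.1 to 0.9."""
    return 0.1 + 0.8 * t / (horizon - 1)


def compare_models(horizon=2000, decay=0.98, seed=0):
    """Return the mean absolute prediction error of two simple opponent models.

    stationary model: Laplace-smoothed action frequencies over all history,
                      i.e. it assumes the opponent's policy never changes.
    recency model:    exponential moving average of observed actions,
                      a simple (assumed) way to follow a changing opponent.
    """
    rng = random.Random(seed)
    ones, total = 1.0, 2.0  # smoothed counts: one pseudo-observation per action
    ema = 0.5               # uninformed initial estimate of P(action = 1)
    err_stationary = 0.0
    err_recency = 0.0
    for t in range(horizon):
        p = opponent_prob(t, horizon)
        # Score both models before observing the opponent's action at step t.
        err_stationary += abs(ones / total - p)
        err_recency += abs(ema - p)
        # Observe the opponent's sampled action and update both models.
        action = 1 if rng.random() < p else 0
        ones += action
        total += 1.0
        ema = decay * ema + (1.0 - decay) * action
    return err_stationary / horizon, err_recency / horizon


if __name__ == "__main__":
    stationary_err, recency_err = compare_models()
    print(f"stationary model error: {stationary_err:.3f}")
    print(f"recency model error:    {recency_err:.3f}")
```

In this toy setting the stationary model averages over the opponent's entire history, so its estimate lags further behind as the opponent's policy drifts. The recency-weighted estimate follows the drift more closely, at the cost of noisier predictions.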
Pages: 13771-13772
Page count: 2