Distributional Deep Reinforcement Learning with a Mixture of Gaussians

Cited by: 0
Authors:
Choi, Yunho [1 ,2 ]
Lee, Kyungjae [1 ,2 ]
Oh, Songhwai [1 ,2 ]
Affiliations:
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea
[2] Seoul Natl Univ, ASRI, Seoul 08826, South Korea
Source:
2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019
Funding:
National Research Foundation of Singapore;
DOI:
10.1109/icra.2019.8793505
Chinese Library Classification:
TP [Automation Technology, Computer Technology];
Subject Classification Code:
0812;
Abstract:
In this paper, we propose a novel distributional reinforcement learning (RL) method which models the distribution of the sum of rewards using a mixture density network. Recently, it has been shown that modeling the randomness of the return distribution leads to better performance in Atari games and control tasks. Despite the success of the prior work, it has limitations which come from the use of a discrete distribution. First, it needs a projection step and a softmax parametrization for the distribution, since it minimizes the KL divergence loss. Second, its performance depends on discretization hyperparameters, such as the number of atoms and the bounds of the support, which require domain knowledge. We mitigate these problems with the proposed parameterization, a mixture of Gaussians. Furthermore, we propose a new distance metric called the Jensen-Tsallis distance, which allows the computation of the distance between two mixtures of Gaussians in a closed form. We have conducted various experiments to validate the proposed method, including Atari games and autonomous vehicle driving.
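The closed-form property mentioned in the abstract can be illustrated for the order-2 case. With the common definition of the Tsallis entropy of order 2, T_2(p) = 1 - ∫p(x)² dx, the Jensen-Tsallis divergence between p and q reduces algebraically to (1/4)∫(p(x) - q(x))² dx, and every term expands into integrals of products of Gaussians, which are available in closed form. The sketch below is our own minimal illustration of that computation for scalar return distributions, not the authors' implementation; the function names and the choice of order q = 2 are assumptions.

```python
import numpy as np

def gauss_overlap(m1, s1, m2, s2):
    # Closed form of the integral of N(x; m1, s1^2) * N(x; m2, s2^2) over x:
    # a Gaussian density in (m1 - m2) with variance s1^2 + s2^2.
    var = s1 ** 2 + s2 ** 2
    return np.exp(-(m1 - m2) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def jt2_distance(w_p, mu_p, sig_p, w_q, mu_q, sig_q):
    # Jensen-Tsallis divergence of order 2 between two Gaussian mixtures
    # p = sum_i w_p[i] N(mu_p[i], sig_p[i]^2) and q likewise.
    # Algebraically this equals (1/4) * integral of (p - q)^2, so it needs
    # only pairwise Gaussian overlap integrals -- no sampling, no atoms.
    def cross(wa, ma, sa, wb, mb, sb):
        return sum(wa[i] * wb[j] * gauss_overlap(ma[i], sa[i], mb[j], sb[j])
                   for i in range(len(wa)) for j in range(len(wb)))
    return 0.25 * (cross(w_p, mu_p, sig_p, w_p, mu_p, sig_p)
                   - 2.0 * cross(w_p, mu_p, sig_p, w_q, mu_q, sig_q)
                   + cross(w_q, mu_q, sig_q, w_q, mu_q, sig_q))
```

The distance is zero for identical mixtures, symmetric, and differentiable in the mixture parameters, which is what makes it usable as a loss for a mixture density network head in place of the projected KL loss used with discrete distributions.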
Pages: 9791-9797
Number of pages: 7