- [21] Schulman J, Moritz P, Levine S, et al., High-dimensional continuous control using generalized advantage estimation [J], (2015)
- [22] Cong Fei, Wang Bin, Yuzheng Zhuang, et al., Triple-GAIL: A multi-modal imitation learning framework with generative adversarial nets [C], Proc of the 29th Int Joint Conf on Artificial Intelligence, pp. 2929-2935, (2020)
- [23] Sion M., On general minimax theorems [J], Pacific Journal of Mathematics, 8, 1, pp. 171-176, (1958)
- [24] Todorov E, Erez T, Tassa Y., MuJoCo: A physics engine for model-based control [C], Proc of the 2012 IEEE/RSJ Int Conf on Intelligent Robots and Systems, pp. 5026-5033, (2012)
- [25] Jiacheng Zhu, Chong Jiang, TAC-GAIL: A multi-modal imitation learning method [C], Proc of the 27th Int Conf on Neural Information Processing, pp. 688-699, (2020)
- [26] Haarnoja T, Zhou A, Abbeel P, et al., Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor [C], Proc of the 35th Int Conf on Machine Learning, pp. 1861-1870, (2018)
- [27] Schulman J, Levine S, Abbeel P, et al., Trust region policy optimization [C], Proc of the 32nd Int Conf on Machine Learning, pp. 1889-1897, (2015)
- [28] Hongwei Tan, Linyong Zhou, Guodong Wang, et al., Instability analysis for generative adversarial networks and its solving techniques [J], SCIENTIA SINICA Informationis, 51, 4, (2021)