Differentially Private Reinforcement Learning with Linear Function Approximation

Cited by: 4
Authors
Zhou, Xingyu [1]
Affiliations
[1] Wayne State Univ, 5050 Anthony Wayne Dr, Detroit, MI 48202 USA
Keywords
reinforcement learning; differential privacy; linear function approximations; ALGORITHMS; DESIGN;
DOI
10.1145/3508028
Chinese Library Classification
TP3 [computing technology; computer technology]
Discipline code
0812
Abstract
Motivated by the wide adoption of reinforcement learning (RL) in real-world personalized services, where users' sensitive and private information needs to be protected, we study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP). Compared to existing private RL algorithms that work only on tabular finite-state, finite-action MDPs, we take the first step towards privacy-preserving learning in MDPs with large state and action spaces. Specifically, we consider MDPs with linear function approximation (in particular, linear mixture MDPs) under the notion of joint differential privacy (JDP), where the RL agent is responsible for protecting users' sensitive data. We design two private RL algorithms, based on value iteration and policy optimization respectively, and show that they enjoy sub-linear regret while guaranteeing privacy protection. Moreover, the regret bounds are independent of the number of states and scale at most logarithmically with the number of actions, making the algorithms suitable for privacy protection in today's large-scale personalized services. Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers, which not only generalizes previous results for non-private learning, but also serves as a building block for general private reinforcement learning.
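For illustration only, and not the paper's algorithm: value-iteration-style methods for linear mixture MDPs build on regularized least-squares estimates of the mixture parameters, and one common way to privatize such estimates is to perturb the Gram matrix and the feature-target vector with Gaussian noise before solving. The minimal Python sketch below shows this idea under stated assumptions; the function name private_linear_estimate, the fixed noise_scale, and the eigenvalue shift used to keep the perturbed matrix positive definite are hypothetical choices, whereas a JDP-compliant algorithm would calibrate the noise to the privacy parameters and the number of statistic releases.

# Minimal sketch (assumptions as noted above): a privatized ridge-regression update of
# the kind underlying value-iteration-style learning in linear mixture MDPs. Not the
# paper's exact mechanism; noise_scale is a free parameter here, not a DP calibration.
import numpy as np

def private_linear_estimate(features, targets, reg=1.0, noise_scale=1.0, rng=None):
    """Ridge estimate from per-episode statistics with Gaussian perturbation.

    features : (K, d) array of feature vectors (assumed bounded norm)
    targets  : (K,) array of regression targets
    """
    rng = np.random.default_rng() if rng is None else rng
    _, d = features.shape

    # Non-private sufficient statistics: regularized Gram matrix and feature-target vector.
    gram = features.T @ features + reg * np.eye(d)
    vec = features.T @ targets

    # Symmetric Gaussian perturbation of the Gram matrix, Gaussian noise on the vector.
    noise = rng.normal(scale=noise_scale, size=(d, d))
    gram_priv = gram + (noise + noise.T) / 2.0
    vec_priv = vec + rng.normal(scale=noise_scale, size=d)

    # Shift by a multiple of the identity so the perturbed matrix stays positive
    # definite, a standard step when working with noisy Gram matrices.
    shift = max(0.0, -float(np.linalg.eigvalsh(gram_priv).min())) + 1e-3
    gram_priv = gram_priv + shift * np.eye(d)

    return np.linalg.solve(gram_priv, vec_priv)

# Toy usage: recover a 5-dimensional parameter from 200 simulated episodes.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta_true = rng.normal(size=5)
    X = rng.normal(size=(200, 5)) / np.sqrt(5)
    y = X @ theta_true + 0.1 * rng.normal(size=200)
    print(private_linear_estimate(X, y, noise_scale=0.5, rng=rng))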
Pages: 27