Differentially Private Reinforcement Learning with Linear Function Approximation

Cited by: 4
Authors
Zhou, Xingyu [1]
Affiliations
[1] Wayne State Univ, 5050 Anthony Wayne Dr, Detroit, MI 48202 USA
Keywords
reinforcement learning; differential privacy; linear function approximations; algorithms; design
DOI
10.1145/3508028
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Motivated by the wide adoption of reinforcement learning (RL) in real-world personalized services, where users' sensitive and private information needs to be protected, we study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP). Compared to existing private RL algorithms that work only on tabular MDPs with finite state and action spaces, we take the first step towards privacy-preserving learning in MDPs with large state and action spaces. Specifically, we consider MDPs with linear function approximation (in particular linear mixture MDPs) under the notion of joint differential privacy (JDP), where the RL agent is responsible for protecting users' sensitive data. We design two private RL algorithms that are based on value iteration and policy optimization, respectively, and show that they enjoy sub-linear regret performance while guaranteeing privacy protection. Moreover, the regret bounds are independent of the number of states, and scale at most logarithmically with the number of actions, making the algorithms suitable for privacy protection in today's large-scale personalized services. Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers, which not only generalizes previous results for non-private learning, but also serves as a building block for general private reinforcement learning.
Pages: 27