Model-Free Learning of Optimal Ergodic Policies in Wireless Systems

Cited by: 7
Authors
Kalogerias, Dionysios S. [1 ]
Eisen, Mark [3 ]
Pappas, George J. [2 ]
Ribeiro, Alejandro [2 ]
Affiliations
[1] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
[2] Univ Penn, Dept Elect Syst Engn, Philadelphia, PA 19104 USA
[3] Intel Corp, Hillsboro, OR 97124 USA
Keywords
Wireless communication; Resource management; Smoothing methods; Stochastic processes; Fading channels; Approximation algorithms; Signal processing algorithms; Wireless systems; stochastic resource allocation; zeroth-order optimization; constrained nonconvex optimization; deep learning; Lagrangian duality; strong duality; RESOURCE-ALLOCATION; POWER ALLOCATION; NETWORKS; OPTIMIZATION; ACCESS;
DOI
10.1109/TSP.2020.3030073
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Learning optimal resource allocation policies in wireless systems can be effectively achieved by formulating finite dimensional constrained programs which depend on system configuration, as well as the adopted learning parameterization. The interest here is in cases where system models are unavailable, prompting methods that probe the wireless system with candidate policies, and then use observed performance to determine better policies. This generic procedure is difficult because of the need to cull accurate gradient estimates out of these limited system queries. This article constructs and exploits smoothed surrogates of constrained ergodic resource allocation problems, the gradients of the former being representable exactly as averages of finite differences that can be obtained through limited system probing. Leveraging this unique property, we develop a new model-free primal-dual algorithm for learning optimal ergodic resource allocations, while we rigorously analyze the relationships between original policy search problems and their surrogates, in both primal and dual domains. First, we show that both primal and dual domain surrogates are uniformly consistent approximations of their corresponding original finite dimensional counterparts. Upon further assuming the use of near-universal policy parameterizations, we also develop explicit bounds on the gap between optimal values of initial, infinite dimensional resource allocation problems, and dual values of their parameterized smoothed surrogates. In fact, we show that this duality gap decreases at a linear rate relative to smoothing and universality parameters. Thus, it can be made arbitrarily small at will, also justifying our proposed primal-dual algorithmic recipe. Numerical simulations confirm the effectiveness of our approach.
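The key property exploited above is that the gradient of the smoothed surrogate can be written exactly as an average of finite differences of the original (unknown) objective, so it can be estimated purely from system probes. A minimal sketch of such a zeroth-order estimator, using a generic two-point Gaussian-smoothing construction (the function name and parameters here are illustrative, not from the paper):

```python
import numpy as np

def zeroth_order_grad(f, x, mu=1e-2, num_probes=32, rng=None):
    """Two-point zeroth-order estimate of the gradient of the
    Gaussian-smoothed surrogate f_mu(x) = E[f(x + mu * u)], u ~ N(0, I).

    Each probe queries the system twice (at x + mu*u and x - mu*u);
    the average of the scaled finite differences is an unbiased
    estimate of grad f_mu(x), requiring no model of f itself.
    """
    rng = np.random.default_rng(rng)
    g = np.zeros_like(x, dtype=float)
    for _ in range(num_probes):
        u = rng.standard_normal(x.shape)          # random probing direction
        delta = f(x + mu * u) - f(x - mu * u)     # two system queries
        g += (delta / (2.0 * mu)) * u             # finite-difference sample
    return g / num_probes
```

In a primal-dual scheme of the kind described in the abstract, an estimator like this would stand in for the true gradients of both objective and constraint functions in the Lagrangian updates, with the smoothing parameter `mu` controlling the surrogate's bias.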
Pages: 6272-6286
Page count: 15