Adaptive quantized online distributed mirror descent algorithm with Bandit feedback

Authors
Xie J.-R. [1]
Gao W.-H. [1]
Xie Y.-B. [1]
Affiliations
[1] School of Mathematics, South China University of Technology, Guangzhou, Guangdong
Funding
National Natural Science Foundation of China
Keywords
Bandit feedback; mirror descent algorithm; multi-agent systems; optimization; quantization
DOI
10.7641/CTA.2022.20152
Abstract
Online distributed optimization of multi-agent systems is often used to solve optimization problems in dynamic environments, where real-time data streams must be transmitted between nodes. In many cases, a node cannot access full information about its individual objective function (including gradient information), and communication between nodes is subject to constraints. This paper builds on the advantages of the mirror descent algorithm, which uses non-Euclidean projections, in handling high-dimensional data and large-scale online learning. The missing gradient information is estimated from the values of the individual objective function at two points, an adaptive quantizer is designed according to the properties of the mirror descent algorithm, and an adaptive quantized online distributed mirror descent algorithm with bandit feedback is proposed. The relationship between the quantization error bound and the regret bound is then analyzed. With appropriately chosen parameters, the proposed algorithm attains a regret bound of O(√T). Finally, numerical simulations verify the effectiveness of the algorithm and the theoretical results. © 2023 South China University of Technology. All rights reserved.
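The three ingredients named in the abstract (a two-point gradient estimate, a mirror descent update, and a quantizer) can be illustrated with a minimal single-node sketch. This is not the paper's algorithm. The function names, the random-direction form of the estimator, the negative-entropy mirror map on the probability simplex, and the uniform quantizer with a caller-supplied range are all assumptions chosen for illustration. The sketch also ignores that the two query points may fall slightly outside the constraint set.

```python
import numpy as np


def two_point_gradient(f, x, delta, rng):
    """Estimate the gradient of f at x from two function values.

    Samples a random unit direction u and returns
    (d / (2 * delta)) * (f(x + delta*u) - f(x - delta*u)) * u.
    """
    d = x.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    diff = f(x + delta * u) - f(x - delta * u)
    return (d / (2.0 * delta)) * diff * u


def entropic_mirror_step(x, g, eta):
    """One mirror descent step on the probability simplex.

    Uses the negative-entropy mirror map (an assumed choice), which
    gives a multiplicative update followed by normalization.
    """
    # Subtracting g.min() only rescales w; normalization cancels it.
    w = x * np.exp(-eta * (g - g.min()))
    return w / w.sum()


def uniform_quantize(v, center, radius, levels):
    """Round v to a uniform grid on [center - radius, center + radius].

    An adaptive scheme would change center and radius over time;
    here they are plain arguments (hypothetical interface).
    """
    low = center - radius
    step = 2.0 * radius / (levels - 1)
    clipped = np.clip(v, low, center + radius)
    return low + step * np.round((clipped - low) / step)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.array([0.7, 0.2, 0.1])  # illustrative loss minimizer

    def loss(z):
        return 0.5 * np.sum((z - target) ** 2)

    x = np.full(3, 1.0 / 3.0)
    for t in range(1, 2001):
        g = two_point_gradient(loss, x, delta=1e-3, rng=rng)
        x = entropic_mirror_step(x, g, eta=0.5 / np.sqrt(t))
    # Value a node might transmit, quantized to 9 levels on [0, 1].
    print(x, uniform_quantize(x, center=0.5, radius=0.5, levels=9))
```

For a quadratic loss the two-point difference is exact in the sampled direction, so averaging many estimates recovers the true gradient. In a distributed setting each node would exchange only the quantized values with its neighbors.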
Pages: 1774-1782
Page count: 8