On Private and Robust Bandits

Cited by: 0

Authors:

Wu, Yulian [1]

Zhou, Xingyu [2]

Tao, Youming [3]

Wang, Di [1]

Affiliations:

[1] KAUST, Thuwal, Saudi Arabia

[2] Wayne State Univ, Wayne, NJ USA

[3] Shandong Univ, Jinan, Peoples R China

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

Keywords:

MULTIARMED BANDIT;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We study private and robust multi-armed bandits (MABs), where the agent receives Huber-contaminated heavy-tailed rewards and must simultaneously ensure differential privacy. We consider both the finite k-th raw moment and the finite k-th central moment settings for heavy-tailed reward distributions with k >= 2. We first present a minimax lower bound for this problem, characterizing the information-theoretic limit of regret with respect to privacy budget, contamination level, and heavy-tailedness. Then, we propose a meta-algorithm that builds on a private and robust mean estimation sub-routine, PRM, which essentially relies on reward truncation and the Laplace mechanism. For the two heavy-tailed settings above, we give corresponding schemes of PRM, which enable us to achieve nearly optimal regret. Moreover, our two proposed PRM schemes, truncation-based and histogram-based, achieve the optimal trade-off between estimation accuracy, privacy, and robustness. Finally, we support our theoretical results and show the effectiveness of our algorithms with experimental studies.
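The abstract names reward truncation and the Laplace mechanism as the two ingredients of the mean-estimation sub-routine. The following is a minimal generic sketch of how those two ingredients are commonly combined; it is not the authors' PRM. The function name `truncated_laplace_mean`, its parameters, the choice to zero out rewards beyond the threshold, and the fixed threshold are all assumptions made for illustration; this record does not specify how the paper sets the threshold or handles contamination.

```python
import numpy as np


def truncated_laplace_mean(rewards, threshold, epsilon, rng=None):
    """Hypothetical helper: truncate rewards, average, then add Laplace noise.

    Assumption: "truncation" zeroes out any reward whose magnitude exceeds
    `threshold`, so every retained value lies in [-threshold, threshold].
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(rewards, dtype=float)
    if x.size == 0:
        raise ValueError("rewards must be non-empty")
    if threshold <= 0 or epsilon <= 0:
        raise ValueError("threshold and epsilon must be positive")

    # Truncation limits the influence of heavy-tailed or contaminated samples.
    truncated = np.where(np.abs(x) <= threshold, x, 0.0)

    # Replacing one sample changes the truncated mean by at most 2*threshold/n,
    # so Laplace noise with scale sensitivity/epsilon gives epsilon-DP
    # for this single release.
    sensitivity = 2.0 * threshold / x.size
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(truncated.mean() + noise)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Heavy-tailed samples (Student-t with 3 degrees of freedom) plus one outlier.
    samples = np.append(rng.standard_t(df=3, size=999), 1e6)
    print(truncated_laplace_mean(samples, threshold=10.0, epsilon=1.0, rng=rng))
```

A larger threshold reduces truncation bias but increases the noise scale, which is one way to see the accuracy, privacy, and robustness trade-off mentioned in the abstract.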


Pages: 13
