Multi-objective Bandits: Optimizing the Generalized Gini Index

Cited by: 0

Authors:

Busa-Fekete, Robert ^{[1]}

Szorenyi, Balazs ^{[2,3]}

Weng, Paul ^{[4,5]}

Mannor, Shie ^{[3]}

Affiliations:

[1] Yahoo Res, New York, NY USA

[2] Hungarian Acad Sci & Univ Szeged, Res Grp AI, Szeged, Hungary

[3] Technion Israel Inst Technol, Haifa, Israel

[4] SYSU, SEIT, SYSU CMU JIE, Guangzhou, Peoples R China

[5] SYSU CMU JRI, Shunde, Peoples R China

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70 | 2017 / Volume 70

Funding:

European Research Council;

Keywords:

INEQUALITY; MODELS;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We study the multi-armed bandit (MAB) problem where the agent receives vectorial feedback that encodes many possibly competing objectives to be optimized. The goal of the agent is to find a policy that optimizes these objectives simultaneously in a fair way. This multi-objective online optimization problem is formalized using the Generalized Gini Index (GGI) aggregation function. We propose an online gradient descent algorithm that exploits the convexity of the GGI aggregation function and controls the exploration carefully, achieving a distribution-free regret of $\tilde{O}(T^{-1/2})$ with high probability. We test our algorithm on synthetic data as well as on an electric battery control problem, where the goal is to trade off the use of the different cells of a battery in order to balance their respective degradation rates.
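
The abstract does not spell out the GGI aggregation function. As a minimal sketch, assuming the commonly used definition (a weighted sum of the objective costs sorted in decreasing order, with non-increasing weights, so that the worst-off objective counts most), it could be computed as below. The function name `ggi`, the example weights, and the cost vectors are illustrative assumptions, not taken from the record or the paper.

```python
def ggi(costs, weights):
    """Generalized Gini Index of a cost vector (lower is better).

    Assumed definition: sum_d weights[d] * costs_sorted[d], where
    costs_sorted is `costs` in decreasing order and `weights` is
    non-increasing, so the largest cost receives the largest weight.
    """
    if len(costs) != len(weights):
        raise ValueError("costs and weights must have the same length")
    if any(later > earlier for earlier, later in zip(weights, weights[1:])):
        raise ValueError("weights must be non-increasing")
    ordered = sorted(costs, reverse=True)  # worst objective first
    return sum(w * c for w, c in zip(weights, ordered))


# Hypothetical weights and cost vectors with the same total cost (6.0).
weights = [1.0, 0.5, 0.25]
balanced = ggi([2.0, 2.0, 2.0], weights)
unbalanced = ggi([4.0, 1.0, 1.0], weights)
permuted = ggi([1.0, 4.0, 1.0], weights)
```

Under these assumptions the balanced vector gets the lower (better) score even though both vectors sum to the same total, which is the sense in which the aggregation favours fair solutions. The score also does not depend on the order of the objectives.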


Pages: 10


[1] Multi-Objective Generalized Linear Bandits
Lu, Shiyin
Wang, Guanghui
Hu, Yao
Zhang, Lijun
PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3080 - 3086
[2] MULTI-OBJECTIVE CONTEXTUAL BANDITS WITH A DOMINANT OBJECTIVE
Tekin, Cem
Turgay, Eralp
2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,
[3] Contextual Bandits for Multi-Objective Recommender Systems
Lacerda, Anisio
2015 BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS 2015), 2015, : 68 - 73
[4] Blending Controllers via Multi-Objective Bandits
Gohari, Parham
Djeumou, Franck
Vinod, Abraham P.
Topcu, Ufuk
2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 88 - 95
[5] Multi-Objective X-Armed Bandits
Van Moffaert, Kristof
Van Vaerenbergh, Kevin
Vrancx, Peter
Nowe, Ann
PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 2331 - 2338
[6] Multi-Objective Ranked Bandits for Recommender Systems
Lacerda, Anisio
NEUROCOMPUTING, 2017, 246 : 12 - 24
[7] Sequential Learning of the Pareto Front for Multi-objective Bandits
Crepon, Elise
Garivier, Aurelien
Koolen, Wouter M.
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
[8] PAC models in stochastic multi-objective multi-armed bandits
Drugan, Madalina M.
PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'17), 2017, : 409 - 416
[9] Designing multi-objective multi-armed bandits algorithms: a study
Drugan, Madalina M.
Nowe, Ann
2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013,
[10] Hierarchize Pareto Dominance in Multi-Objective Stochastic Linear Bandits
Cheng, Ji
Xue, Bo
Yi, Jiaxiang
Zhang, Qingfu
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10, 2024, : 11489 - 11497
