Multi-objective Bandits: Optimizing the Generalized Gini Index

Citations: 0
Authors
Busa-Fekete, Robert [1]
Szorenyi, Balazs [2, 3]
Weng, Paul [4, 5]
Mannor, Shie [3]
Affiliations
[1] Yahoo Res, New York, NY USA
[2] Hungarian Acad Sci & Univ Szeged, Res Grp AI, Szeged, Hungary
[3] Technion Israel Inst Technol, Haifa, Israel
[4] SYSU, SEIT, SYSU CMU JIE, Guangzhou, Peoples R China
[5] SYSU CMU JRI, Shunde, Peoples R China
Funding
European Research Council
Keywords
INEQUALITY; MODELS;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the multi-armed bandit (MAB) problem where the agent receives a vectorial feedback that encodes many possibly competing objectives to be optimized. The goal of the agent is to find a policy that optimizes these objectives simultaneously in a fair way. This multi-objective online optimization problem is formalized using the Generalized Gini Index (GGI) aggregation function. We propose an online gradient descent algorithm that exploits the convexity of the GGI aggregation function and controls the exploration carefully, achieving a distribution-free regret of $\tilde{O}(T^{-1/2})$ with high probability. We test our algorithm on synthetic data as well as on an electric battery control problem, where the goal is to trade off the use of the different cells of a battery in order to balance their respective degradation rates.
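The abstract does not spell out the GGI aggregation function. As a rough illustration only, the sketch below assumes the commonly used form for cost vectors: sort the costs in decreasing order and take a weighted sum with non-increasing weights, so the worst-off objective counts most. The function names `ggi` and `ggi_subgradient` and the weights `[1, 0.5, 0.25]` are hypothetical choices for this example, and the code covers only the aggregation step, not the authors' online gradient descent algorithm.

```python
def ggi(costs, weights):
    """Generalized Gini Index of a cost vector (assumed form).

    costs   : per-objective costs (lower is better)
    weights : non-increasing positive weights, same length as costs
    """
    # Largest cost is paired with the largest weight.
    ordered = sorted(costs, reverse=True)
    return sum(w * c for w, c in zip(weights, ordered))


def ggi_subgradient(costs, weights):
    """One subgradient of the GGI at `costs`.

    With non-increasing weights, the GGI is a maximum of linear
    functions (one per permutation of the weights), hence convex;
    the weights arranged by the rank of each cost give a subgradient.
    """
    # Indices of the costs from largest to smallest.
    order = sorted(range(len(costs)), key=lambda i: costs[i], reverse=True)
    grad = [0.0] * len(costs)
    for rank, idx in enumerate(order):
        grad[idx] = weights[rank]
    return grad


weights = [1.0, 0.5, 0.25]           # illustrative weights, an assumption
balanced = ggi([2.0, 2.0, 2.0], weights)
unbalanced = ggi([4.0, 1.0, 1.0], weights)
```

Both cost vectors sum to 6, but under these assumed weights the unbalanced one receives the higher GGI, which is the sense in which minimizing the GGI favors fair trade-offs between objectives.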
Pages: 10
Related papers
50 in total
  • [1] Multi-Objective Generalized Linear Bandits
    Lu, Shiyin
    Wang, Guanghui
    Hu, Yao
    Zhang, Lijun
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3080 - 3086
  • [2] MULTI-OBJECTIVE CONTEXTUAL BANDITS WITH A DOMINANT OBJECTIVE
    Tekin, Cem
    Turgay, Eralp
    2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,
  • [3] Contextual Bandits for Multi-Objective Recommender Systems
    Lacerda, Anisio
    2015 BRAZILIAN CONFERENCE ON INTELLIGENT SYSTEMS (BRACIS 2015), 2015, : 68 - 73
  • [4] Blending Controllers via Multi-Objective Bandits
    Gohari, Parham
    Djeumou, Franck
    Vinod, Abraham P.
    Topcu, Ufuk
    2022 AMERICAN CONTROL CONFERENCE, ACC, 2022, : 88 - 95
  • [5] Multi-Objective X-Armed Bandits
    Van Moffaert, Kristof
    Van Vaerenbergh, Kevin
    Vrancx, Peter
    Nowe, Ann
    PROCEEDINGS OF THE 2014 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2014, : 2331 - 2338
  • [6] Multi-Objective Ranked Bandits for Recommender Systems
    Lacerda, Anisio
    NEUROCOMPUTING, 2017, 246 : 12 - 24
  • [7] Sequential Learning of the Pareto Front for Multi-objective Bandits
    Crepon, Elise
    Garivier, Aurelien
    Koolen, Wouter M.
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [8] PAC models in stochastic multi-objective multi-armed bandits
    Drugan, Madalina M.
    PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'17), 2017, : 409 - 416
  • [9] Designing multi-objective multi-armed bandits algorithms: a study
    Drugan, Madalina M.
    Nowe, Ann
    2013 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2013,
  • [10] Hierarchize Pareto Dominance in Multi-Objective Stochastic Linear Bandits
    Cheng, Ji
    Xue, Bo
    Yi, Jiaxiang
    Zhang, Qingfu
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10, 2024, : 11489 - 11497