Multi-model Optimization with Discounted Reward and Budget Constraint

Cited by: 1
Authors
Shi, Jixuan [1]
Chen, Mei [1]
Affiliations
[1] Beijing Neucloud Technol Co Ltd, Big Data Sci Lab, Beijing, Peoples R China
Keywords
Multi-armed bandit; non-stationary environment; budget constraint; online learning
DOI
10.1145/3208788.3208796
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-armed bandit (MAB) algorithms are widely used in gaming, gambling, policy generation, and artificial intelligence projects, and have recently attracted growing attention. In this paper, we explore the MAB problem with non-stationary rewards and a limited query budget. We propose DUCB-BF, an upper confidence bound (UCB) based algorithm for the discounted, budget-finite MAB problem, which uses the reward-cost ratio instead of the arm reward in the discounted empirical average. To estimate the instantaneous expected reward-cost ratio, the DUCB-BF policy averages past rewards with a discount factor that gives more weight to recent observations. A theoretical regret bound is established with proof and shown to outperform those of other MAB algorithms. A real application to the refinement of maintenance recovery models is explored. In a comparison of 4 other MAB algorithms with DUCB-BF, DUCB-BF yields the lowest regret, as expected.
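
The abstract gives no pseudocode, so the following Python is only a minimal sketch of the idea it describes: a discounted UCB index computed on reward-cost ratios, run until the budget is spent. It is not the authors' DUCB-BF implementation. The function name `discounted_ratio_ucb`, the parameters `gamma` and `xi`, the exact form of the exploration bonus, the stopping rule, and the two-arm environment `demo_pull` are all assumptions made for illustration.

```python
import math


def discounted_ratio_ucb(pull, n_arms, budget, gamma=0.95, xi=0.6):
    """Discounted-UCB policy on reward/cost ratios, run until the budget is spent.

    pull(arm, t) -> (reward, cost); reward in [0, 1] and cost > 0 are assumed.
    Returns the list of arms played and the total reward collected.
    """
    disc_ratio = [0.0] * n_arms  # discounted sum of observed reward/cost ratios
    disc_count = [0.0] * n_arms  # discounted number of pulls per arm
    history, total_reward, t = [], 0.0, 0

    while budget > 0:
        if t < n_arms:
            arm = t  # play every arm once to initialise its statistics
        else:
            log_n = math.log(sum(disc_count))
            # Index = discounted mean ratio + exploration bonus (assumed form).
            scores = [
                disc_ratio[i] / disc_count[i]
                + math.sqrt(xi * log_n / disc_count[i])
                for i in range(n_arms)
            ]
            arm = max(range(n_arms), key=scores.__getitem__)

        reward, cost = pull(arm, t)
        if cost > budget:
            break  # assumed stopping rule: stop at the first unaffordable pull

        budget -= cost
        total_reward += reward
        # Discount all past observations, then record the new one.
        for i in range(n_arms):
            disc_ratio[i] *= gamma
            disc_count[i] *= gamma
        disc_ratio[arm] += reward / cost
        disc_count[arm] += 1.0
        history.append(arm)
        t += 1

    return history, total_reward


# Hypothetical non-stationary environment: arm 0 has the better
# reward-cost ratio until step 200, after which arm 1 is better.
DEMO_COSTS = [2.0, 1.0]


def demo_pull(arm, t):
    if arm == 0:
        reward = 0.9 if t < 200 else 0.2
    else:
        reward = 0.3
    return reward, DEMO_COSTS[arm]


history, total_reward = discounted_ratio_ucb(demo_pull, n_arms=2, budget=600.0)
print(len(history), round(total_reward, 2))
```

Because old observations are multiplied by `gamma` at every step, the estimate for arm 0 forgets its earlier, higher ratio after the change, and the policy shifts most of the remaining budget to arm 1.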
Pages: 10-14
Number of pages: 5
Related papers
50 in total
  • [1] Optimization of a Dynamic Supply Chain Model with Budget Constraint
    Badri, Hossein
    Afghahi, Babak
    WORLD CONGRESS ON ENGINEERING, WCE 2010, VOL III, 2010, : 2286 - 2291
  • [2] Multi-criteria multi-model design optimization
    Bestle, D
    Eberhard, P
    IUTAM SYMPOSIUM ON OPTIMIZATION OF MECHANICAL SYSTEMS, 1996, 43 : 33 - 40
  • [3] Optimal allocation of the EU carbon budget: A multi-model assessment
    Abrell, Jan
    Bilici, Sueheyb
    Blesl, Markus
    Fahl, Ulrich
    Kattelmann, Felix
    Kittel, Lena
    Kosch, Mirjam
    Luderer, Gunnar
    Marmullaku, Drin
    Pahle, Michael
    Pietzcker, Robert
    Rodrigues, Renato
    Siegle, Jonathan
    ENERGY STRATEGY REVIEWS, 2024, 51
  • [4] Linear multi-model time-optimization
    Boltyanski, V
    Poznyak, A
    OPTIMAL CONTROL APPLICATIONS & METHODS, 2002, 23 (03): : 141 - 161
  • [5] A multi-model approach to intravenous filter optimization
    Vassilevski, Y. V.
    Simakov, S. S.
    Kapranov, S. A.
    INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING, 2010, 26 (07) : 915 - 925
  • [6] Multi-model Transfer and Optimization for Cloze Task
    Tang, Jiahao
    Ling, Long
    Ma, Chenyu
    Zhang, Hanwen
    Huang, Jianqiang
    INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND ROBOTICS 2020, 2020, 11574
  • [7] Analysis of the global atmospheric background sulfur budget in a multi-model framework
    Brodowsky, Christina V.
    Sukhodolov, Timofei
    Chiodo, Gabriel
    Aquila, Valentina
    Bekki, Slimane
    Dhomse, Sandip S.
    Hoepfner, Michael
    Laakso, Anton
    Mann, Graham W.
    Niemeier, Ulrike
    Pitari, Giovanni
    Quaglia, Ilaria
    Rozanov, Eugene
    Schmidt, Anja
    Sekiya, Takashi
    Tilmes, Simone
    Timmreck, Claudia
    Vattioni, Sandro
    Visioni, Daniele
    Yu, Pengfei
    Zhu, Yunqian
    Peter, Thomas
    ATMOSPHERIC CHEMISTRY AND PHYSICS, 2024, 24 (09) : 5513 - 5548
  • [8] A Multi-Model Power Estimation Engine for Accuracy Optimization
    Klein, Felipe
    Araujo, G.
    Azevedo, Rodolfo
    Leao, Roberto
    dos Santos, Luiz C. V.
    ISLPED'07: PROCEEDINGS OF THE 2007 INTERNATIONAL SYMPOSIUM ON LOW POWER ELECTRONICS AND DESIGN, 2007, : 280 - 285
  • [9] A Global Optimization Approach to Robust Multi-Model Fitting
    Yu, Jin
    Chin, Tat-Jun
    Suter, David
    2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011,
  • [10] Optimization of multi-model ensemble forecasting of typhoon waves
    Pan, Shun-qi
    Fan, Yang-ming
    Chen, Jia-ming
    Kao, Chia-chuen
    WATER SCIENCE AND ENGINEERING, 2016, 9 (01) : 52 - 57