Bid optimization using maximum entropy reinforcement learning

Cited by: 3
Authors
Liu, Mengjuan [1]
Liu, Jinyu [1]
Hu, Zhengning [1]
Ge, Yuchen [1]
Nie, Xuyun [1]
Affiliations
[1] Univ Elect Sci & Technol China, Network & Data Secur Key Lab Sichuan Prov, Chengdu 610054, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Real-time bidding; Bidding strategy; Maximum entropy reinforcement learning
DOI
10.1016/j.neucom.2022.05.108
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Real-time bidding (RTB) has become a critical channel for online advertising. It allows advertisers to display their ads by bidding on individual ad impressions, so advertisers in RTB seek an optimal bidding strategy to improve their cost-efficiency. Unfortunately, optimizing the bidding strategy at the granularity of a single impression is challenging because the RTB environment is highly dynamic. In this paper, we focus on optimizing a single advertiser's bidding strategy using a stochastic reinforcement learning (RL) algorithm. First, we use a widely adopted linear bidding function to compute each impression's base price and then modify that price with a mutable adjustment factor, so that the bid reflects both the impression's value to the advertiser and the state of the RTB environment. Second, we use a maximum entropy RL algorithm (Soft Actor-Critic) to optimize each impression's adjustment factor, which overcomes the convergence problem of deterministic RL algorithms. Finally, we evaluate the proposed strategy on a benchmark dataset (iPinYou); the results show that it obtained the most clicks in 9 of 12 experiments compared with the baselines. © 2022 Elsevier B.V. All rights reserved.
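The two-step pricing idea in the abstract (a linear base price, then a per-impression adjustment factor) can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption for illustration: the class name `LinearBidder`, the parameters `base_bid`, `avg_pctr`, and `max_bid`, the common linear form `base_bid * pCTR / avg_pCTR`, and the multiplicative adjustment `(1 + adjustment)`. In the paper the adjustment factor comes from a Soft Actor-Critic policy; here it is simply passed in as a number.

```python
from dataclasses import dataclass


@dataclass
class LinearBidder:
    """Illustrative linear bidder with an adjustment factor (not the paper's code)."""

    base_bid: float          # hypothetical base bid tuned on historical data
    avg_pctr: float          # hypothetical average predicted click-through rate
    max_bid: float = 300.0   # hypothetical cap on any single bid

    def base_price(self, pctr: float) -> float:
        # Linear bidding: the price scales with the impression's predicted
        # CTR relative to the average predicted CTR.
        return self.base_bid * pctr / self.avg_pctr

    def bid(self, pctr: float, adjustment: float) -> float:
        # Assumed form: scale the base price by (1 + adjustment), where
        # adjustment lies in [-1, 1] (e.g. the output of a squashed policy).
        price = self.base_price(pctr) * (1.0 + adjustment)
        # Keep the bid non-negative and below the cap.
        return min(max(price, 0.0), self.max_bid)


bidder = LinearBidder(base_bid=100.0, avg_pctr=0.001)
neutral = bidder.bid(pctr=0.002, adjustment=0.0)   # base price unchanged
raised = bidder.bid(pctr=0.002, adjustment=0.5)    # pushed up, then capped
```

With an adjustment of zero the bidder falls back to plain linear bidding, so in this sketch the learned factor only has to capture how far the environment deviates from the static baseline.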
Pages: 529-543
Number of pages: 15