Bid optimization using maximum entropy reinforcement learning

Cited by: 3

Authors:

Liu, Mengjuan [1]

Liu, Jinyu [1]

Hu, Zhengning [1]

Ge, Yuchen [1]

Nie, Xuyun [1]

Affiliations:

[1] Univ Elect Sci & Technol China, Network & Data Secur Key Lab Sichuan Prov, Chengdu 610054, Peoples R China

Source:

NEUROCOMPUTING | 2022 / Vol. 501

Funding:

National Natural Science Foundation of China;

Keywords:

Real-time bidding; Bidding strategy; Maximum entropy reinforcement learning;

DOI:

10.1016/j.neucom.2022.05.108

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Real-time bidding (RTB) has become a critical way for online advertising. It allows advertisers to display their ads by bidding on ad impressions. Therefore, advertisers in RTB always seek an optimal bidding strategy to improve their cost-efficiency. Unfortunately, it is challenging to optimize the bidding strategy at the granularity of impression due to the highly dynamic nature of the RTB environment. In this paper, we focus on optimizing the single advertiser's bidding strategy using a stochastic reinforcement learning (RL) algorithm. Firstly, we utilize a widely adopted linear bidding function to compute every impression's base price and optimize it with a mutable adjustment factor, thus making the bidding price conform to not only the impression's value to the advertiser but also the RTB environment. Secondly, we use the maximum entropy RL algorithm (Soft Actor-Critic) to optimize every impression's adjustment factor to overcome the deterministic RL algorithm's convergence problem. Finally, we evaluate the proposed strategy on a benchmark dataset (iPinYou), and the results demonstrate it obtained the most click numbers in 9 of 12 experiments compared to baselines. (c) 2022 Elsevier B.V. All rights reserved.
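
The first step in the abstract, a linear base price scaled by an adjustment factor, can be illustrated with a minimal sketch. The record does not give the paper's exact formula, so everything here is an assumption: the base price follows the commonly used linear form (base bid times predicted click-through rate over average click-through rate), the adjustment is applied multiplicatively as `1 + adjustment`, and the names `LinearBidder`, `base_bid`, `avg_pctr`, and `max_bid` are hypothetical. In the described strategy the adjustment factor would come from the learned Soft Actor-Critic policy; here it is simply passed in as a number.

```python
from dataclasses import dataclass


@dataclass
class LinearBidder:
    """Illustrative linear bidder with a per-impression adjustment factor.

    All fields are hypothetical; the source record does not specify them.
    """
    base_bid: float          # assumed bid for an impression with average pCTR
    avg_pctr: float          # assumed average predicted click-through rate
    max_bid: float = 300.0   # assumed upper limit on any single bid

    def base_price(self, pctr: float) -> float:
        # Common linear form: the price grows in proportion to predicted CTR.
        return self.base_bid * pctr / self.avg_pctr

    def bid(self, pctr: float, adjustment: float) -> float:
        # Assumed multiplicative adjustment; the paper's form may differ.
        price = self.base_price(pctr) * (1.0 + adjustment)
        # Keep the final bid inside [0, max_bid].
        return min(max(price, 0.0), self.max_bid)


bidder = LinearBidder(base_bid=50.0, avg_pctr=0.001)
print(bidder.bid(pctr=0.002, adjustment=0.0))    # unadjusted base price
print(bidder.bid(pctr=0.002, adjustment=-0.25))  # bid lowered by the factor
```

Separating `base_price` from `bid` mirrors the split in the abstract: the linear function captures the impression's value to the advertiser, while the adjustment factor is the only quantity a policy would need to output to react to the auction environment.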


Pages: 529-543

Number of pages: 15

