Bid optimization using maximum entropy reinforcement learning

Cited by: 3

Authors:

Liu, Mengjuan [1]

Liu, Jinyu [1]

Hu, Zhengning [1]

Ge, Yuchen [1]

Nie, Xuyun [1]

Affiliation:

[1] Univ Elect Sci & Technol China, Network & Data Secur Key Lab Sichuan Prov, Chengdu 610054, Peoples R China

Source:

NEUROCOMPUTING | 2022 / Vol. 501

Funding:

National Natural Science Foundation of China;

Keywords:

Real-time bidding; Bidding strategy; Maximum entropy reinforcement learning;

DOI:

10.1016/j.neucom.2022.05.108

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Real-time bidding (RTB) has become a key channel for online advertising. It allows advertisers to display their ads by bidding on ad impressions, so advertisers in RTB seek an optimal bidding strategy to improve their cost-efficiency. Unfortunately, optimizing the bidding strategy at the granularity of individual impressions is challenging because the RTB environment is highly dynamic. In this paper, we focus on optimizing a single advertiser's bidding strategy using a stochastic reinforcement learning (RL) algorithm. First, we use a widely adopted linear bidding function to compute every impression's base price and optimize it with a mutable adjustment factor, so that the bidding price reflects not only the impression's value to the advertiser but also the RTB environment. Second, we use a maximum entropy RL algorithm (Soft Actor-Critic) to optimize every impression's adjustment factor, overcoming the convergence problem of deterministic RL algorithms. Finally, we evaluate the proposed strategy on a benchmark dataset (iPinYou); the results show that it obtained the most clicks in 9 of 12 experiments compared with the baselines. (c) 2022 Elsevier B.V. All rights reserved.
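The two-step pricing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the multiplicative form `base_price * (1 + adjustment)`, the tanh-squashed Gaussian sampler (a policy shape commonly paired with Soft Actor-Critic), and all names and numeric values below are assumptions made for illustration only.

```python
import math
import random


def linear_base_price(pctr: float, avg_pctr: float, base_bid: float) -> float:
    """Linear bidding: scale a base bid by the impression's relative predicted CTR."""
    return base_bid * pctr / avg_pctr


def sample_adjustment(mean: float, log_std: float, rng: random.Random) -> float:
    """Draw an adjustment factor from a tanh-squashed Gaussian (assumed policy shape).

    In a trained agent, mean and log_std would come from the actor network;
    here they are plain hypothetical inputs.
    """
    return math.tanh(rng.gauss(mean, math.exp(log_std)))


def adjusted_bid(base_price: float, adjustment: float, max_bid: float) -> float:
    """Apply the adjustment factor (assumed multiplicative form) and clip to [0, max_bid]."""
    return min(max(base_price * (1.0 + adjustment), 0.0), max_bid)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
base = linear_base_price(pctr=0.002, avg_pctr=0.001, base_bid=50.0)
a = sample_adjustment(mean=0.0, log_std=-1.0, rng=rng)
bid = adjusted_bid(base, a, max_bid=300.0)
print(f"base={base:.2f} adjustment={a:.3f} bid={bid:.2f}")
```

Sampling the adjustment factor, rather than emitting a single deterministic value, is what makes the policy stochastic; the entropy term that Soft Actor-Critic adds to the training objective is not shown here.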

Pages: 529-543

Page count: 15
