Bid optimization using maximum entropy reinforcement learning

Cited by: 3

Authors:

Liu, Mengjuan [1]

Liu, Jinyu [1]

Hu, Zhengning [1]

Ge, Yuchen [1]

Nie, Xuyun [1]

Affiliations:

[1] Univ Elect Sci & Technol China, Network & Data Secur Key Lab Sichuan Prov, Chengdu 610054, Peoples R China

Source:

NEUROCOMPUTING | 2022 / Vol. 501

Funding:

National Natural Science Foundation of China;

Keywords:

Real-time bidding; Bidding strategy; Maximum entropy reinforcement learning;

DOI:

10.1016/j.neucom.2022.05.108

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Real-time bidding (RTB) has become a critical channel for online advertising. It allows advertisers to display their ads by bidding on ad impressions, so advertisers in RTB seek an optimal bidding strategy to improve their cost-efficiency. Unfortunately, optimizing the bidding strategy at the granularity of individual impressions is challenging because the RTB environment is highly dynamic. In this paper, we focus on optimizing a single advertiser's bidding strategy using a stochastic reinforcement learning (RL) algorithm. First, we use a widely adopted linear bidding function to compute every impression's base price and optimize it with a mutable adjustment factor, so that the bid price reflects not only the impression's value to the advertiser but also the RTB environment. Second, we use the maximum entropy RL algorithm Soft Actor-Critic to optimize every impression's adjustment factor, which overcomes the convergence problem of deterministic RL algorithms. Finally, we evaluate the proposed strategy on a benchmark dataset (iPinYou); the results show that it obtained the most clicks in 9 of 12 experiments compared with the baselines. (c) 2022 Elsevier B.V. All rights reserved.
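
The abstract's first step, a linear base price scaled by a per-impression adjustment factor, can be illustrated with a minimal sketch. The record does not give the paper's exact formula, so everything here is an assumption made for illustration: the multiplicative form `base * (1 + adjustment)`, the clipping to a bid ceiling, the default ceiling value, and all function and parameter names. The Soft Actor-Critic policy that would produce `adjustment` is not implemented; the factor is simply passed in as a number.

```python
def linear_base_bid(pctr: float, avg_pctr: float, base_bid: float) -> float:
    """Linear bidding: scale a base bid by predicted CTR relative to the average CTR."""
    if avg_pctr <= 0:
        raise ValueError("avg_pctr must be positive")
    return base_bid * pctr / avg_pctr


def adjusted_bid(
    pctr: float,
    avg_pctr: float,
    base_bid: float,
    adjustment: float,
    max_bid: float = 300.0,  # hypothetical bid ceiling, not taken from the paper
) -> float:
    """Apply an adjustment factor to the linear base price.

    Assumption: the factor acts multiplicatively as (1 + adjustment) and the
    result is clipped to [0, max_bid]. In the paper this factor would come from
    a learned stochastic policy; here it is an ordinary argument.
    """
    price = linear_base_bid(pctr, avg_pctr, base_bid) * (1.0 + adjustment)
    return min(max(price, 0.0), max_bid)


if __name__ == "__main__":
    # An impression whose predicted CTR is twice the average, bid up by 50%.
    print(adjusted_bid(pctr=0.002, avg_pctr=0.001, base_bid=50.0, adjustment=0.5))
```

With `adjustment = 0` the sketch reduces to plain linear bidding, which shows why an adjustment factor is a natural quantity to learn: the value estimate stays in the base price, while the factor can react to market conditions.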


Pages: 529-543

Page count: 15

