Joint Strategy of Dynamic Ordering and Pricing for Competing Perishables with Q-Learning Algorithm

Cited by: 2

Authors:

Zheng, Jiangbo [1]

Gan, Yanhong [2]

Liang, Ying [3]

Jiang, Qingqing [1]

Chang, Jiatai [1]

Affiliations:

[1] Jinan Univ, Sch Management, Guangzhou 510632, Guangdong, Peoples R China

[2] South China Univ Technol, Sch Business Adm, Guangzhou 510640, Peoples R China

[3] South China Normal Univ, Sch Econ & Management, Guangzhou 510006, Guangdong, Peoples R China

Source:

WIRELESS COMMUNICATIONS & MOBILE COMPUTING | 2021 / Vol. 2021

Keywords:

INVENTORY CONTROL; POLICIES; REPLENISHMENT; MANAGEMENT

DOI:

10.1155/2021/6643195

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

We use Machine Learning (ML) to study firms' joint pricing and ordering decisions for perishables in a dynamic loop. The research assumption is as follows: at the beginning of each period, the retailer prices both the new and old products and determines how many new products to order, while at the end of each period, the retailer decides how much remaining inventory should be carried over to the next period. The objective is to determine a joint pricing, ordering, and disposal strategy to maximize the total expected discounted profit. We establish a decision model based on Markov processes and use the Q-learning algorithm to obtain a near-optimal policy. From numerical analysis, we find that (i) the optimal number of old products carried over to the next period depends on the upper quantitative bound for old inventory; (ii) the optimal prices for new products are positively related to potential demand but negatively related to the decay rate, while the optimal prices for old products have a positive relationship with both; and (iii) ordering decisions are unrelated to the quantity of old products. When the decay rate is low or the variable ordering cost is high, the optimal orders exhibit a trapezoidal decline as the quantity of new products increases.
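
The abstract's setup (an MDP over inventory states, solved with Q-learning) can be illustrated with a minimal tabular sketch. This is not the authors' model: the state here is only the old-inventory level, and the price grids, demand function, cost, and hyperparameters are all hypothetical values chosen for illustration.

```python
import random
from itertools import product

# All constants below are hypothetical, chosen only to keep the toy small.
MAX_OLD = 3                      # upper bound on old inventory
PRICES_NEW = (4.0, 5.0)          # price grid for new products
PRICES_OLD = (2.0, 3.0)          # price grid for old products
ORDERS = (0, 1, 2, 3)            # admissible order quantities
UNIT_COST = 1.5                  # variable ordering cost per unit
GAMMA = 0.9                      # discount factor
ALPHA = 0.1                      # learning rate
EPSILON = 0.1                    # exploration probability

# An action is (new price, old price, order quantity, max units carried over).
ACTIONS = list(product(PRICES_NEW, PRICES_OLD, ORDERS, range(MAX_OLD + 1)))


def step(old_stock, action, rng):
    """Simulate one period; return (profit, next old stock)."""
    p_new, p_old, order, carry = action
    # Assumed toy demand: linear in price with uniform noise.
    d_new = max(0, round(6 - p_new + rng.uniform(-1, 1)))
    d_old = max(0, round(4 - p_old + rng.uniform(-1, 1)))
    sold_new = min(order, d_new)
    sold_old = min(old_stock, d_old)
    profit = p_new * sold_new + p_old * sold_old - UNIT_COST * order
    # Unsold new units age into old stock, capped by the carry-over decision;
    # unsold old units are discarded.
    next_old = min(order - sold_new, carry, MAX_OLD)
    return profit, next_old


def q_learning(episodes=200, horizon=50, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(MAX_OLD + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(horizon):
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            reward, nxt = step(state, action, rng)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # Standard Q-learning update toward the bootstrapped target.
            td_target = reward + GAMMA * best_next
            q[(state, action)] += ALPHA * (td_target - q[(state, action)])
            state = nxt
    return q


def greedy_policy(q):
    """Map each old-inventory level to its highest-valued action."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(MAX_OLD + 1)}
```

Calling `greedy_policy(q_learning())` returns, for each old-inventory level, one joint decision of prices, order quantity, and carry-over cap. With so few training steps the result is only a rough approximation, even for this toy problem.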


Pages: 19

50 related records in total

[21] Adaptive job shop scheduling strategy based on weighted Q-learning algorithm
Yu-Fang Wang
Journal of Intelligent Manufacturing, 2020, 31 : 417 - 432
[22] Adaptive job shop scheduling strategy based on weighted Q-learning algorithm
Wang, Yu-Fang
JOURNAL OF INTELLIGENT MANUFACTURING, 2020, 31 (02) : 417 - 432
[23] Adaptive dynamic scheduling strategy in knowledgeable manufacturing based on improved Q-learning
Wang, Yu-Fang
Yan, Hong-Sen
Kongzhi yu Juece/Control and Decision, 2015, 30 (11) : 1930 - 1936
[24] DYNAMIC MAINTENANCE STRATEGY WITH Q-LEARNING FOR WORKSTATIONS IN A FLOW LINE MANUFACTURING SYSTEM
Kamarthi, Sagar
Zeid, Abe
Ozbek, Yusuf
PROCEEDINGS OF THE ASME INTERNATIONAL DESIGN ENGINEERING TECHNICAL CONFERENCES AND COMPUTERS AND INFORMATION IN ENGINEERING CONFERENCE, DETC 2010, VOL 3, A AND B, 2010, : 989 - +
[25] Reinforcement learning inspired forwarding strategy for information centric networks using Q-learning algorithm
Delvadia, Krishna
Dutta, Nitul
INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, 2024, 37 (06)
[26] Application of Improved Q-Learning Algorithm in Dynamic Path Planning for Aircraft at Airports
Xiang, Zheng
Sun, Heyang
Zhang, Jiahao
IEEE ACCESS, 2023, 11 : 107892 - 107905
[27] Dynamic Pricing Scheme for IaaS Cloud Platform Based on Load Balancing: A Q-learning Approach
Ren, Jiali
Pang, Lijuan
Cheng, Yan
PROCEEDINGS OF 2017 8TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2017), 2017, : 806 - 810
[28] Dynamic learning, pricing, and ordering by a censored newsvendor
Bisi, Arnab
Dada, Maqbool
NAVAL RESEARCH LOGISTICS, 2007, 54 (04) : 448 - 461
[29] Solving a Joint Pricing and Inventory Control Problem for Perishables via Deep Reinforcement Learning
Wang, Rui
Gan, Xianghua
Li, Qing
Yan, Xiao
COMPLEXITY, 2021, 2021
[30] Exponential Moving Average Q-Learning Algorithm
Awheda, Mostafa D.
Schwartz, Howard M.
PROCEEDINGS OF THE 2013 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING (ADPRL), 2013, : 31 - 38
