Joint Strategy of Dynamic Ordering and Pricing for Competing Perishables with Q-Learning Algorithm

Cited by: 2
Authors
Zheng, Jiangbo [1]
Gan, Yanhong [2]
Liang, Ying [3]
Jiang, Qingqing [1]
Chang, Jiatai [1]
Affiliations
[1] Jinan Univ, Sch Management, Guangzhou 510632, Guangdong, Peoples R China
[2] South China Univ Technol, Sch Business Adm, Guangzhou 510640, Peoples R China
[3] South China Normal Univ, Sch Econ & Management, Guangzhou 510006, Guangdong, Peoples R China
Keywords
INVENTORY CONTROL; POLICIES; REPLENISHMENT; MANAGEMENT
DOI
10.1155/2021/6643195
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We use Machine Learning (ML) to study firms' joint pricing and ordering decisions for perishables in a dynamic loop. The research assumption is as follows: at the beginning of each period, the retailer prices both the new and old products and determines how many new products to order, while at the end of each period, the retailer decides how much remaining inventory should be carried over to the next period. The objective is to determine a joint pricing, ordering, and disposal strategy to maximize the total expected discounted profit. We establish a decision model based on Markov processes and use the Q-learning algorithm to obtain a near-optimal policy. From numerical analysis, we find that (i) the optimal number of old products carried over to the next period depends on the upper quantitative bound for old inventory; (ii) the optimal prices for new products are positively related to potential demand but negatively related to the decay rate, while the optimal prices for old products have a positive relationship with both; and (iii) ordering decisions are unrelated to the quantity of old products. When the decay rate is low or the variable ordering cost is high, the optimal orders exhibit a trapezoidal decline as the quantity of new products increases.
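The decision loop described in the abstract (price new and old products, order new stock, then choose how much leftover inventory to carry over) can be illustrated with a small tabular Q-learning sketch. This is not the authors' model: the linear demand function, the price and order grids, the cost, the learning parameters, the two-period product lifetime, and all names (`step`, `q_learning`, `greedy_policy`) are assumptions chosen only to make the example runnable.

```python
import random
from itertools import product

# All values below are illustrative assumptions, not taken from the paper.
MAX_OLD = 3                      # upper bound on old inventory carried over
ORDERS = [0, 1, 2, 3, 4]         # candidate order quantities for new products
PRICES_NEW = [4.0, 6.0]          # candidate prices for new products
PRICES_OLD = [2.0, 3.0]          # candidate prices for old products
UNIT_COST = 2.0                  # variable ordering cost per unit
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1

# One joint action = (price_new, price_old, order_qty, carry_over_cap).
ACTIONS = list(product(PRICES_NEW, PRICES_OLD, ORDERS, range(MAX_OLD + 1)))


def step(old, action, rng):
    """Simulate one period; return (profit, next old inventory)."""
    p_new, p_old, order, keep = action
    # Hypothetical noisy linear demand for each product age.
    d_new = max(0, round(5 - 0.5 * p_new + rng.uniform(-1, 1)))
    d_old = max(0, round(3 - 0.5 * p_old + rng.uniform(-1, 1)))
    sold_new = min(order, d_new)
    sold_old = min(old, d_old)
    profit = p_new * sold_new + p_old * sold_old - UNIT_COST * order
    # Unsold old units are disposed of; unsold new units become old,
    # and at most `keep` of them are carried into the next period.
    next_old = min(order - sold_new, keep)
    return profit, next_old


def q_learning(episodes=500, horizon=20, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    n_actions = len(ACTIONS)
    q = {(s, a): 0.0 for s in range(MAX_OLD + 1) for a in range(n_actions)}
    for _ in range(episodes):
        old = 0
        for _ in range(horizon):
            if rng.random() < EPSILON:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[(old, i)])
            reward, nxt = step(old, ACTIONS[a], rng)
            best_next = max(q[(nxt, i)] for i in range(n_actions))
            # Standard Q-learning update toward the discounted target.
            q[(old, a)] += ALPHA * (reward + GAMMA * best_next - q[(old, a)])
            old = nxt
    return q


def greedy_policy(q):
    """Map each old-inventory level to its highest-valued joint action."""
    n_actions = len(ACTIONS)
    return {
        s: ACTIONS[max(range(n_actions), key=lambda i: q[(s, i)])]
        for s in range(MAX_OLD + 1)
    }


if __name__ == "__main__":
    for state, action in greedy_policy(q_learning()).items():
        print(state, action)
```

The state here is only the old-inventory level, which keeps the Q-table small; a model closer to the abstract would presumably also track new-product inventory and a decay rate.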
Pages: 19
Related papers
50 in total
  • [11] Q-learning with heterogeneous update strategy
    Tan, Tao
    Xie, Hong
    Feng, Liang
    INFORMATION SCIENCES, 2024, 656
  • [12] Q-Learning Based Dynamic Channel Assignment Algorithm in Cognitive Radio
    Wang, Huahua
    Wei, Yang
    Long, Yin
    ELECTRONIC INFORMATION AND ELECTRICAL ENGINEERING, 2012, 19 : 127 - 131
  • [13] Enhanced Q-Learning Algorithm for Dynamic Power Management with Performance Constraint
    Liu, Wei
    Tan, Ying
    Qiu, Qinru
    2010 DESIGN, AUTOMATION & TEST IN EUROPE (DATE 2010), 2010, : 602 - 605
  • [14] Dynamic Path Planning of a Mobile Robot with Improved Q-Learning algorithm
    Li, Siding
    Xu, Xin
    Zuo, Lei
    2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015, : 409 - 414
  • [15] A smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes
    Fan, Yanqin
    He, Ming
    Su, Liangjun
    Zhou, Xiao-Hua
    SCANDINAVIAN JOURNAL OF STATISTICS, 2019, 46 (02) : 446 - 469
  • [16] ENHANCEMENTS OF FUZZY Q-LEARNING ALGORITHM
    Glowaty, Grzegorz
    COMPUTER SCIENCE-AGH, 2005, 7 : 77 - 87
  • [17] Q-Learning Algorithm for Joint Computation Offloading and Resource Allocation in Edge Cloud
    Dab, Boutheina
    Aitsaadi, Nadjib
    Langar, Rami
    2019 IFIP/IEEE SYMPOSIUM ON INTEGRATED NETWORK AND SERVICE MANAGEMENT (IM), 2019,
  • [18] An analysis of the pheromone Q-learning algorithm
    Monekosso, N
    Remagnino, P
    ADVANCES IN ARTIFICIAL INTELLIGENCE - IBERAMIA 2002, PROCEEDINGS, 2002, 2527 : 224 - 232
  • [19] A Weighted Smooth Q-Learning Algorithm
    Vijesh, V. Antony
    Shreyas, S. R.
    IEEE CONTROL SYSTEMS LETTERS, 2025, 9 : 21 - 26
  • [20] An improved immune Q-learning algorithm
    Ji, Zhengqiao
    Wu, Q. M. Jonathan
    Sid-Ahmed, Maher
    2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-8, 2007, : 3330 - +