Joint Strategy of Dynamic Ordering and Pricing for Competing Perishables with Q-Learning Algorithm

Cited by: 2
Authors
Zheng, Jiangbo [1 ]
Gan, Yanhong [2 ]
Liang, Ying [3 ]
Jiang, Qingqing [1 ]
Chang, Jiatai [1 ]
Affiliations
[1] Jinan Univ, Sch Management, Guangzhou 510632, Guangdong, Peoples R China
[2] South China Univ Technol, Sch Business Adm, Guangzhou 510640, Peoples R China
[3] South China Normal Univ, Sch Econ & Management, Guangzhou 510006, Guangdong, Peoples R China
Keywords
INVENTORY CONTROL; POLICIES; REPLENISHMENT; MANAGEMENT;
DOI
10.1155/2021/6643195
CLC Classification
TP [Automation Technology; Computer Technology];
Discipline Code
0812 ;
Abstract
We use Machine Learning (ML) to study firms' joint pricing and ordering decisions for perishables in a dynamic loop. The research assumption is as follows: at the beginning of each period, the retailer prices both the new and old products and determines how many new products to order, while at the end of each period, the retailer decides how much remaining inventory should be carried over to the next period. The objective is to determine a joint pricing, ordering, and disposal strategy to maximize the total expected discounted profit. We establish a decision model based on Markov processes and use the Q-learning algorithm to obtain a near-optimal policy. From numerical analysis, we find that (i) the optimal number of old products carried over to the next period depends on the upper quantitative bound for old inventory; (ii) the optimal prices for new products are positively related to potential demand but negatively related to the decay rate, while the optimal prices for old products have a positive relationship with both; and (iii) ordering decisions are unrelated to the quantity of old products. When the decay rate is low or the variable ordering cost is high, the optimal orders exhibit a trapezoidal decline as the quantity of new products increases.
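The abstract describes tabular Q-learning applied to a Markov decision process in which the state is the carried-over inventory and the action combines an order quantity with a price. A minimal sketch of that idea on a toy perishable-inventory MDP is shown below; all parameters (demand model, cost of 0.5 per unit ordered, inventory cap, learning rates) are illustrative assumptions, not values from the paper.

```python
import random
from collections import defaultdict

# Toy perishable-inventory MDP: each period the retailer holds `old`
# leftover units, picks an order quantity and a price, observes random
# price-sensitive demand, and unsold stock (capped) carries over while
# anything beyond the cap is disposed of.

random.seed(0)

ORDER_LEVELS = [0, 1, 2, 3]          # candidate order quantities
PRICE_LEVELS = [1.0, 1.5, 2.0]       # candidate selling prices
MAX_OLD = 3                          # upper bound on carried-over inventory
GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1   # discount, learning rate, exploration

def step(old, order, price):
    """Simulate one period; return (reward, next period's old inventory)."""
    demand = max(0, random.randint(0, 4) - int(price))  # higher price, lower demand
    stock = old + order
    sold = min(stock, demand)
    reward = price * sold - 0.5 * order                 # revenue minus ordering cost
    next_old = min(MAX_OLD, stock - sold)               # carry-over, capped (rest disposed)
    return reward, next_old

Q = defaultdict(float)
actions = [(o, p) for o in ORDER_LEVELS for p in PRICE_LEVELS]

state = 0
for _ in range(20000):
    # epsilon-greedy action selection over (order, price) pairs
    if random.random() < EPS:
        a = random.choice(actions)
    else:
        a = max(actions, key=lambda x: Q[(state, x)])
    r, nxt = step(state, *a)
    # standard Q-learning update toward the greedy one-step target
    best_next = max(Q[(nxt, b)] for b in actions)
    Q[(state, a)] += ALPHA * (r + GAMMA * best_next - Q[(state, a)])
    state = nxt

# Greedy (near-optimal) joint ordering-and-pricing policy per inventory state
policy = {s: max(actions, key=lambda a: Q[(s, a)]) for s in range(MAX_OLD + 1)}
print(policy)
```

The paper's actual model is richer (separate prices for new and old products, a decay rate, a disposal decision), but the learning loop has this same shape: state, joint action, stochastic reward, Q-update.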
Pages: 19
Related Papers (50 total)
  • [31] Autonomous algorithmic collusion: Q-learning under sequential pricing
    Klein, Timo
    RAND JOURNAL OF ECONOMICS, 2021, 52 (03): : 538 - 558
  • [32] Generating Test Cases for Q-Learning Algorithm
    Kumaresan, Lavanya
    Chamundeswari, A.
    2013 FOURTH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATIONS AND NETWORKING TECHNOLOGIES (ICCCNT), 2013,
  • [33] Trading ETFs with Deep Q-Learning Algorithm
    Hong, Shao-Yan
    Liu, Chien-Hung
    Chen, Woei-Kae
    You, Shingchern D.
    2020 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - TAIWAN (ICCE-TAIWAN), 2020,
  • [34] Cognitive networks QoS multi-objective strategy based on Q-learning algorithm
    Wang, B., Advanced Institute of Convergence Information Technology, (07)
  • [35] Q-learning algorithm for optimal multilevel thresholding
    Yin, PY
    IC-AI'2001: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS I-III, 2001, : 335 - 340
  • [36] An ARM-based Q-learning algorithm
    Hsu, Yuan-Pao
    Hwang, Kao-Shing
    Lin, Hsin-Yi
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF CONTEMPORARY INTELLIGENT COMPUTING TECHNIQUES, 2007, 2 : 11 - +
  • [37] Optimizing Q-Learning with K-FAC Algorithm
    Beltiukov, Roman
    ANALYSIS OF IMAGES, SOCIAL NETWORKS AND TEXTS (AIST 2019), 2020, 1086 : 3 - 8
  • [38] Dynamic Choice of State Abstraction in Q-Learning
    Tamassia, Marco
    Zambetta, Fabio
    Raffe, William L.
    Mueller, Florian 'Floyd'
    Li, Xiaodong
    ECAI 2016: 22ND EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, 285 : 46 - 54
  • [39] Efficient implementation of dynamic fuzzy Q-learning
    Deng, C
    Er, MJ
    ICICS-PCM 2003, VOLS 1-3, PROCEEDINGS, 2003, : 1854 - 1858
  • [40] Q-learning with Experience Replay in a Dynamic Environment
    Pieters, Mathijs
    Wiering, Marco A.
    PROCEEDINGS OF 2016 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2016,