Multi-step reward ensemble methods for adaptive stock trading

Cited by: 2
Authors
Zeng, Zhiyi [1]
Ma, Cong [2]
Chang, Xiangyu [3]
Affiliations
[1] Hubei Normal Univ, Sch Math & Stat, Huangshi, Peoples R China
[2] Northwest Univ, Sch Econ & Management, Xian, Peoples R China
[3] Xi An Jiao Tong Univ, Sch Management, Ctr Intelligent Decis Making & Machine Learning, Xian, Peoples R China
Keywords
Multi-step reward; Reward ensemble; Adaptive trading; Thompson sampling; VOLATILITY; RETURNS; RULES;
DOI
10.1016/j.eswa.2023.120547
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Stock trading can be viewed as a Markov decision process, which makes it a natural application for reinforcement learning (RL). Numerous studies have proposed methods that combine stock trading with RL, but they use only a single reward function to fit the market. Real-world markets, however, show distinct patterns in different periods, such as bullish or bearish phases, and a reward function that works well in bullish periods may perform poorly in bearish ones. In this work, we construct several kinds of multi-step, future-price-based reward functions (a profit-based reward and a regularized-based reward), considering that the market changes constantly. Moreover, we propose two ensemble rewards, one based on the greedy method (MSR-GME, Multi-Step Rewards Greedy Method Ensemble) and one based on Thompson sampling (MSR-TSE, Multi-Step Rewards Thompson Sampling Ensemble), to help agents make adaptive trading decisions under distinct market patterns. We conduct extensive experiments to verify the mechanisms and the superiority of the constructed reward functions from multiple aspects. The results show that the two constructed single-reward functions consistently outperform both the buy-and-hold strategy (B&H) and the historical-price-based rewards by a large margin (for example, the profit-based reward achieves up to 7.3 times the Sortino ratio of B&H and a maximum drawdown up to 78.6% lower). The ensemble rewards further improve strategy performance, yielding higher profits and lower risks (for example, MSR-TSE achieves up to 49.7 times the profit and 8.85 times the Sortino ratio of B&H). We also find that MSR-TSE is risk-averse whereas MSR-GME is risk-aggressive, indicating that Thompson sampling is a highly competitive ensemble method, especially in bearish markets.
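The abstract reports results in terms of the Sortino ratio and maximum drawdown. As a rough illustration of what these two metrics measure, the following is a minimal sketch using their common textbook definitions. The function names, the zero target return, and the lack of annualization are assumptions made here for illustration; the paper's exact computation may differ.

```python
import math


def sortino_ratio(returns, target=0.0):
    """Mean excess return over `target`, divided by downside deviation.

    Assumption: per-period returns, no annualization.
    """
    if not returns:
        raise ValueError("returns must be non-empty")
    excess = [r - target for r in returns]
    mean_excess = sum(excess) / len(excess)
    # Only returns below the target contribute to downside risk.
    downside_sq = [min(e, 0.0) ** 2 for e in excess]
    downside_dev = math.sqrt(sum(downside_sq) / len(excess))
    if downside_dev == 0.0:
        # No downside observed: the ratio is unbounded if the mean is positive.
        return math.inf if mean_excess > 0 else 0.0
    return mean_excess / downside_dev


def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak.

    Assumption: `values` is a non-empty series of positive portfolio values.
    """
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


if __name__ == "__main__":
    # Hypothetical numbers, not data from the paper.
    print(sortino_ratio([0.02, -0.01, 0.03, -0.02]))
    print(max_drawdown([100, 120, 90, 110, 80]))
```

A higher Sortino ratio means more return per unit of downside risk, and a lower maximum drawdown means a smaller worst-case loss from a previous peak, which is the sense in which the abstract compares the strategies with B&H.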
Pages: 20