Transaction-aware inverse reinforcement learning for trading in stock markets

Authors
Sun, Qizhou [1]
Gong, Xueyuan [2]
Si, Yain-Whar [1]
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, Ave Univ, Macau, Peoples R China
[2] Jinan Univ, Sch Intelligent Syst Sci & Engn, Skinny Dog Rd, Guangzhou, Peoples R China
Keywords
Finance; Transaction-aware; Inverse reinforcement learning; Algorithmic trading
DOI
10.1007/s10489-023-04959-w
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Training automated trading agents is a long-standing topic that has been widely discussed in artificial intelligence for quantitative finance. Reinforcement learning (RL) is designed to solve sequential decision-making tasks such as stock trading. The output of RL is a policy, which can be represented as the probabilities of the possible actions given a state. The policy is optimized according to the reward function. However, even if profit is taken as the natural reward function, a trading agent equipped with an RL model has several serious problems. Specifically, profit is obtained only after a sell action is executed; different profits exist at the same time step because transactions vary in length; and the hold action covers two opposite states, an empty and a nonempty position. To alleviate these shortcomings, in this paper we introduce a new trading action called wait for the empty position status and design appropriate rewards for all actions. Based on the new action space and reward functions, a novel approach named Transaction-aware Inverse Reinforcement Learning (TAIRL) is proposed. TAIRL rewards all trading actions to avoid reward bias and dilemma. TAIRL is evaluated by backtesting on 12 stocks from the US, UK, and China stock markets and compared against other state-of-the-art RL methods and moving-average trading methods. The experimental results show that the TAIRL agent achieves state-of-the-art performance in profitability and risk resistance.
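The split of the hold action described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Action`, `valid_actions`, and `step_position` are hypothetical, and the sketch assumes a single position that is either empty or nonempty. It shows only the action space; the reward design is not reproduced here because the abstract does not specify it.

```python
from enum import Enum
from typing import Set


class Action(Enum):
    BUY = "buy"
    SELL = "sell"
    HOLD = "hold"  # keep a nonempty position open
    WAIT = "wait"  # stay out of the market while the position is empty


def valid_actions(has_position: bool) -> Set[Action]:
    """Hypothetical helper: actions that apply to the current position status.

    Assumption (not from the source): the agent holds at most one position,
    so BUY is only offered when the position is empty.
    """
    if has_position:
        return {Action.SELL, Action.HOLD}
    return {Action.BUY, Action.WAIT}


def step_position(has_position: bool, action: Action) -> bool:
    """Return the position status after taking `action`."""
    if action not in valid_actions(has_position):
        raise ValueError(f"{action.name} is not valid for this position status")
    if action is Action.BUY:
        return True
    if action is Action.SELL:
        return False
    return has_position  # HOLD and WAIT leave the status unchanged


# One transaction: wait, buy, hold, sell.
status = False
for a in (Action.WAIT, Action.BUY, Action.HOLD, Action.SELL):
    status = step_position(status, a)
```

With this split, each of the two "do nothing" actions maps to exactly one position status, so it can be given its own reward.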
Pages: 28186-28206
Number of pages: 21