Optimal dynamic fixed-mix portfolios based on reinforcement learning with second order stochastic dominance

Cited: 2
Authors
Consigli, Giorgio [1 ]
Gomez, Alvaro A. [1 ]
Zubelli, Jorge P. [1 ,2 ]
Affiliations
[1] Khalifa Univ Sci & Technol, Dept Math, Abu Dhabi, U Arab Emirates
[2] ADIA LAB, Level 26, Al Khatem Tower, Abu Dhabi, U Arab Emirates
Keywords
Fixed-mix portfolios; Stochastic dominance; Reinforcement learning; Actor-critic approach; Deep learning; Stochastic gradient; TRADING SYSTEM; RISK MEASURES; OPTIMIZATION;
DOI
10.1016/j.engappai.2024.108599
CLC classification
TP [Automation technology; computer technology]
Subject classification
0812
Abstract
We propose a reinforcement learning (RL) approach to address a multiperiod optimization problem in which a portfolio manager seeks an optimal constant-proportion portfolio strategy by minimizing a tail risk measure consistent with second order stochastic dominance (SSD) principles. As risk measure we consider in particular the Interval Conditional Value-at-Risk (ICVaR), shown to be mathematically related to SSD principles. By including the ICVaR in the reward function of an RL method, we show that an optimal fixed-mix policy can be derived as the solution of short- to medium-term allocation problems through an accurate specification of the learning parameters under general statistical assumptions. The financial optimization problem thus carries several novel features, and the article details the steps required to accommodate those features within a reinforcement learning architecture. The methodology is tested in- and out-of-sample on market data, showing good performance relative to the S&P 500, adopted as benchmark policy.
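The reward shaping described in the abstract can be illustrated with a minimal sketch. The exact ICVaR definition is given in the paper; here plain empirical CVaR of portfolio losses stands in as the tail-risk term, and all names (`cvar`, `fixed_mix_returns`, the simulated data, the weights) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cvar(losses, alpha=0.95):
    """Empirical Conditional Value-at-Risk: mean loss in the worst (1-alpha) tail."""
    var = np.quantile(losses, alpha)          # empirical Value-at-Risk threshold
    return float(losses[losses >= var].mean())

def fixed_mix_returns(asset_returns, weights):
    """Per-period returns of a constant-proportion (fixed-mix) portfolio,
    rebalanced back to `weights` every period. asset_returns has shape (T, n)."""
    return asset_returns @ weights

rng = np.random.default_rng(0)
R = rng.normal(0.001, 0.02, size=(250, 3))    # simulated daily returns, 3 assets
w = np.array([0.5, 0.3, 0.2])                 # candidate fixed-mix weights (sum to 1)
port = fixed_mix_returns(R, w)
reward = -cvar(-port, alpha=0.95)             # risk-averse reward signal for an RL agent
```

In an actor-critic setup of the kind the abstract mentions, `w` would be the actor's output and `reward` (or a benchmark-relative variant of it) the signal driving the policy gradient.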
Pages: 16
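The SSD consistency invoked in the abstract can be checked empirically through integrated CDFs: a sample X dominates a sample Y in second order iff E[(t - X)+] <= E[(t - Y)+] for every threshold t. A minimal sketch of that test, with illustrative names and simulated data not taken from the paper:

```python
import numpy as np

def ssd_dominates(x, y, n_grid=200):
    """True if sample x second-order stochastically dominates sample y:
    E[(t - x)_+] <= E[(t - y)_+] at every threshold t on a common grid
    (the integrated-CDF characterization of SSD)."""
    grid = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), n_grid)
    f2 = lambda s: np.array([np.maximum(t - s, 0.0).mean() for t in grid])
    return bool(np.all(f2(x) <= f2(y) + 1e-12))

rng = np.random.default_rng(1)
y = rng.normal(0.0, 0.02, 500)   # simulated benchmark returns
x = y + 0.005                    # a uniform upward shift dominates y in second order
```

The shifted sample dominates the benchmark, while the reverse check fails, matching the intuition that SSD orders prospects preferred by every risk-averse investor.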
Related Papers
50 records in total
  • [31] Constrained Reinforcement Learning for Predictive Control in Real-Time Stochastic Dynamic Optimal Power Flow
    Wu, Tong
    Scaglione, Anna
    Arnold, Daniel
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2024, 39 (03) : 5077 - 5090
  • [32] Dynamic Grouping within Minimax Optimal Strategy for Stochastic Multi-Armed Bandits in Reinforcement Learning Recommendation
    Feng, Jiamei
    Zhu, Junlong
    Zhao, Xuhui
    Ji, Zhihang
    APPLIED SCIENCES-BASEL, 2024, 14 (08)
  • [33] Reinforcement Learning-based approach for dynamic vehicle routing problem with stochastic demand
    Zhou, Chenhao
    Ma, Jingxin
    Douge, Louis
    Chew, Ek Peng
    Lee, Loo Hay
    COMPUTERS & INDUSTRIAL ENGINEERING, 2023, 182
  • [34] Time-Varying Optimal Formation Control for Second-Order Multiagent Systems Based on Neural Network Observer and Reinforcement Learning
    Lan, Jie
    Liu, Yan-Jun
    Yu, Dengxiu
    Wen, Guoxing
    Tong, Shaocheng
    Liu, Lei
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 3144 - 3155
  • [35] A reinforcement learning approach for distance-based dynamic tolling in the stochastic network environment
    Zhu, Feng
    Ukkusuri, Satish V.
    JOURNAL OF ADVANCED TRANSPORTATION, 2015, 49 (02) : 247 - 266
  • [36] A reinforcement learning-based scheme for direct adaptive optimal control of linear stochastic systems
    Wong, Wee Chin
    Lee, Jay H.
    OPTIMAL CONTROL APPLICATIONS & METHODS, 2010, 31 (04) : 365 - 374
  • [37] Reinforcement Learning based on Stochastic Dynamic Programming for Condition-based Maintenance of Deteriorating Production Processes
    Rasay, Hasan
    Naderkhani, Farnoosh
    Golmohammadi, Amir Mohammad
    2022 IEEE INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT (ICPHM), 2022, : 17 - 24
  • [38] Reinforcement Learning-Based Dynamic Order Recommendation for On-Demand Food Delivery
    Wang, Xing
    Wang, Ling
    Dong, Chenxin
    Ren, Hao
    Xing, Ke
    TSINGHUA SCIENCE AND TECHNOLOGY, 2024, 29 (02) : 356 - 367
  • [39] A new class of nonparametric tests for second-order stochastic dominance based on the Lorenz P-P plot
    Lando, Tommaso
    Legramanti, Sirio
    SCANDINAVIAN JOURNAL OF STATISTICS, 2025, 52 (01) : 480 - 512
  • [40] Intelligent dynamic control of stochastic economic lot scheduling by agent-based reinforcement learning
    Wang, Jiao
    Li, Xueping
    Zhu, Xiaoyan
    INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2012, 50 (16) : 4381 - 4395