A smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes

Cited by: 2
Authors
Fan, Yanqin [1 ]
He, Ming [2 ]
Su, Liangjun [3 ]
Zhou, Xiao-Hua [4 ,5 ]
Affiliations
[1] Univ Washington, Dept Econ, Seattle, WA 98195 USA
[2] Univ Technol Sydney, Econ Discipline Grp, Ultimo, Australia
[3] Singapore Management Univ, Sch Econ, Singapore, Singapore
[4] Peking Univ, Beijing Int Ctr Math Res, Beijing 100871, Peoples R China
[5] Peking Univ, Sch Publ Hlth, Beijing 100191, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
asymptotic normality; exceptional law; optimal smoothing parameter; sequential randomization; Wald-type inference; TECHNICAL CHALLENGES; INFERENCE;
DOI
10.1111/sjos.12359
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
In this paper, we propose a smoothed Q-learning algorithm for estimating optimal dynamic treatment regimes. In contrast to the Q-learning algorithm, in which nonregular inference is involved, we show that, under the assumptions adopted in this paper, the proposed smoothed Q-learning estimator is asymptotically normally distributed even when the Q-learning estimator is not, and that its asymptotic variance can be consistently estimated. As a result, inference based on the smoothed Q-learning estimator is standard. We derive the optimal smoothing parameter and propose a data-driven method for estimating it. The finite-sample properties of the smoothed Q-learning estimator are studied in an extensive simulation study and compared with those of several existing estimators, including the Q-learning estimator. We illustrate the new method by analyzing data from the Clinical Antipsychotic Trials of Intervention Effectiveness-Alzheimer's Disease (CATIE-AD) study.
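The nonregularity referred to in the abstract arises because standard two-stage Q-learning plugs a hard maximum over treatments into the stage-1 pseudo-outcome, and smoothing that maximum is the idea the abstract describes. The sketch below is a minimal illustration of that general structure only: the linear working models, the Gaussian-kernel smoothing of the positive part, and the fixed bandwidth h are assumptions made for illustration and are not taken from the paper, which instead derives an optimal smoothing parameter and a data-driven estimator of it.

```python
# Illustrative sketch of smoothed Q-learning for a two-stage dynamic treatment
# regime (NOT the authors' implementation; working models, smoothing function,
# and bandwidth below are assumptions for illustration).
import numpy as np
from scipy.stats import norm


def smoothed_pos_part(x, h):
    """Smooth surrogate for max(0, x) via Gaussian-kernel smoothing
    (assumption): E[max(0, x + h*Z)] with Z ~ N(0, 1).
    As h -> 0 this converges to the hard positive part max(0, x)."""
    z = x / h
    return x * norm.cdf(z) + h * norm.pdf(z)


def smoothed_q_learning(H1, A1, H2, A2, Y, h=0.1):
    """Backward-recursive Q-learning with a smoothed max in the stage-1
    pseudo-outcome.  H1, H2: history design matrices (include an intercept
    column); A1, A2: binary treatments coded 0/1; Y: final outcome."""
    # Stage 2: linear working model Q2(h2, a2) = h2'b2 + a2 * (h2'psi2).
    X2 = np.hstack([H2, A2[:, None] * H2])
    theta2, *_ = np.linalg.lstsq(X2, Y, rcond=None)
    p = H2.shape[1]
    b2, psi2 = theta2[:p], theta2[p:]

    # Stage-1 pseudo-outcome: ordinary Q-learning uses max(0, H2 @ psi2),
    # whose nondifferentiability at 0 causes nonregular inference; here the
    # hard max is replaced by its smoothed counterpart.
    contrast2 = H2 @ psi2
    Y1_tilde = H2 @ b2 + smoothed_pos_part(contrast2, h)

    # Stage 1: linear working model Q1(h1, a1) = h1'b1 + a1 * (h1'psi1).
    X1 = np.hstack([H1, A1[:, None] * H1])
    theta1, *_ = np.linalg.lstsq(X1, Y1_tilde, rcond=None)
    q = H1.shape[1]
    b1, psi1 = theta1[:q], theta1[q:]

    # Estimated optimal regime: at stage t, treat iff the estimated
    # treatment contrast H_t @ psi_t is positive.
    return {"b1": b1, "psi1": psi1, "b2": b2, "psi2": psi2}
```

Under this kind of smoothing, the stage-1 estimating equations are differentiable in the stage-2 coefficients, which is what allows standard Wald-type inference; how the bandwidth should shrink with the sample size is exactly the question the paper's optimal smoothing parameter addresses.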
Pages: 446-469
Number of pages: 24
Related Papers
50 records in total
  • [1] Weighted Q-learning for optimal dynamic treatment regimes with nonignorable missing covariates
    Sun, Jian
    Fu, Bo
    Su, Li
    BIOMETRICS, 2025, 81 (01)
  • [2] Q-learning for estimating optimal dynamic treatment rules from observational data
    Moodie, Erica E. M.
    Chakraborty, Bibhas
    Kramer, Michael S.
    CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2012, 40 (04): 629 - 645
  • [3] Q- and A-Learning Methods for Estimating Optimal Dynamic Treatment Regimes
    Schulte, Phillip J.
    Tsiatis, Anastasios A.
    Laber, Eric B.
    Davidian, Marie
    STATISTICAL SCIENCE, 2014, 29 (04) : 640 - 661
  • [4] Q-Learning in Dynamic Treatment Regimes With Misclassified Binary Outcome
    Liu, Dan
    He, Wenqing
    STATISTICS IN MEDICINE, 2024, 43 (30) : 5885 - 5897
  • [5] Accommodating misclassification effects on optimizing dynamic treatment regimes with Q-learning
    Charvadeh, Yasin Khadem
    Yi, Grace Y.
    STATISTICS IN MEDICINE, 2024, 43 (03) : 578 - 605
  • [6] Q-learning algorithm for optimal multilevel thresholding
    Yin, PY
    IC-AI'2001: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOLS I-III, 2001, : 335 - 340
  • [7] New Statistical Learning Methods for Estimating Optimal Dynamic Treatment Regimes
    Zhao, Ying-Qi
    Zeng, Donglin
    Laber, Eric B.
    Kosorok, Michael R.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2015, 110 (510) : 583 - 598
  • [8] Nonparametric Bayesian Q-learning for optimization of dynamic treatment regimes in the presence of partial compliance
    Bhattacharya, Indrabati
    Ertefaie, Ashkan
    Lynch, Kevin G.
    McKay, James R.
    Johnson, Brent A.
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2023, 32 (09) : 1649 - 1663
  • [9] Identifying optimally cost-effective dynamic treatment regimes with a Q-learning approach
    Illenberger, Nicholas
    Spieker, Andrew J.
    Mitra, Nandita
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2023, 72 (02) : 434 - 449
  • [10] Fundamental Q-learning Algorithm in Finding Optimal Policy
    Sun, Canyu
    2017 INTERNATIONAL CONFERENCE ON SMART GRID AND ELECTRICAL AUTOMATION (ICSGEA), 2017, : 243 - 246