- [31] An ε-Greedy Multiarmed Bandit Approach to Markov Decision Processes. STATS, 2023, 6(1): 99-112
- [33] Optimality of Myopic Policy for Restless Multiarmed Bandit with Imperfect Observation. 2016 IEEE Global Communications Conference (GLOBECOM), 2016