Finding the ground state of spin Hamiltonians with reinforcement learning

Cited: 21
Authors
Mills, Kyle [1 ,2 ,3 ]
Ronagh, Pooya [1 ,4 ,5 ]
Tamblyn, Isaac [2 ,3 ,6 ]
Affiliations
[1] 1QB Informat Technol 1QBit, Vancouver, BC, Canada
[2] Univ Ontario Inst Technol, Oshawa, ON, Canada
[3] Vector Inst Artificial Intelligence, Toronto, ON, Canada
[4] Inst Quantum Comp IQC, Waterloo, ON, Canada
[5] Univ Waterloo, Dept Phys & Astron, Waterloo, ON, Canada
[6] Natl Res Council Canada, Ottawa, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
QUANTUM; OPTIMIZATION; MODEL; ALGORITHM; GO; EFFICIENT; GAME;
DOI
10.1038/s42256-020-0226-x
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning has become a popular method in various domains, for problems where an agent must learn what actions must be taken to reach a particular goal. An interesting example where the technique can be applied is simulated annealing in condensed matter physics, where a procedure is determined for slowly cooling a complex system to its ground state. A reinforcement learning approach has been developed that can learn a temperature scheduling protocol to find the ground state of spin glasses, magnetic systems with strong spin-spin interactions between neighbouring atoms.

Reinforcement learning (RL) has become a proven method for optimizing a procedure for which success has been defined, but the specific actions needed to achieve it have not. Using a method we call 'controlled online optimization learning' (COOL), we apply the so-called 'black box' method of RL to simulated annealing (SA), demonstrating that an RL agent based on proximal policy optimization can, through experience alone, arrive at a temperature schedule that surpasses the performance of standard heuristic temperature schedules for two classes of Hamiltonians. When the system is initialized at a cool temperature, the RL agent learns to heat the system to 'melt' it and then slowly cool it in an effort to anneal to the ground state; if the system is initialized at a high temperature, the algorithm immediately cools the system. We investigate the performance of our RL-driven SA agent in generalizing to all Hamiltonians of a specific class. When trained on random Hamiltonians of nearest-neighbour spin glasses, the RL agent is able to control the SA process for other Hamiltonians, reaching the ground state with a higher probability than a simple linear annealing schedule. Furthermore, the scaling performance (with respect to system size) of the RL approach is far more favourable, achieving a performance improvement of almost two orders of magnitude on L = 14(2) systems.
We demonstrate the robustness of the RL approach when the system operates in a 'destructive observation' mode, an allusion to a quantum system where measurements destroy the state of the system. The success of the RL agent could have far-reaching impacts, from classical optimization, to quantum annealing and to the simulation of physical systems.
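To make the baseline concrete, the sketch below implements Metropolis simulated annealing on an open 1D nearest-neighbour spin glass with the simple linear temperature schedule the abstract uses as a point of comparison. This is an illustrative sketch, not code from the paper: the function names, system size, and schedule endpoints are assumptions, and the paper's RL agent would replace the fixed `linspace` schedule here with temperatures chosen step by step by a learned (PPO) policy.

```python
import numpy as np

def anneal_linear(J, n_sweeps=200, T_hot=3.0, T_cold=0.05, seed=0):
    """Metropolis simulated annealing for an open 1D nearest-neighbour
    spin glass, H = -sum_i J_i s_i s_{i+1}, under a linear temperature
    schedule (the heuristic baseline the abstract mentions)."""
    rng = np.random.default_rng(seed)
    n = len(J) + 1                        # spins on an open chain
    s = rng.choice(np.array([-1, 1]), size=n)
    for T in np.linspace(T_hot, T_cold, n_sweeps):
        for i in rng.permutation(n):      # one Metropolis sweep per step
            dE = 0.0                      # energy change if spin i flips
            if i > 0:
                dE += 2.0 * J[i - 1] * s[i - 1] * s[i]
            if i < n - 1:
                dE += 2.0 * J[i] * s[i] * s[i + 1]
            if dE <= 0 or rng.random() < np.exp(-dE / T):
                s[i] = -s[i]
    return s

def energy(J, s):
    """H = -sum_i J_i s_i s_{i+1} for spin configuration s."""
    return -float(np.sum(J * s[:-1] * s[1:]))
```

An open 1D chain is unfrustrated, so its ground-state energy is exactly -sum_i |J_i|, which makes the baseline easy to check on small systems; the paper's comparison concerns how often such a fixed schedule reaches the ground state versus the RL-chosen schedule as the system grows.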
Pages: 509-517 (9 pages)
Related Papers (50 in total)
  • [21] Nagata, Y.; Ohigashi, Y.; Takahashi, H.; Ishikawa, S.; Omori, T.; Morikawa, K. Prediction based segmentation of state space and application to a subgoal finding problem in reinforcement learning. SICE 2004 Annual Conference, Vols 1-3, 2004: 2560-2565.
  • [22] Kerr, J. State of the profession - Finding common ground. Veterinary Technician, 2005, 26(9): 646-649.
  • [23] Lee, Choonkyu; Lee, Kimyeong. Quantum mechanical Hamiltonians with large ground-state degeneracy. Annals of Physics, 2013, 331: 258-268.
  • [24] Chen, Jianxin; Ji, Zhengfeng; Kribs, David; Wei, Zhaohui; Zeng, Bei. Ground-state spaces of frustration-free Hamiltonians. Journal of Mathematical Physics, 2012, 53(10).
  • [25] Galindo, A. Spin of the ground-state. Journal of Physics A: Mathematical and General, 1989, 22(13): L593-L596.
  • [26] Fefferman, C. L.; Seco, L. A. The spin of the ground state of an atom. Revista Matematica Iberoamericana, 1996, 12(1): 19-36.
  • [27] Buchmann, A. J.; Henley, E. M. Spin of ground state baryons. Physical Review D, 2011, 83(9).
  • [28] Giuliani, Clemens; Vicentini, Filippo; Rossi, Riccardo; Carleo, Giuseppe. Learning ground states of gapped quantum Hamiltonians with kernel methods. Quantum, 2023, 7.
  • [29] Kious, Daniel; Mailler, Cecile; Schapira, Bruno. Finding geodesics on graphs using reinforcement learning. Annals of Applied Probability, 2022, 32(5): 3889-3929.
  • [30] Ogihara, Fuminori; Murata, Junichi. A method for finding multiple subgoals for reinforcement learning. Proceedings of the Sixteenth International Symposium on Artificial Life and Robotics (AROB 16th '11), 2011: 804-807.