Intrinsic Decay Property of Ti/TiOx/Pt Memristor for Reinforcement Learning

Cited by: 3
Authors
Dai, Yuehua [1]
Guo, Wenbin [1]
Feng, Zhe [1]
Xu, Zuyu [1]
Zhu, Yunlai [1]
Yang, Fei [1]
Wu, Zuheng [1]
Affiliations
[1] Anhui Univ, Sch Integrated Circuits, Hefei 230601, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
conductance decay; memristors; path planning; reinforcement learning; Sarsa (lambda); ACCURACY;
DOI
10.1002/aisy.202200455
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Memristor-based reinforcement learning (RL) systems have shown outstanding performance in efficient autonomous decision-making and edge computing. Sarsa(λ) is a classical multistep RL algorithm that records visited states with a λ-decaying trace and uses that record to guide policy updates, significantly improving convergence speed. However, implementing λ decay on traditional computing hardware is limited by the extensive computation that exponential decay requires. Herein, the value update equation of Sarsa(λ) is implemented using the topological structure of a memristor array, without complex circuits. Most importantly, the critical λ decay function is realized by a TiOx-based memristor with an intrinsic conductance decay property. The energy required for floating-point operations can thus be significantly reduced while convergence is accelerated. A path planning task is then demonstrated based on the intrinsic conductance decay property and shows outstanding performance. Finally, the information from the rounds used for the task, obtained from the intrinsic decay property of the TiOx-based memristor, is mapped in parallel into a 32 × 32 memristor array to calculate the value of each round. The results indicate that the experimental data are similar to the simulations. This work thus provides a hardware-enabled scheme for implementing memristor-based RL algorithms.
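For readers unfamiliar with the algorithm named in the abstract, the following is a minimal software sketch of a textbook tabular Sarsa(λ) update with accumulating eligibility traces. It is not the authors' hardware implementation: the function names (decay_traces, sarsa_lambda_step), the dictionary-based tables, and the values of alpha, gamma, and lam are all assumptions chosen for illustration. The decay_traces step is the per-step multiplicative decay that, according to the abstract, the paper realizes with the conductance decay of the TiOx-based memristor instead of explicit arithmetic.

```python
def decay_traces(traces, gamma, lam):
    """Return a copy of the traces after one decay step (each scaled by gamma * lam)."""
    return {key: e * gamma * lam for key, e in traces.items()}


def sarsa_lambda_step(q, traces, s, a, r, s_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8):
    """Apply one tabular Sarsa(lambda) update in place and return the TD error.

    q and traces are dicts keyed by (state, action); missing entries count as 0.
    All hyperparameter defaults are illustrative assumptions.
    """
    # Temporal-difference error for the transition (s, a) -> (s_next, a_next).
    delta = r + gamma * q.get((s_next, a_next), 0.0) - q.get((s, a), 0.0)

    # Accumulating trace: mark the state-action pair that was just visited.
    traces[(s, a)] = traces.get((s, a), 0.0) + 1.0

    # Every recently visited pair is updated in proportion to its trace.
    for key, e in traces.items():
        q[key] = q.get(key, 0.0) + alpha * delta * e

    # Decay all traces so that older visits receive less credit next time.
    decayed = decay_traces(traces, gamma, lam)
    traces.clear()
    traces.update(decayed)
    return delta


if __name__ == "__main__":
    q, traces = {}, {}
    # Two transitions of a toy episode: no reward, then a reward of 1.
    sarsa_lambda_step(q, traces, s=0, a=1, r=0.0, s_next=1, a_next=1)
    sarsa_lambda_step(q, traces, s=1, a=1, r=1.0, s_next=2, a_next=0)
    print(q)
    print(traces)
```

In this sketch the reward received on the second transition also raises the value of the first state-action pair, scaled by its decayed trace. This backward credit assignment is what makes the multistep algorithm converge faster than a one-step update, and the repeated exponential decay is the computational cost that the abstract says the memristor's intrinsic decay avoids.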
Pages: 6
Related papers
50 in total
  • [1] Pt/TiOx/Ti-based Dynamic Optoelectronic Memristor for Neuromorphic Computing
    Huang, Heyi
    Tang, Jianshi
    Gao, Bin
    Wang, Yuyan
    Li, Xinyi
    Wang, Ze
    Qian, He
    Wu, Huaqiang
    6TH IEEE ELECTRON DEVICES TECHNOLOGY AND MANUFACTURING CONFERENCE (EDTM 2022), 2022, : 310 - 312
  • [2] Improvement of Rectifying Property in Pt/TiOx/Pt by Controlling Oxidization of TiOx Layer
    Zhong, Ni
    Shima, Hisashi
    Akinaga, Hiro
    JAPANESE JOURNAL OF APPLIED PHYSICS, 2011, 50 (04)
  • [3] Convertible Volatile and non-Volatile Resistive Switching in a Self-rectifying Pt/TiOx/Ti Memristor
    Wu, Zuheng
    Zhang, Xumeng
    Shi, Tuo
    Wang, Yongzhou
    Wang, Rui
    Lu, Jian
    Wei, Jinsong
    Zhang, Peiwen
    Liu, Qi
    2021 5TH IEEE ELECTRON DEVICES TECHNOLOGY & MANUFACTURING CONFERENCE (EDTM), 2021,
  • [4] TiN/TiOx/WOx/Pt heterojunction memristor for sensory and neuromorphic computing
    Ju, Dongyeol
    Lee, Jungwoo
    So, Hyojin
    Kim, Sungjun
    JOURNAL OF ALLOYS AND COMPOUNDS, 2024, 1004
  • [5] Reinforcement learning with analogue memristor arrays
    Wang, Zhongrui
    Li, Can
    Song, Wenhao
    Rao, Mingyi
    Belkin, Daniel
    Li, Yunning
    Yan, Peng
    Jiang, Hao
    Lin, Peng
    Hu, Miao
    Strachan, John Paul
    Ge, Ning
    Barnell, Mark
    Wu, Qing
    Barto, Andrew G.
    Qiu, Qinru
    Williams, R. Stanley
    Xia, Qiangfei
    Yang, J. Joshua
    NATURE ELECTRONICS, 2019, 2 (03) : 115 - 124
  • [6] Reinforcement learning with analogue memristor arrays
    Zhongrui Wang
    Can Li
    Wenhao Song
    Mingyi Rao
    Daniel Belkin
    Yunning Li
    Peng Yan
    Hao Jiang
    Peng Lin
    Miao Hu
    John Paul Strachan
    Ning Ge
    Mark Barnell
    Qing Wu
    Andrew G. Barto
    Qinru Qiu
    R. Stanley Williams
    Qiangfei Xia
    J. Joshua Yang
    Nature Electronics, 2019, 2 : 115 - 124
  • [7] Learning Intrinsic Symbolic Rewards in Reinforcement Learning
    Sheikh, Hassam Ullah
    Khadka, Shauharda
    Miret, Santiago
    Majumdar, Somdeb
    Phielipp, Mariano
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [8] TSSM: Three-State Switchable Memristor Model Based on Ag/TiOx Nanobelt/Ti Configuration
    Ji, Xiaoyue
    Qi, Donglian
    Dong, Zhekang
    Lai, Chun Sing
    Zhou, Guangdong
    Hu, Xiaofang
    INTERNATIONAL JOURNAL OF BIFURCATION AND CHAOS, 2021, 31 (07):
  • [9] Intrinsic Motivation and Introspection in Reinforcement Learning
    Merrick, Kathryn E.
    IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, 2012, 4 (04) : 315 - 329
  • [10] Adversarial Intrinsic Motivation for Reinforcement Learning
    Durugkar, Ishan
    Tec, Mauricio
    Niekum, Scott
    Stone, Peter
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34