Towards provably efficient quantum algorithms for large-scale machine-learning models

Cited: 17
Authors
Liu, Junyu [1 ,2 ,3 ,4 ,5 ,6 ]
Liu, Minzhao [7 ,8 ]
Liu, Jin-Peng [9 ,10 ,11 ]
Ye, Ziyu [2 ]
Wang, Yunfei [12 ]
Alexeev, Yuri [2 ,3 ,8 ]
Eisert, Jens [13 ]
Jiang, Liang [1 ,3 ]
Affiliations
[1] Univ Chicago, Pritzker Sch Mol Engn, Chicago, IL 60637 USA
[2] Univ Chicago, Dept Comp Sci, Chicago, IL 60637 USA
[3] Chicago Quantum Exchange, Chicago, IL 60637 USA
[4] Univ Chicago, Kadanoff Ctr Theoret Phys, Chicago, IL 60637 USA
[5] qBraid Co, Chicago, IL 60615 USA
[6] SeQure, Chicago, IL 60615 USA
[7] Univ Chicago, Dept Phys, Chicago, IL 60637 USA
[8] Argonne Natl Lab, Computat Sci Div, Lemont, IL 60439 USA
[9] Univ Calif Berkeley, Simons Inst Theory Comp, Berkeley, CA 94720 USA
[10] Univ Calif Berkeley, Dept Math, Berkeley, CA 94720 USA
[11] MIT, Ctr Theoret Phys, Cambridge, MA 02139 USA
[12] Brandeis Univ, Martin A Fisher Sch Phys, Waltham, MA 02453 USA
[13] Free Univ Berlin, Dahlem Ctr Complex Quantum Syst, D-14195 Berlin, Germany
Keywords
DOI
10.1038/s41467-023-43957-x
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject Classification Codes
07; 0710; 09
Abstract
Large machine learning models are revolutionary artificial-intelligence technologies whose bottlenecks include the huge computational expense, power, and time consumed in both pre-training and fine-tuning. In this work, we show that fault-tolerant quantum computing could possibly provide provably efficient resolutions for generic (stochastic) gradient descent algorithms, scaling as O(T² × polylog(n)), where n is the size of the model and T is the number of training iterations, as long as the models are both sufficiently dissipative and sparse, with small learning rates. Based on earlier efficient quantum algorithms for dissipative differential equations, we find and prove that similar algorithms work for (stochastic) gradient descent, the primary algorithm of machine learning. In practice, we benchmark instances of large machine learning models with 7 million to 103 million parameters. We find that, in the context of sparse training, a quantum enhancement is possible at the early stage of learning after model pruning, motivating a sparse parameter download and re-upload scheme. Our work shows solidly that fault-tolerant quantum algorithms could potentially contribute to most state-of-the-art, large-scale machine-learning problems.
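For concreteness, the following is a minimal classical sketch of the regime the abstract refers to: a model is pruned by parameter magnitude, and (stochastic) gradient descent then updates only the surviving sparse support with a small learning rate. The function names, pruning threshold, and toy quadratic objective are illustrative assumptions made for this record, not the authors' implementation, and no quantum subroutine is reproduced here.

# Minimal classical sketch of sparse (stochastic) gradient descent after pruning,
# the training regime targeted by the paper's quantum algorithm. All names,
# thresholds, and the toy objective are illustrative assumptions, not the
# authors' code; the quantum ODE-solver subroutines are not reproduced.
import numpy as np

rng = np.random.default_rng(0)

def magnitude_prune(theta, keep_fraction=0.1):
    """Keep only the largest-magnitude fraction of parameters (binary mask)."""
    k = max(1, int(keep_fraction * theta.size))
    threshold = np.partition(np.abs(theta), -k)[-k]
    return (np.abs(theta) >= threshold).astype(theta.dtype)

def sparse_sgd(grad_fn, theta, mask, eta=1e-3, T=200, batch=32):
    """Plain SGD restricted to the pruned support; eta is the (small) learning rate."""
    for _ in range(T):
        g = grad_fn(theta, batch)          # stochastic gradient estimate
        theta = theta - eta * (g * mask)   # update only unpruned coordinates
    return theta

# Toy quadratic loss 0.5 * ||A theta - b||^2 standing in for a model's training loss.
n = 1_000
A = rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

def grad_fn(theta, batch):
    rows = rng.integers(0, n, size=batch)   # sample a mini-batch of rows
    residual = A[rows] @ theta - b[rows]
    return A[rows].T @ residual / batch

theta0 = rng.standard_normal(n)
mask = magnitude_prune(theta0, keep_fraction=0.1)   # sparse support after pruning
theta_T = sparse_sgd(grad_fn, theta0 * mask, mask)  # T iterations of sparse SGD
print("final loss:", 0.5 * np.linalg.norm(A @ theta_T - b) ** 2)

Under the abstract's assumptions (sufficient dissipation, sparsity, and a small learning rate), the claimed quantum cost for the analogue of this T-step descent scales as O(T² × polylog(n)) in the model size n, in contrast to the explicit O(n)-per-step classical updates above.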
Pages: 6
Related Papers
50 records in total
  • [1] Towards provably efficient quantum algorithms for large-scale machine-learning models
    Junyu Liu
    Minzhao Liu
    Jin-Peng Liu
    Ziyu Ye
    Yunfei Wang
    Yuri Alexeev
    Jens Eisert
    Liang Jiang
    Nature Communications, 15
  • [2] Efficient surrogate modeling methods for large-scale Earth system models based on machine-learning techniques
    Lu, Dan
    Ricciuto, Daniel
    GEOSCIENTIFIC MODEL DEVELOPMENT, 2019, 12 (05): 1791-1807
  • [3] Open Challenges in Developing Generalizable Large-Scale Machine-Learning Models for Catalyst Discovery
    Kolluru, Adeesh
    Shuaibi, Muhammed
    Palizhati, Aini
    Shoghi, Nima
    Das, Abhishek
    Wood, Brandon
    Zitnick, C. Lawrence
    Kitchin, John R.
    Ulissi, Zachary W.
    ACS CATALYSIS, 2022, 12 (14): 8572-8581
  • [4] Reproducing Reaction Mechanisms with Machine-Learning Models Trained on a Large-Scale Mechanistic Dataset
    Joung, Joonyoung F.
    Fong, Mun Hong
    Roh, Jihye
    Tu, Zhengkai
    Bradshaw, John
    Coley, Connor W.
    ANGEWANDTE CHEMIE-INTERNATIONAL EDITION, 2024, 63 (43)
  • [5] Efficient Machine Learning On Large-Scale Graphs
    Erickson, Parker
    Lee, Victor E.
    Shi, Feng
    Tang, Jiliang
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022: 4788-4789
  • [6] Efficient algorithms for large-scale quantum transport calculations
    Bruck, Sascha
    Calderara, Mauro
    Bani-Hashemian, Mohammad Hossein
    VandeVondele, Joost
    Luisier, Mathieu
    JOURNAL OF CHEMICAL PHYSICS, 2017, 147 (07)
  • [7] A Machine-Learning Approach for Communication Prediction of Large-Scale Applications
    Papadopoulou, Nikela
    Goumas, Georgios
    Koziris, Nectarios
    2015 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING - CLUSTER 2015, 2015: 120-123
  • [8] Efficient Distributed Machine Learning for Large-scale Models by Reducing Redundant Communication
    Yokoyama, Harumichi
    Araki, Takuya
    2017 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTED, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2017
  • [9] Large-Scale Machine Learning Algorithms for Biomedical Data Science
    Huang, Heng
    ACM-BCB'19: PROCEEDINGS OF THE 10TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND HEALTH INFORMATICS, 2019: 4
  • [10] On Efficient Training of Large-Scale Deep Learning Models
    Shen, Li
    Sun, Yan
    Yu, Zhiyuan
    Ding, Liang
    Tian, Xinmei
    Tao, Dacheng
    ACM COMPUTING SURVEYS, 2025, 57 (03)