Quantum data compression by principal component analysis

Cited by: 53
Authors
Yu, Chao-Hua [1, 2, 3]
Gao, Fei [1, 4]
Lin, Song [5]
Wang, Jingbo [3]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[2] State Key Lab Cryptol, POB 5159, Beijing 100878, Peoples R China
[3] Univ Western Australia, Sch Phys, Perth, WA 6009, Australia
[4] Ctr Quantum Comp, Peng Cheng Lab, Shenzhen 518055, Peoples R China
[5] Fujian Normal Univ, Coll Math & Informat, Fuzhou 350007, Fujian, Peoples R China
Keywords
Quantum algorithm; Data compression; Principal component analysis; Quantum machine learning; Curse of dimensionality
DOI
10.1007/s11128-019-2364-9
Chinese Library Classification number
O4 [Physics]
Subject classification code
0702
Abstract
Data compression can be achieved by reducing the dimensionality of high-dimensional but approximately low-rank datasets, which may in fact be described by the variation of a much smaller number of parameters. It often serves as a preprocessing step to surmount the curse of dimensionality and to gain efficiency, and thus it plays an important role in machine learning and data mining. In this paper, we present a quantum algorithm that compresses an exponentially large high-dimensional but approximately low-rank dataset in quantum parallel, by dimensionality reduction (DR) based on principal component analysis (PCA), the most popular classical DR algorithm. We show that the proposed algorithm has a runtime polylogarithmic in the dataset's size and dimensionality, which is exponentially faster than the classical PCA algorithm, when the original dataset is projected onto a polylogarithmically low-dimensional space. The compressed dataset can then be further processed to implement other tasks of interest, with significantly fewer quantum resources. As examples, we apply this algorithm to reduce data dimensionality for two important quantum machine learning algorithms, quantum support vector machine and quantum linear regression for prediction. This work demonstrates that quantum machine learning can be released from the curse of dimensionality to solve problems of practical importance.
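For orientation, the classical baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of ordinary PCA-based compression on an exactly rank-2 toy dataset, not the quantum algorithm of the paper. The helper names (`center`, `covariance`, `top_components`, `compress`, `reconstruct`), the toy data, and the use of power iteration with deflation are all assumptions made for this example.

```python
import math
import random


def center(rows):
    """Subtract the per-feature mean from every sample."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mean[j] for j in range(d)] for r in rows], mean


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
            for i in range(d)]


def matvec(mat, v):
    return [sum(a * x for a, x in zip(row, v)) for row in mat]


def top_components(cov, k, iters=300, seed=0):
    """Approximate the k leading eigenvectors by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(cov)
    mat = [row[:] for row in cov]  # work on a copy; deflation modifies it
    comps = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = matvec(mat, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(mat, v)))  # Rayleigh quotient
        comps.append(v)
        # Remove the component just found so the next pass finds the next one.
        for i in range(d):
            for j in range(d):
                mat[i][j] -= lam * v[i] * v[j]
    return comps


def compress(rows, k):
    """Project each d-dimensional sample onto the k leading principal components."""
    centered, mean = center(rows)
    comps = top_components(covariance(centered), k)
    codes = [[sum(x * c for x, c in zip(r, comp)) for comp in comps]
             for r in centered]
    return codes, comps, mean


def reconstruct(codes, comps, mean):
    """Map k-dimensional codes back to the original d-dimensional space."""
    d = len(mean)
    return [[mean[j] + sum(z * comp[j] for z, comp in zip(code, comps))
             for j in range(d)] for code in codes]


# Toy data (assumed for illustration): 200 samples in 6 dimensions that
# lie exactly on a 2-dimensional plane, shifted by a constant offset.
rng = random.Random(1)
s = 1.0 / math.sqrt(2.0)
b1 = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0]
b2 = [0.0, 0.0, 0.0, 0.0, s, -s]
rows = []
for _ in range(200):
    a, b = rng.gauss(0.0, 3.0), rng.gauss(0.0, 1.0)
    rows.append([5.0 + a * b1[j] + b * b2[j] for j in range(6)])

codes, comps, mean = compress(rows, k=2)
recon = reconstruct(codes, comps, mean)
max_err = max(abs(x - y) for r, q in zip(rows, recon) for x, y in zip(r, q))
print("compressed dimension:", len(codes[0]))
print("max reconstruction error:", max_err)
```

Because the toy data are exactly rank 2, two components suffice to reconstruct every sample up to floating-point error. For data that are only approximately low-rank, the reconstruction error would instead reflect the variance left in the discarded components. The sketch forms the full d x d covariance matrix, so its cost grows with both the number of samples and the dimensionality, which is the scaling the abstract contrasts with the polylogarithmic runtime of the proposed quantum algorithm.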
Pages: 20
Related papers
50 in total
  • [21] Principal Component Analysis of Thermographic Data
    Winfree, William P.
    Cramer, K. Elliott
    Zalameda, Joseph N.
    Howell, Patricia A.
    Burke, Eric R.
    THERMOSENSE: THERMAL INFRARED APPLICATIONS XXXVII, 2015, 9485
  • [22] Principal component analysis of genetic data
    Reich, David
    Price, Alkes L.
    Patterson, Nick
    NATURE GENETICS, 2008, 40 (05) : 491 - 492
  • [23] Principal component analysis with autocorrelated data
    Zamprogno, Bartolomeu
    Reisen, Valderio A.
    Bondon, Pascal
    Aranda Cotta, Higor H.
    Reis Jr, Neyval C.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2020, 90 (12) : 2117 - 2135
  • [24] PRINCIPAL COMPONENT ANALYSIS OF EPIDEMIOLOGICAL DATA
    OSAKI, J
    ISHII, F
    IWAMOTO, S
    SHINBO, S
    BIOMETRICS, 1982, 38 (04) : 1101 - 1101
  • [25] Principal component analysis on interval data
    Federica Gioia
    Carlo N. Lauro
    Computational Statistics, 2006, 21 : 343 - 363
  • [26] Principal component analysis on interval data
    Gioia, Federica
    Lauro, Carlo N.
    COMPUTATIONAL STATISTICS, 2006, 21 (02) : 343 - 363
  • [27] Synthetic Data by Principal Component Analysis
    Sano, Natsuki
    20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2020), 2020 : 101 - 105
  • [28] PRINCIPAL COMPONENT ANALYSIS OF COMPOSITIONAL DATA
    AITCHISON, J
    BIOMETRIKA, 1983, 70 (01) : 57 - 65
  • [29] PRINCIPAL COMPONENT ANALYSIS OF PRODUCTION DATA
    WILLIAMS, JH
    RADIO AND ELECTRONIC ENGINEER, 1974, 44 (09) : 473 - 480
  • [30] Principal component analysis for interval data
    Billard, L.
    Le-Rademacher, J.
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2012, 4 (06) : 535 - 540