Compressed Linear Algebra for Large-Scale Machine Learning

Cited by: 1
Authors
Elgohary, Ahmed [1 ,2 ]
Boehm, Matthias [1 ]
Haas, Peter J. [1 ]
Reiss, Frederick R. [1 ]
Reinwald, Berthold [1 ]
Affiliations
[1] IBM Res Almaden, San Jose, CA 95120 USA
[2] Univ Maryland, College Pk, MD 20742 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2016, Vol. 9, No. 12
DOI
Not available
CLC Classification
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
Large-scale machine learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications to converge to an optimal model. It is crucial for performance to fit the data into single-node or distributed main memory. General-purpose heavyweight and lightweight compression techniques struggle to achieve both good compression ratios and fast decompression speed to enable block-wise uncompressed operations. Hence, we initiate work on compressed linear algebra (CLA), in which lightweight database compression techniques are applied to matrices and then linear algebra operations such as matrix-vector multiplication are executed directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show that CLA achieves in-memory operation performance close to the uncompressed case and good compression ratios that allow us to fit larger datasets into available memory. We thereby obtain significant end-to-end performance improvements of up to 26x or reduced memory requirements.
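The core idea of executing matrix-vector multiplication directly on compressed columns can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: it assumes a simple offset-list-style encoding in which each column stores, for every distinct value, the list of row positions where that value occurs; the function names are hypothetical.

```python
# Hypothetical sketch of CLA-style matrix-vector multiplication on
# columns compressed with a simple offset-list encoding (one list of
# row positions per distinct value). Names are illustrative only.
import numpy as np

def compress_column(col):
    """Encode a column as a dict: distinct value -> list of row offsets."""
    comp = {}
    for i, v in enumerate(col):
        comp.setdefault(float(v), []).append(i)
    return comp

def matvec_compressed(compressed_cols, x, n_rows):
    """Compute y = M @ x directly on the compressed columns:
    one multiply per distinct value, then a scatter-add to the rows
    where that value occurs -- no decompression of M required."""
    y = np.zeros(n_rows)
    for j, comp in enumerate(compressed_cols):
        for v, offsets in comp.items():
            y[offsets] += v * x[j]
    return y

M = np.array([[1., 0.], [1., 2.], [0., 2.]])
x = np.array([3., 4.])
cols = [compress_column(M[:, j]) for j in range(M.shape[1])]
y = matvec_compressed(cols, x, M.shape[0])
# y agrees with the uncompressed product M @ x
```

Columns with few distinct values (common in encoded ML datasets) make this cheap: the number of multiplications drops from the number of nonzero cells to the number of distinct values per column, which is one intuition behind CLA's performance being close to the uncompressed case.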
Pages: 960 - 971
Page count: 12
Related Papers
50 results in total
  • [31] Large-Scale Machine Learning with Stochastic Gradient Descent
    Bottou, Leon
    COMPSTAT'2010: 19TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL STATISTICS, 2010, : 177 - 186
  • [32] Quick extreme learning machine for large-scale classification
    Albtoush, Audi
    Fernandez-Delgado, Manuel
    Cernadas, Eva
    Barro, Senen
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (08): : 5923 - 5938
  • [33] Angel: a new large-scale machine learning system
    Jie Jiang
    Lele Yu
    Jiawei Jiang
    Yuhong Liu
    Bin Cui
    National Science Review, 2018, 5 (02) : 216 - 236
  • [34] Large-scale linear nonparallel support vector machine solver
    Tian, Yingjie
    Zhang, Qin
    Ping, Yuan
    NEUROCOMPUTING, 2014, 138 : 114 - 119
  • [35] Large-scale linear nonparallel support vector machine solver
    Tian, Yingjie
    Ping, Yuan
    NEURAL NETWORKS, 2014, 50 : 166 - 174
  • [36] Using desktop computers to solve large-scale dense linear algebra problems
    Marques, M.
    Quintana-Orti, G.
    Quintana-Orti, E. S.
    van de Geijn, R.
    JOURNAL OF SUPERCOMPUTING, 2011, 58 (02): : 145 - 150
  • [37] Using desktop computers to solve large-scale dense linear algebra problems
    M. Marqués
    G. Quintana-Ortí
    E. S. Quintana-Ortí
    R. van de Geijn
    The Journal of Supercomputing, 2011, 58 : 145 - 150
  • [38] Towards an Optimized GROUP BY Abstraction for Large-Scale Machine Learning
    Li, Side
    Kumar, Arun
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (11): : 2327 - 2340
  • [39] Toward Large-Scale Vulnerability Discovery using Machine Learning
    Grieco, Gustavo
    Grinblat, Guillermo Luis
    Uzal, Lucas
    Rawat, Sanjay
    Feist, Josselin
    Mounier, Laurent
    CODASPY'16: PROCEEDINGS OF THE SIXTH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY, 2016, : 85 - 96
  • [40] Lotus: A New Topology for Large-scale Distributed Machine Learning
    Lu, Yunfeng
    Gu, Huaxi
    Yu, Xiaoshan
    Chakrabarty, Krishnendu
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2021, 17 (01)