Tradeoffs for Space, Time, Data and Risk in Unsupervised Learning

Cited by: 0
Authors
Lucic, Mario [1]
Ohannessian, Mesrob I. [2]
Karbasi, Amin [3]
Krause, Andreas [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Univ Calif San Diego, La Jolla, CA 92093 USA
[3] Yale Univ, New Haven, CT 06520 USA
Keywords
FRAMEWORK;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Faced with massive data, is it possible to trade off (statistical) risk, and (computational) space and time? This challenge lies at the heart of large-scale machine learning. Using k-means clustering as a prototypical unsupervised learning problem, we show how we can strategically summarize the data (control space) in order to trade off risk and time when data is generated by a probabilistic model. Our summarization is based on coreset constructions from computational geometry. We also develop an algorithm, TRAM, to navigate the space/time/data/risk tradeoff in practice. In particular, we show that for a fixed risk (or data size), as the data size increases (resp. risk increases) the running time of TRAM decreases. Our extensive experiments on real data sets demonstrate the existence and practical utility of such tradeoffs, not only for k-means but also for Gaussian Mixture Models.
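The summarization idea in the abstract can be illustrated with a small sketch. This is not TRAM and not necessarily the coreset construction used in the paper; it assumes a simple importance-sampling scheme that mixes uniform sampling with sampling proportional to squared distance from the data mean. The function names `sample_coreset` and `kmeans_cost` are hypothetical and exist only for this illustration.

```python
import numpy as np


def sample_coreset(X, m, rng):
    """Draw a weighted sample of m points from X (assumed scheme, see lead-in)."""
    n = X.shape[0]
    sq_dist = ((X - X.mean(axis=0)) ** 2).sum(axis=1)
    total = sq_dist.sum()
    if total > 0:
        # Half uniform, half proportional to squared distance from the mean.
        q = 0.5 / n + 0.5 * sq_dist / total
    else:
        # All points coincide, so fall back to uniform sampling.
        q = np.full(n, 1.0 / n)
    idx = rng.choice(n, size=m, replace=True, p=q)
    # Inverse-probability weights keep the weighted cost unbiased.
    weights = 1.0 / (m * q[idx])
    return X[idx], weights


def kmeans_cost(X, centers, weights=None):
    """Sum of (optionally weighted) squared distances to the nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).min(axis=1)
    return d2.sum() if weights is None else (weights * d2).sum()


rng = np.random.default_rng(0)
# Two synthetic Gaussian blobs stand in for a large data set.
X = np.vstack([rng.normal(-5.0, 1.0, size=(2500, 2)),
               rng.normal(5.0, 1.0, size=(2500, 2))])
centers = np.array([[-5.0, -5.0], [5.0, 5.0]])

C, w = sample_coreset(X, m=500, rng=rng)
full_cost = kmeans_cost(X, centers)
summary_cost = kmeans_cost(C, centers, w)
print(full_cost, summary_cost)
```

A smaller `m` makes any downstream clustering step cheaper but makes the cost estimate noisier, which is the kind of space/time/risk tradeoff the abstract describes.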
Pages: 663-671
Page count: 9
Related papers
50 in total
  • [41] Space-Time Tradeoffs for Approximate Spherical Range Counting
    Arya, Sunil
    Malamatos, Theocharis
    Mount, David M.
    PROCEEDINGS OF THE SIXTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2005, : 535 - 544
  • [42] Analysis of space-time tradeoffs in photonic switching networks
    Qiao, CM
    IEEE INFOCOM '96 - FIFTEENTH ANNUAL JOINT CONFERENCE OF THE IEEE COMPUTER AND COMMUNICATIONS SOCIETIES: NETWORKING THE NEXT GENERATION, PROCEEDINGS VOLS 1-3, 1996, : 822 - 829
  • [43] Unsupervised Data Driven Taxonomy Learning
    Hosny, Mahmoud M.
    El-Beltagy, Samhaa R.
    Allam, Mahmoud E.
    2015 FIRST INTERNATIONAL CONFERENCE ON ARABIC COMPUTATIONAL LINGUISTICS (ACLING 2015): ADVANCES IN ARABIC COMPUTATIONAL LINGUISTICS, 2015, : 9 - 14
  • [44] Unsupervised learning of data principal eigenstructure
    Shirazi, MN
    Peper, F
    Sawai, H
    PROGRESS IN CONNECTIONIST-BASED INFORMATION SYSTEMS, VOLS 1 AND 2, 1998, : 612 - 615
  • [45] Exploring Automated Space/Time Tradeoffs for OpenVX Compute Graphs
    Omidian, Hossein
    Lemieux, Guy G. F.
    2017 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE TECHNOLOGY (ICFPT), 2017, : 152 - 159
  • [46] Space-Time Tradeoffs for Proximity Searching in Doubling Spaces
    Arya, Sunil
    Mount, David M.
    Vigneron, Antoine
    Xia, Jian
    ALGORITHMS - ESA 2008, 2008, 5193 : 112 - +
  • [47] Width-Parameterized SAT: Time-space tradeoffs
    Allender, Eric
    Chen, Shiteng
    Lou, Tiancheng
    Papakonstantinou, Periklis A.
    Tang, Bangsheng
    Theory of Computing, 2014, 10 : 297 - 339
  • [48] Randomized time-space tradeoffs for directed graph connectivity
    Gopalan, P
    Lipton, RJ
    Mehta, A
    FST TCS 2003: FOUNDATIONS OF SOFTWARE TECHNOLOGY AND THEORETICAL COMPUTER SCIENCE, 2003, 2914 : 208 - 216
  • [49] Space Performance Tradeoffs in Compressing MPI Group Data Structures
    Kumar, Sameer
    Heidelberger, Philip
    Stunkel, Craig
    PROCEEDINGS OF THE 23RD EUROPEAN MPI USERS' GROUP MEETING (EUROMPI 2016), 2016, : 32 - 40
  • [50] Space-Time Tradeoffs for Conjunctive Queries with Access Patterns
    Zhao, Hangdong
    Deep, Shaleen
    Koutris, Paraschos
    PROCEEDINGS OF THE 42ND ACM SIGMOD-SIGACT-SIGAI SYMPOSIUM ON PRINCIPLES OF DATABASE SYSTEMS, PODS 2023, 2023, : 59 - 68