Tradeoffs for Space, Time, Data and Risk in Unsupervised Learning

Authors
Lucic, Mario [1]
Ohannessian, Mesrob I. [2]
Karbasi, Amin [3]
Krause, Andreas [1]
Affiliations
[1] Swiss Fed Inst Technol, Zurich, Switzerland
[2] Univ Calif San Diego, La Jolla, CA 92093 USA
[3] Yale Univ, New Haven, CT 06520 USA
Keywords
FRAMEWORK;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Faced with massive data, is it possible to trade off (statistical) risk, and (computational) space and time? This challenge lies at the heart of large-scale machine learning. Using k-means clustering as a prototypical unsupervised learning problem, we show how we can strategically summarize the data (control space) in order to trade off risk and time when data is generated by a probabilistic model. Our summarization is based on coreset constructions from computational geometry. We also develop an algorithm, TRAM, to navigate the space/time/data/risk tradeoff in practice. In particular, we show that for a fixed risk (or data size), as the data size increases (resp. risk increases) the running time of TRAM decreases. Our extensive experiments on real data sets demonstrate the existence and practical utility of such tradeoffs, not only for k-means but also for Gaussian Mixture Models.
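The idea of summarizing data with a weighted subset can be illustrated with a minimal sketch. The code below is not TRAM and is not taken from the paper; it assumes a simple importance-sampling construction (half uniform, half proportional to squared distance from the data mean), and the names `lightweight_coreset` and `kmeans_cost` are hypothetical. It only shows how a small weighted sample can approximate the k-means cost of the full data set, so that a smaller summary (less space and time) comes at the price of a looser approximation (more risk).

```python
import numpy as np


def lightweight_coreset(X, m, rng):
    """Sample m weighted points from X (illustrative construction, an assumption)."""
    n = X.shape[0]
    mean = X.mean(axis=0)
    sq_dist = ((X - mean) ** 2).sum(axis=1)
    total = sq_dist.sum()
    if total == 0:
        q = np.full(n, 1.0 / n)  # all points identical: fall back to uniform
    else:
        # Mix of uniform sampling and sampling proportional to squared distance.
        q = 0.5 / n + 0.5 * sq_dist / total
    idx = rng.choice(n, size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])  # makes the weighted cost an unbiased estimate
    return X[idx], weights


def kmeans_cost(X, centers, weights=None):
    """Sum of (optionally weighted) squared distances to the nearest center."""
    diffs = X[:, None, :] - centers[None, :, :]
    d2 = (diffs ** 2).sum(axis=2).min(axis=1)
    if weights is None:
        return d2.sum()
    return (weights * d2).sum()


# Synthetic data: three Gaussian blobs in the plane (made up for illustration).
rng = np.random.default_rng(0)
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(size=(1000, 2)) for c in true_centers])

C, w = lightweight_coreset(X, m=500, rng=rng)
full_cost = kmeans_cost(X, true_centers)
approx_cost = kmeans_cost(C, true_centers, weights=w)
print(f"full cost: {full_cost:.1f}, coreset estimate: {approx_cost:.1f}")
```

Increasing `m` tightens the estimate at the expense of more memory and computation; shrinking it does the opposite.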
Pages: 663-671
Page count: 9