On Learning Mixtures of Well-Separated Gaussians

Cited by: 35

Authors:

Regev, Oded [1]

Vijayaraghavan, Aravindan [2]

Affiliations:

[1] NYU, Courant Inst Math Sci, New York, NY 10003 USA

[2] Northwestern Univ, Dept EECS, Evanston, IL USA

Source:

2017 IEEE 58TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS) | 2017

Funding:

National Science Foundation (USA);

Keywords:

mixtures of Gaussians; unsupervised learning; clustering; parameter estimation; sample complexity; iterative algorithms; MODELS;

DOI:

10.1109/FOCS.2017.17

Chinese Library Classification number:

TP301 [Theory, methods];

Discipline classification code:

081202;

Abstract:

We consider the problem of efficiently learning mixtures of a large number of spherical Gaussians, when the components of the mixture are well separated. In the most basic form of this problem, we are given samples from a uniform mixture of $k$ standard spherical Gaussians with means $\mu_1, \ldots, \mu_k \in \mathbb{R}^d$, and the goal is to estimate the means up to accuracy $\delta$ using $\mathrm{poly}(k, d, 1/\delta)$ samples. In this work, we study the following question: what is the minimum separation needed between the means for solving this task? The best known algorithm, due to Vempala and Wang [JCSS 2004], requires a separation of roughly $\min\{k, d\}^{1/4}$. On the other hand, Moitra and Valiant [FOCS 2010] showed that with separation $o(1)$, exponentially many samples are required. We address the significant gap between these two bounds by showing the following results. We show that with separation $o(\sqrt{\log k})$, superpolynomially many samples are required. In fact, this holds even when the $k$ means of the Gaussians are picked at random in $d = O(\log k)$ dimensions. We show that with separation $\Omega(\sqrt{\log k})$, $\mathrm{poly}(k, d, 1/\delta)$ samples suffice. Notice that the bound on the separation is independent of $\delta$. This result is based on a new and efficient "accuracy boosting" algorithm that takes as input coarse estimates of the true means and in time (and samples) $\mathrm{poly}(k, d, 1/\delta)$ outputs estimates of the means up to arbitrarily good accuracy $\delta$, assuming the separation between the means is $\Omega(\min\{\sqrt{\log k}, \sqrt{d}\})$ (independently of $\delta$). The idea of the algorithm is to iteratively solve a "diagonally dominant" system of non-linear equations. We also (1) present a computationally efficient algorithm in $d = O(1)$ dimensions with only $\Omega(\sqrt{d})$ separation, and (2) extend our results to the case that components might have different weights and variances. These results together essentially characterize the optimal order of separation between components that is needed to learn a mixture of $k$ spherical Gaussians with polynomial samples.
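
The basic problem setup in the abstract can be illustrated with a small sketch. It is not the paper's algorithm; it only builds a toy instance of a uniform mixture of k standard spherical Gaussians and measures the separation between the means. The function names, the constant `c`, and the choice of orthogonal means are assumptions made for illustration; the abstract does not specify the constant hidden in the Omega(sqrt(log k)) bound.

```python
import math
import numpy as np


def min_separation(means):
    """Smallest Euclidean distance between two distinct means."""
    diffs = means[:, None, :] - means[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    np.fill_diagonal(dists, np.inf)  # ignore the zero distance of a mean to itself
    return float(dists.min())


def sample_uniform_mixture(means, n, rng):
    """Draw n samples: pick a component uniformly at random, then add N(0, I_d) noise."""
    k, d = means.shape
    labels = rng.integers(0, k, size=n)
    samples = means[labels] + rng.standard_normal((n, d))
    return samples, labels


# Hypothetical toy instance (all values below are illustrative assumptions).
k, d = 8, 8
c = 4.0                                  # assumed constant, not taken from the paper
scale = c * math.sqrt(math.log(k))       # target separation of order sqrt(log k)
means = (scale / math.sqrt(2)) * np.eye(k, d)  # orthogonal means, pairwise distance = scale

rng = np.random.default_rng(0)
samples, labels = sample_uniform_mixture(means, 2000, rng)

print("min separation:", min_separation(means))
print("sqrt(log k):   ", math.sqrt(math.log(k)))
```

With this construction every pair of means is at distance exactly `scale`, so the instance sits at a fixed multiple of sqrt(log k), the order of separation the abstract identifies as the threshold between superpolynomial and polynomial sample complexity.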

Pages: 85-96

Page count: 12
