Testing Identity of Multidimensional Histograms

Cited by: 0

Authors:

Diakonikolas, Ilias [1]

Kane, Daniel M. [2]

Peebles, John [3]

Affiliations:

[1] Univ Southern Calif, Los Angeles, CA 90007 USA

[2] Univ Calif San Diego, La Jolla, CA USA

[3] MIT, Cambridge, MA USA

Source:

CONFERENCE ON LEARNING THEORY, VOL 99 | 2019 / Vol. 99

Keywords:

distribution testing; hypothesis testing; goodness of fit; multivariate histograms;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We investigate the problem of identity testing for multidimensional histogram distributions. A distribution $p : D \to \mathbb{R}_+$, where $D \subseteq \mathbb{R}^d$, is called a $k$-histogram if there exists a partition of the domain into $k$ axis-aligned rectangles such that $p$ is constant within each such rectangle. Histograms are one of the most fundamental nonparametric families of distributions and have been extensively studied in computer science and statistics. We give the first identity tester for this problem with sub-learning sample complexity in any fixed dimension and a nearly-matching sample complexity lower bound. In more detail, let $q$ be an unknown $d$-dimensional $k$-histogram distribution in fixed dimension $d$, and $p$ be an explicitly given $d$-dimensional $k$-histogram. We want to correctly distinguish, with probability at least $2/3$, between the case that $p = q$ versus $\|p - q\|_1 \geq \epsilon$. We design an algorithm for this hypothesis testing problem with sample complexity $O((\sqrt{k}/\epsilon^2) \, 2^{d/2} \log^{2.5d}(k/\epsilon))$ that runs in sample-polynomial time. Our algorithm is robust to model misspecification, i.e., succeeds even if $q$ is only promised to be close to a $k$-histogram. Moreover, for $k = 2^{\Omega(d)}$, we show a sample complexity lower bound of $(\sqrt{k}/\epsilon^2) \cdot \Omega(\log(k)/d)^{d-1}$ when $d \geq 2$. That is, for any fixed dimension $d$, our upper and lower bounds are nearly matching. Prior to our work, the sample complexity of the $d = 1$ case was well-understood, but no algorithm with sub-learning sample complexity was known, even for $d = 2$. Our new upper and lower bounds have interesting conceptual implications regarding the relation between learning and testing in this setting.
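To make the definitions in the abstract concrete, the following is a minimal Python sketch and not the paper's tester. It represents a k-histogram as a list of axis-aligned rectangles with constant densities, computes the L1 distance in the simplified case where both histograms are listed over the same rectangles, and evaluates the growth shape of the stated upper bound with the constant hidden by the O-notation set to 1. The names `Cell`, `total_mass`, `pdf`, `l1_distance_same_partition`, and `upper_bound_shape` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from math import log, prod, sqrt


@dataclass(frozen=True)
class Cell:
    """Axis-aligned rectangle [lo_1, hi_1) x ... x [lo_d, hi_d) with constant density."""
    lo: tuple
    hi: tuple
    density: float

    def volume(self) -> float:
        return prod(h - l for l, h in zip(self.lo, self.hi))

    def contains(self, x) -> bool:
        return all(l <= xi < h for l, xi, h in zip(self.lo, x, self.hi))


def total_mass(hist) -> float:
    """Integral of the histogram density over its domain (1 for a distribution)."""
    return sum(c.density * c.volume() for c in hist)


def pdf(hist, x) -> float:
    """Density at point x; 0 outside every rectangle."""
    for c in hist:
        if c.contains(x):
            return c.density
    return 0.0


def l1_distance_same_partition(p, q) -> float:
    """L1 distance, ASSUMING p and q list identical rectangles in the same order."""
    return sum(abs(a.density - b.density) * a.volume() for a, b in zip(p, q))


def upper_bound_shape(k: int, eps: float, d: int) -> float:
    """(sqrt(k)/eps^2) * 2^(d/2) * log^(2.5 d)(k/eps), hidden constant taken as 1.

    This shows how the bound scales; it is not an actual sample count.
    """
    return (sqrt(k) / eps**2) * 2 ** (d / 2) * log(k / eps) ** (2.5 * d)


# Illustrative 2-histograms on [0, 1)^2 (d = 2, k = 2), split at x_1 = 0.5.
p_hist = [Cell((0.0, 0.0), (0.5, 1.0), 1.5), Cell((0.5, 0.0), (1.0, 1.0), 0.5)]
q_hist = [Cell((0.0, 0.0), (0.5, 1.0), 1.0), Cell((0.5, 0.0), (1.0, 1.0), 1.0)]

print(total_mass(p_hist))                          # mass of p
print(l1_distance_same_partition(p_hist, q_hist))  # ||p - q||_1
print(upper_bound_shape(k=2, eps=0.5, d=2))        # scaling only
```

In this example p and q differ by 0.5 in density on each half of the unit square, so an identity tester with epsilon at most 0.5 would be required to reject. In the actual problem q is unknown and accessible only through samples, which this sketch does not model.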


Pages: 25
