Testing the statistical significance of an ultra-high-dimensional naive Bayes classifier

Authors
An, Baiguo [1]
Wang, Hansheng [1]
Guo, Jianhua [1]
Affiliations
[1] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Binary Predictor; Hypothesis Testing; Naive Bayes; Supervised Learning; Text Classification; Ultra-High-Dimensional Data; SELECTION;
DOI
Not available
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
The naive Bayes approach is one of the most popular classification methods. Nevertheless, how to test its statistical significance under an ultra-high-dimensional (UHD) setup is not well understood. To fill this theoretical gap, we propose a novel test statistic whose asymptotic null distribution is standard normal, even if the predictor dimension is considerably larger than the sample size. This makes the proposed method useful for UHD data analysis. Simulation studies are presented to demonstrate its finite-sample performance, and a text classification example is described for illustration.
Pages: 223-229
Page count: 7