Testing the statistical significance of an ultra-high-dimensional naive Bayes classifier

Cited by: 0
Authors
An, Baiguo [1]
Wang, Hansheng [1]
Guo, Jianhua [1]
Affiliations
[1] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Binary Predictor; Hypothesis Testing; Naive Bayes; Supervised Learning; Text Classification; Ultra-High-Dimensional Data; SELECTION;
DOI
Not available
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
The naive Bayes approach is one of the most popular classification methods. Nevertheless, how to test its statistical significance under an ultra-high-dimensional (UHD) setup is not well understood. To fill this theoretical gap, we propose a novel test statistic whose asymptotic null distribution is standard normal, even if the predictor dimension is considerably larger than the sample size. This makes the proposed method useful for UHD data analysis. Simulation studies are presented to demonstrate its finite-sample performance, and a text classification example is described for illustration.
Pages: 223-229
Number of pages: 7
Related papers
50 in total
  • [1] Adaptive Testing and Performance Analysis using Naive Bayes Classifier
    Agarwal, Sanjana
    Jain, Nirav
    Dholay, Surekha
    INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING TECHNOLOGIES AND APPLICATIONS (ICACTA), 2015, 45 : 70 - 75
  • [2] Variable selection for ultra-high-dimensional logistic models
    Du, Pang
    Wu, Pan
    Liang, Hua
    PERSPECTIVES ON BIG DATA ANALYSIS: METHODOLOGIES AND APPLICATIONS, 2014, 622 : 141 - 158
  • [3] Additive partially linear models for ultra-high-dimensional regression
    Li, Xinyi
    Wang, Li
    Nettleton, Dan
    STAT, 2019, 8 (01):
  • [4] Testing predictor significance with ultra high dimensional multivariate responses
    Ma, Yingying
    Lan, Wei
    Wang, Hansheng
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2015, 83 : 275 - 286
  • [5] Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Additive Models
    Fan, Jianqing
    Feng, Yang
    Song, Rui
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2011, 106 (494) : 544 - 557
  • [6] Group screening for ultra-high-dimensional feature under linear model
    Niu, Yong
    Zhang, Riquan
    Liu, Jicai
    Li, Huapeng
    STATISTICAL THEORY AND RELATED FIELDS, 2020, 4 (01) : 43 - 54
  • [7] Predictive model for admission uncertainty in high education using Naive Bayes classifier
    Rawal, Atul
    Lal, Bechoo
    JOURNAL OF INDIAN BUSINESS RESEARCH, 2023, 15 (02) : 262 - 277
  • [8] Nonparametric Independence Screening in Sparse Ultra-High-Dimensional Varying Coefficient Models
    Fan, Jianqing
    Ma, Yunbei
    Dai, Wei
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2014, 109 (507) : 1270 - 1284
  • [9] Feasibility of ultra-high-dimensional flow imaging for rapid pediatric cardiopulmonary MRI
    Joseph Y Cheng
    Tao Zhang
    John M Pauly
    Shreyas Vasanawala
    Journal of Cardiovascular Magnetic Resonance, 18 (Suppl 1)
  • [10] Ultra-high-dimensional multi-level optimisation strategies for electrical machines
    Liu, Chengcheng
    Zhang, Shiwei
    Zhang, Hongming
    Wang, Youhua
    Liu, Lin
    IET ELECTRIC POWER APPLICATIONS, 2024, 18 (11) : 1507 - 1517