A Hybrid Feature Extraction Selection Approach for High-Dimensional Non-Gaussian Data Clustering

Cited by: 99
Authors
Boutemedjet, Sabri [1]
Bouguila, Nizar [2]
Ziou, Djemel [1]
Affiliations
[1] Univ Sherbrooke, Dept Informat, Sherbrooke, PQ J1K 2R1, Canada
[2] Concordia Univ, Concordia Inst Informat Engn CIISE, Montreal, PQ H3G 1T7, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Unsupervised learning; mixture models; feature selection; dimensionality reduction; generalized Dirichlet mixture; EM; MML; information theory; object image categorization; statistical pattern recognition; Dirichlet mixture model; unsupervised selection
DOI
10.1109/TPAMI.2008.155
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an unsupervised approach for feature selection and extraction in mixtures of generalized Dirichlet (GD) distributions. Our method defines a new mixture model that is able to extract independent and non-Gaussian features without loss of accuracy. The proposed model is learned using the Expectation-Maximization algorithm by minimizing the message length of the data set. Experimental results show the merits of the proposed methodology in the categorization of object images.
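As a rough illustration of why GD distributions lend themselves to independent, non-Gaussian features, the sketch below relies on a standard property of the generalized Dirichlet: a simple change of variables maps a GD-distributed vector to independent Beta variables. This is not the paper's learning algorithm (no mixture, EM, or MML step is shown), and the function names `gd_logpdf`, `to_independent_betas`, and `gd_logpdf_via_betas`, along with the example parameters, are hypothetical choices made for this sketch.

```python
import math


def log_beta_fn(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def gd_logpdf(x, alpha, beta):
    """Log-density of a generalized Dirichlet at x (all x_d > 0, sum(x) < 1)."""
    D = len(x)
    total = 0.0
    cum = 0.0  # running sum x_1 + ... + x_d
    for d in range(D):
        cum += x[d]
        if d < D - 1:
            gamma = beta[d] - alpha[d + 1] - beta[d + 1]
        else:
            gamma = beta[d] - 1.0
        total += (-log_beta_fn(alpha[d], beta[d])
                  + (alpha[d] - 1.0) * math.log(x[d])
                  + gamma * math.log(1.0 - cum))
    return total


def to_independent_betas(x):
    """Map x to w with w_1 = x_1 and w_d = x_d / (1 - x_1 - ... - x_{d-1})."""
    w = []
    remaining = 1.0  # 1 minus the sum of the components seen so far
    for xd in x:
        w.append(xd / remaining)
        remaining -= xd
    return w


def beta_logpdf(w, a, b):
    """Log-density of Beta(a, b) at w in (0, 1)."""
    return (-log_beta_fn(a, b)
            + (a - 1.0) * math.log(w)
            + (b - 1.0) * math.log(1.0 - w))


def gd_logpdf_via_betas(x, alpha, beta):
    """Same log-density, computed as independent Betas minus the log-Jacobian."""
    w = to_independent_betas(x)
    log_jac = 0.0
    remaining = 1.0
    for xd in x:
        log_jac += math.log(remaining)  # diagonal entry of dx/dw for this component
        remaining -= xd
    betas = sum(beta_logpdf(wd, a, b) for wd, a, b in zip(w, alpha, beta))
    return betas - log_jac


# Hypothetical example values, chosen only to exercise the functions.
x = [0.2, 0.3, 0.1]
alpha = [2.0, 3.0, 1.5]
beta = [4.0, 2.5, 3.0]
print(gd_logpdf(x, alpha, beta), gd_logpdf_via_betas(x, alpha, beta))
```

The two printed values should agree up to floating-point error, which is the numerical counterpart of the factorization into independent Beta features.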
Pages: 1429-1443
Number of pages: 15