Imputing Missing Values for Mixed Numeric and Categorical Attributes Based on Incomplete Data Hierarchical Clustering

Cited by: 0
Authors
Feng, Xiaodong [1]
Wu, Sen [1]
Liu, Yanchi [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Econ & Management, Beijing 100083, Peoples R China
Keywords
Mixed Numeric and Categorical Attributes; Missing Value Imputation; Hierarchical Clustering; Incomplete Set Mixed Feature Vector
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Missing data imputation is a key issue in data pre-processing for data mining. Although many methods for missing value imputation exist, almost every one of them has its limitations and is designed for either numeric attributes or categorical attributes, not both. This paper presents IMIC, a new missing value Imputation method for Mixed numeric and categorical attributes based on Incomplete data hierarchical clustering, which builds on a newly introduced concept, the Incomplete Set Mixed Feature Vector (ISMFV). The effectiveness of the new method is evaluated through comparison experiments on 3 real data sets from UCI.
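The abstract does not describe how ISMFV or the hierarchical clustering step is defined, so the following is not IMIC itself. It is only a minimal sketch of the general idea of cluster-based imputation for mixed attributes, under stated assumptions: cluster labels are taken as already given (the `labels` argument is hypothetical), numeric gaps are filled with the cluster mean, and categorical gaps with the cluster mode. The function name `impute_by_cluster` and the sample data are illustrative only.

```python
from collections import Counter
from statistics import mean


def impute_by_cluster(rows, labels, numeric_cols):
    """Fill None entries using statistics of each row's cluster.

    rows: list of lists; None marks a missing value.
    labels: one cluster label per row (assumed to be given; the
        clustering step of the paper is NOT reproduced here).
    numeric_cols: set of column indices holding numeric attributes;
        all other columns are treated as categorical.
    """
    filled = [list(r) for r in rows]  # copy so the input is not mutated
    n_cols = len(rows[0])
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        for col in range(n_cols):
            observed = [r[col] for r in members if r[col] is not None]
            if not observed:
                continue  # nothing observed in this cluster; leave the gap
            if col in numeric_cols:
                fill = mean(observed)  # numeric: cluster mean
            else:
                fill = Counter(observed).most_common(1)[0][0]  # categorical: cluster mode
            for i, l in enumerate(labels):
                if l == label and filled[i][col] is None:
                    filled[i][col] = fill
    return filled


# Illustrative data: column 0 is numeric, column 1 is categorical.
rows = [
    [1.0, "red"],
    [3.0, None],
    [None, "red"],
    [10.0, "blue"],
    [None, "blue"],
]
labels = [0, 0, 0, 1, 1]  # hypothetical cluster assignment
result = impute_by_cluster(rows, labels, numeric_cols={0})
```

Here the missing colour in the second row is filled from the other members of cluster 0, and each missing number is filled from the mean of its own cluster. A method such as the one in the paper would additionally have to form the clusters from incomplete records, which this sketch sidesteps.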
Pages: 414-424
Number of pages: 11
Related papers
50 in total
  • [1] Clustering algorithm for incomplete data sets with mixed numeric and categorical attributes
    Sen, Wu
    Hong, Chen
    Xiaodong, Feng
    International Journal of Database Theory and Application, 2013, 6 (05): : 95 - 104
  • [2] Entropy based clustering of data streams with mixed numeric and categorical values
    Wang, Shuyun
    Fan, Yingjie
    Zhang, Chenghong
    Xu, HeXiang
    Hao, Xiulan
    Hu, Yunfa
    7TH IEEE/ACIS INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE IN CONJUNCTION WITH 2ND IEEE/ACIS INTERNATIONAL WORKSHOP ON E-ACTIVITY, PROCEEDINGS, 2008, : 140 - +
  • [3] Algorithm for fuzzy clustering of mixed data with numeric and categorical attributes
    Ahmad, A
    Dey, L
    DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, PROCEEDINGS, 2005, 3816 : 561 - 572
  • [4] Clustering mixed numerical and categorical data with missing values
    Dinh, Duy-Tai
    Huynh, Van-Nam
    Sriboonchitta, Songsak
    INFORMATION SCIENCES, 2021, 571 : 418 - 442
  • [5] A CLUSTERING ALGORITHM FOR MIXED NUMERIC AND CATEGORICAL DATA
    Ohn Mar San
    Van-Nam Huynh
    Yoshiteru Nakamori
    Journal of Systems Science and Complexity, 2003, (04) : 562 - 571
  • [6] A GA-based clustering algorithm for large data sets with mixed numeric and categorical values
    Li, J
    Gao, XB
    Jiao, LC
    ICCIMA 2003: FIFTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, PROCEEDINGS, 2003, : 102 - 107
  • [7] A GA-based clustering algorithm for large data sets with mixed numeric and categorical values
    Li, J
    Gao, XB
    Jiao, LC
    THIRD INTERNATIONAL SYMPOSIUM ON MULTISPECTRAL IMAGE PROCESSING AND PATTERN RECOGNITION, PTS 1 AND 2, 2003, 5286 : 171 - 174
  • [8] Clustering based on compressed data for categorical and mixed attributes
    Rendon, Erendira
    Sanchez, Jose Salvador
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS, 2006, 4109 : 817 - 825
  • [9] Clustering Mixed Numeric and Categorical Data With Cuckoo Search
    Ji, Jinchao
    Pang, Wei
    Li, Zairong
    He, Fei
    Feng, Guozhong
    Zhao, Xiaowei
    IEEE ACCESS, 2020, 8 : 30988 - 31003
  • [10] An Initialization Method for Clustering Mixed Numeric and Categorical Data Based on the Density and Distance
    Ji, Jinchao
    Pang, Wei
    Zheng, Yanlin
    Wang, Zhe
    Ma, Zhiqiang
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2015, 29 (07)