A fuzzy K-nearest neighbor classifier to deal with imperfect data

Cited by: 17
Authors
Cadenas, Jose M. [1 ]
Carmen Garrido, M. [1 ]
Martinez, Raquel [2 ]
Munoz, Enrique [3 ]
Bonissone, Piero P. [4 ]
Affiliations
[1] Univ Murcia, Dept Informat & Commun Engn, Murcia, Spain
[2] Catholic Univ Murcia, Dept Comp Engn, Murcia, Spain
[3] Univ Milan, Dept Comp Sci, Crema, Italy
[4] Piero P Bonissone Analyt LLC, San Diego, CA USA
Keywords
k-nearest neighbors; Classification; Imperfect data; Distance/dissimilarity measures; Combination methods; PERFORMANCE; RULES; ALGORITHMS;
DOI
10.1007/s00500-017-2567-x
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The k-nearest neighbors method (kNN) is a nonparametric, instance-based method used for regression and classification. To classify a new instance, the kNN method computes its k nearest neighbors and generates a class value from them. Usually, this method requires that the information available in the datasets be precise and accurate, except for the existence of missing values. However, data imperfection is inevitable when dealing with real-world scenarios. In this paper, we present the kNN(imp) classifier, a k-nearest neighbors method to perform classification from datasets with imperfect values. The importance of each neighbor in the output decision is based on its relative distance and its degree of imperfection. Furthermore, by using external parameters, the classifier enables us to define the maximum allowed imperfection, and to decide whether the final output should be derived solely from the greatest-weight class (the best class) or from the best class and a weighted combination of the classes closest to the best one. To test the proposed method, we performed several experiments with both synthetic and real-world datasets with imperfect data. The results, validated through statistical tests, show that the kNN(imp) classifier is robust when working with imperfect data and maintains good performance when compared with other methods in the literature, applied to datasets with or without imperfection.
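The weighting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' kNN(imp) algorithm: the function name `weighted_knn_sketch`, the per-instance imperfection score in [0, 1], the weight formula `(1 - imperfection) / (1 + distance)`, and the use of `max_imperfection` as a simple filter are all assumptions made for illustration. The sketch also uses crisp feature values only, whereas the paper works with imperfect values.

```python
import math
from collections import defaultdict


def weighted_knn_sketch(train, query, k=3, max_imperfection=0.5):
    """Illustrative weighted kNN vote (hypothetical, not the paper's method).

    train: list of (features, label, imperfection), imperfection in [0, 1].
    Returns (best_label, votes_by_label); best_label is None if nothing is usable.
    """
    # Assumption: instances above the allowed imperfection are simply ignored.
    usable = [t for t in train if t[2] <= max_imperfection]

    # Plain Euclidean distance on crisp features.
    nearest = sorted(usable, key=lambda t: math.dist(t[0], query))[:k]
    if not nearest:
        return None, {}

    votes = defaultdict(float)
    for features, label, imperfection in nearest:
        distance = math.dist(features, query)
        # Assumed weight: closer and less imperfect neighbors count more.
        votes[label] += (1.0 - imperfection) / (1.0 + distance)

    best_label = max(votes, key=votes.get)
    return best_label, dict(votes)


if __name__ == "__main__":
    train = [
        ((0.0, 0.0), "a", 0.0),
        ((0.1, 0.0), "a", 0.2),
        ((1.0, 1.0), "b", 0.0),
        ((0.9, 1.0), "b", 0.9),  # too imperfect for the default threshold
    ]
    print(weighted_knn_sketch(train, (0.2, 0.1)))
```

The returned vote dictionary keeps the per-class weights, so a caller could in principle combine the best class with close runners-up, which is the second output mode the abstract mentions; how the paper performs that combination is not stated here.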
Pages: 3313 - 3330
Number of pages: 18
Related papers
50 in total
  • [41] Finger Vein Identification using Fuzzy-based k-Nearest Centroid Neighbor Classifier
    Rosdi, Bakhtiar Affendi
    Jaafar, Haryati
    Ramli, Dzati Athiar
    2ND ISM INTERNATIONAL STATISTICAL CONFERENCE 2014 (ISM-II): EMPOWERING THE APPLICATIONS OF STATISTICAL AND MATHEMATICAL SCIENCES, 2015, 1643 : 649 - 654
  • [42] A Pruned Fuzzy k-Nearest Neighbor Classifier with Application to Electrocardiogram Based Cardiac Arrhytmia Recognition
    Afsar, Fayyaz A.
    Akram, M. U.
    Arif, M.
    Khurshid, J.
    INMIC: 2008 INTERNATIONAL MULTITOPIC CONFERENCE, 2008, : 143 - 148
  • [43] An Enhancement of Fuzzy K-Nearest Neighbor Classifier Using Multi-Local Power Means
    Kumbure, Mahinda Mailagaha
    Luukka, Pasi
    Collan, Mikael
    PROCEEDINGS OF THE 11TH CONFERENCE OF THE EUROPEAN SOCIETY FOR FUZZY LOGIC AND TECHNOLOGY (EUSFLAT 2019), 2019, 1 : 83 - 90
  • [44] An instance selection algorithm for fuzzy K-nearest neighbor
    Zhai, Junhai
    Qi, Jiaxing
    Zhang, Sufang
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2021, 40 (01) : 521 - 533
  • [45] Random projections fuzzy k-nearest neighbor(RPFKNN) for big data classification
    Popescu, Mihail
    Keller, James M.
    2016 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2016, : 1813 - 1817
  • [46] An Optimized Hybrid Fuzzy Weighted k-Nearest Neighbor with the Presence of Data Imbalance
    Bahanshal, Soha A.
    Baraka, Rebhi S.
    Kim, Bayong
    Verdhan, Vaibhav
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (04) : 660 - 665
  • [47] Research on the Improvement of K-Nearest Neighbor Classifier for Imbalanced Text Categorization
    Yang Yanmei
    Xu Linying
    2018 EIGHTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION AND MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC 2018), 2018, : 968 - 972
  • [48] Boosting k-nearest neighbor classifier by means of input space projection
    Garcia-Pedrajas, Nicolas
    Ortiz-Boyer, Domingo
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (07) : 10570 - 10582
  • [49] Enhancing Patient Safety Event Reporting by K-nearest Neighbor Classifier
    Liang, Chen
    Gong, Yang
    CONTEXT SENSITIVE HEALTH INFORMATICS: MANY PLACES, MANY USERS, MANY CONTEXTS, MANY USES, 2015, 218 : 93 - 99
  • [50] Detection and Localization of Myocardial Infarction using K-nearest Neighbor Classifier
    Muhammad Arif
    Ijaz A. Malagore
    Fayyaz A. Afsar
    Journal of Medical Systems, 2012, 36 : 279 - 289