K-Nearest Neighbor (K-NN) based Missing Data Imputation

Cited by: 0
Authors
Murti, Della Murbarani Prawidya [1]
Wibawa, Aji Prasetya [1]
Akbar, Muhammad Iqbal [1]
Ianto, Utomo Puj [1]
Affiliations
[1] State Univ Malang, Elect Engn Dept, Malang, Indonesia
Keywords
Missing Data; Imputation; k-Nearest Neighbor; Naive Bayes Classifier
DOI
10.1109/icsitech46713.2019.8987530
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The performance of a classification algorithm depends on the quality of its training data, so data quality is an important factor in data mining classification results. One problem that is often encountered is missing data. A large amount of missing data leads to a less optimal classification model, because important information that affects the performance of the algorithm can be lost. One way to recover missing data is to fill it in, a process known as imputation. This study uses the K-NN method for imputation in several cases that differ in their missing data mechanisms and models. The imputed datasets are then classified with the Naive Bayes algorithm, and the performance of the imputation method is analyzed on the basis of classification accuracy. The results show that handling missing data with K-NN-based imputation can approach the accuracy obtained on complete data in each case, with a low accuracy difference.
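As a rough illustration of the imputation step described in the abstract, the sketch below fills each missing value with the mean of that feature over the k nearest rows. The abstract does not state the distance metric, the value of k, or how neighbor values are combined, so the Euclidean distance over shared observed features, k=2, the mean aggregation, the function name `knn_impute`, and the toy data are all assumptions made for this example. The Naive Bayes classification step is not shown.

```python
import math


def knn_impute(rows, k=3):
    """Return a copy of rows with None entries filled by K-NN imputation.

    Assumptions (not taken from the paper): distance is the root mean
    squared difference over features observed in both rows, and the
    imputed value is the mean of the k nearest donors' values.
    """
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A donor must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda c: c[0])
                nearest = candidates[:k]
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Hypothetical toy data: the second row is missing its third feature.
data = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],
    [0.9, 1.9, 2.9],
    [8.0, 9.0, 10.0],
]
filled = knn_impute(data, k=2)
print(filled[1])
```

With k=2, the two closest donors are the first and third rows, so the missing entry is filled with roughly 2.95 and the distant fourth row has no influence. A classifier could then be trained on `filled` and its accuracy compared with the accuracy on the complete data.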
Pages: 83-88
Number of pages: 6
Related papers
50 in total
  • [1] Differentially Private k-Nearest Neighbor Missing Data Imputation
    Clifton, Chris
    Hanson, Eric J.
    Merrill, Keith
    Merrill, Shawn
    ACM TRANSACTIONS ON PRIVACY AND SECURITY, 2022, 25 (03)
  • [2] The Grading of Agarwood Oil Quality using k-Nearest Neighbor (k-NN)
    Ismail, Nurlaila
    Rahiman, Mohd Hezri Fazalul
    Taib, Mohd Nasir
    Ali, Nor Azah Mohd
    Jamil, Mailina
    Tajuddin, Saiful Nizam
    2013 IEEE CONFERENCE ON SYSTEMS, PROCESS & CONTROL (ICSPC), 2013, : 1 - 5
  • [3] Estimation of Forestry-Biomass using k-Nearest Neighbor (k-NN) method
    Lee, Jung-soo
    Yoshida, Shigejiro
    JOURNAL OF THE FACULTY OF AGRICULTURE KYUSHU UNIVERSITY, 2013, 58 (02) : 339 - 349
  • [4] Application of the k-nearest neighbor (k-NN) machine learning algorithm for the identification of colorectal cancer based on microRNAs
    Fajar, Rifaldy
    Kurniastuti, Nana Indri
    Jupri, Prihantini
    Wulandari, Titik
    JOURNAL OF GASTROENTEROLOGY AND HEPATOLOGY, 2021, 36 : 54 - 54
  • [5] Predicting persistence in the sediment compartment with a new automatic software based on the k-Nearest Neighbor (k-NN) algorithm
    Manganaro, Alberto
    Pizzo, Fabiola
    Lombardo, Anna
    Pogliaghi, Alberto
    Benfenati, Emilio
    CHEMOSPHERE, 2016, 144 : 1624 - 1630
  • [6] Imputation of missing values in well log data using k-nearest neighbor collaborative filtering
    Kim, Min Jun
    Cho, Yongchae
    COMPUTERS & GEOSCIENCES, 2024, 193
  • [7] SENTIMENT ANALYSIS ON USER SATISFACTION LEVEL OF CELLULAR DATA SERVICE USING THE K-NEAREST NEIGHBOR (K-NN) ALGORITHM
    Wibawa, Desdwyatma Wahyu
    Nasrun, Muhammad
    Setianingsih, Casi
    2018 INTERNATIONAL CONFERENCE ON CONTROL, ELECTRONICS, RENEWABLE ENERGY AND COMMUNICATIONS (ICCEREC), 2018, : 235 - 241
  • [8] Handling Missing Strain (Rate) Curves Using K-Nearest Neighbor Imputation
    Tabassian, Mahdi
    Alessandrini, Martino
    Jasaityte, Ruta
    De Marchi, Luca
    Masetti, Guido
    D'hooge, Jan
    2016 IEEE INTERNATIONAL ULTRASONICS SYMPOSIUM (IUS), 2016,
  • [9] Principal Component Analysis (PCA)-Based k-Nearest Neighbor (k-NN) Analysis of Colonic Mucosal Tissue Fluorescence Spectra
    Kamath, Sudha D.
    Mahato, Krishna K.
    PHOTOMEDICINE AND LASER SURGERY, 2009, 27 (04) : 659 - 668
  • [10] A novel ranked k-nearest neighbors algorithm for missing data imputation
    Khan, Yasir
    Shah, Said Farooq
    Asim, Syed Muhammad
    JOURNAL OF APPLIED STATISTICS, 2024,