An efficient approach for imputation and classification of medical data values using class-based clustering of medical records

Cited by: 33
Authors
Yelipe, UshaRani [1 ]
Porika, Sammulal [3 ]
Golla, Madhu [2 ]
Affiliations
[1] VNR Vignana Jyothi Inst Engn & Technol, Hyderabad, Andhra Prades, India
[2] VNR Vignana Jyothi Inst Engn & Technol, Dept Informat Technol, Hyderabad, Andhra Prades, India
[3] JNTUH Coll Engn, Karimnagar, India
Keywords
Imputation; Medical record; Clustering; Classifiers; Missing values; Prediction; Missing value estimation
DOI
10.1016/j.compeleceng.2017.11.030
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Medical data is rarely free of missing values, and this holds even when the data is collected and sampled through clinical trials. Existing imputation techniques do not address the problem of high dimensionality, and the distance functions they apply also suffer from the curse of dimensionality. There is a need for innovative approaches and methods for accurate and efficient analysis of medical records. This research proposes an improved imputation approach called IM-CBC (Imputation based on class-based clustering) and a classifier termed the Class-Based-Clustering Classifier (CBCC-IM). Experiments are performed on nine benchmark datasets, and the results recorded using the IM-CBC imputation approach are compared to ten imputation approaches using the KNN, SVM and C4.5 classifiers and to the CBCC classifier using Euclidean distance and fuzzy Gaussian similarity functions. The results show that classifier performance is improved or at least close to that of the existing approaches. The CBCC-IM classifier records the highest accuracy compared to all other classifiers on benchmark datasets such as Cleveland, Ecoli, Iris, Pima, Wine and Wisconsin. (C) 2017 Elsevier Ltd. All rights reserved.
Pages: 487-504
Number of pages: 18