Data mining and the impact of missing data

Cited by: 95
Authors
Brown, ML [1]
Kros, JF [2]
Affiliations
[1] Hawaii Pacific Univ, Sch Business, Honolulu, HI USA
[2] E Carolina Univ, Dept Decis Sci, Greenville, NC USA
Keywords
data handling; database management systems; information gathering; information retrieval
DOI
10.1108/02635570310497657
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The actual data mining process deals significantly with prediction, estimation, classification, pattern recognition and the development of association rules. Therefore, the significance of the analysis depends heavily on the accuracy of the database and on the chosen sample data to be used for model training and testing. Data mining is based upon searching the concatenation of multiple databases that usually contain some amount of missing data along with a variable percentage of inaccurate data, pollution, outliers and noise. The issue of missing data must be addressed since ignoring this problem can introduce bias into the models being evaluated and lead to inaccurate data mining conclusions. The objective of this research is to address the impact of missing data on the data mining process.
Pages: 611 - 621
Page count: 11
Related papers
50 in total
  • [1] Data mining of missing persons data
    Blackmore, K
    Bossomaier, T
    Foy, S
    Thomson, D
    CLASSIFICATION AND CLUSTERING FOR KNOWLEDGE DISCOVERY, 2005, 4 : 305 - 314
  • [2] Missing Data in Collaborative Data Mining
    Anton, Carmen Ana
    Matei, Oliviu
    Avram, Anca
    COMPUTATIONAL STATISTICS AND MATHEMATICAL MODELING METHODS IN INTELLIGENT SYSTEMS, VOL. 2, 2019, 1047 : 100 - 109
  • [3] The association rule algorithm with missing data in data mining
    Gerardo, BD
    Lee, J
    Lee, J
    Park, M
    Lee, M
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2004, PT 1, 2004, 3043 : 97 - 105
  • [4] Dealing with Missing Data and Uncertainty in the Context of Data Mining
    Aleryani, Aliya
    Wang, Wenjia
    De La Iglesia, Beatriz
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS (HAIS 2018), 2018, 10870 : 289 - 301
  • [5] Missing Data: The Importance and Impact of Missing Data from Clinical Research
    Padgett, Christine R.
    Skilbeck, Clive E.
    Summers, Mathew James
    BRAIN IMPAIRMENT, 2014, 15 (01) : 1 - 9
  • [6] Mining for equitable health: Assessing the impact of missing data in electronic health records
    Getzen, Emily
    Ungar, Lyle
    Mowery, Danielle
    Jiang, Xiaoqian
    Long, Qi
    JOURNAL OF BIOMEDICAL INFORMATICS, 2023, 139
  • [7] Missing data: the impact of what is not there
    Groenwold, Rolf H. H.
    Dekkers, Olaf M.
    EUROPEAN JOURNAL OF ENDOCRINOLOGY, 2020, 183 (04) : E7 - E9
  • [8] Research on Missing Value Estimation in Data Mining
    Feng, Deng-Chao
    Wang, Zhe
    Shi, Jian-Fang
    Pereira, J. M. Dias
    2008 7TH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-23, 2008, : 2048 - +
  • [9] A generic neural network approach for filling missing data in data mining
    Wei, W
    Tang, Y
    2003 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-5, CONFERENCE PROCEEDINGS, 2003, : 862 - 867
  • [10] A Robust Missing Data-Recovering Technique for Mobility Data Mining
    Zafar, Annam
    Kamran, Muhammad
    Shad, Shafqat Ali
    Nisar, Wasif
    APPLIED ARTIFICIAL INTELLIGENCE, 2017, 31 (5-6) : 425 - 438