Pre-Processing Methods of Data Mining

Authors
Saleem, Asma [1]
Asif, Khadim Hussain [1]
Ali, Ahmad [2]
Awan, Shahid Mahmood [3]
AlGhamdi, Mohammed A. [4]
Affiliations
[1] Univ Engn & Technol, Dept Comp Sci & Engn, Lahore, Pakistan
[2] COMSATS Inst Informat Technol, Dept Biosci, Sahiwal, Pakistan
[3] Univ Engn & Technol, Al Khawarizmi Inst Comp Sci, Lahore, Pakistan
[4] Umm Al Qura Univ, Inst Innovat & Entrepreneurship, Mecca, Saudi Arabia
Keywords
data pre-processing; data mining; outliers; missing values
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Data generation, handling and processing have emerged as the most reliable source for understanding and discovering new facts, knowledge and products in the natural and material sciences. The use of the most efficient techniques in statistical and bioinformatics settings has therefore become routine practice in research and industry. In practice, large datasets are likely to contain inconsistencies and anomalies of all kinds, which obscure the real outcomes for practical problems. For accurate data mining, computer-based data pre-processing techniques offer solutions that help the data conform to normal structures, which in turn considerably improves the performance of machine learning algorithms. In this process, accurately identifying outliers and extreme values and filling gaps pose formidable challenges. Multiple methodologies have therefore been developed to detect these deviating or inconsistent values, called outliers. The data pre-processing techniques discussed in this paper could offer the most suitable solutions for handling missing values and outliers in all kinds of large datasets, such as electric load and weather datasets.
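The abstract names two pre-processing tasks, detecting outliers and filling gaps, without specifying a method in this record. As a rough illustration only, the following Python sketch flags outliers with Tukey's interquartile-range fences and fills missing entries by linear interpolation. Both techniques, the function names, the `k=1.5` multiplier, and the sample load values are assumptions chosen for illustration; they are not taken from the paper.

```python
from statistics import quantiles


def flag_outliers_iqr(values, k=1.5):
    """Return one boolean per entry: True where the value lies outside
    the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]. Missing entries (None)
    are ignored when computing quartiles and are never flagged."""
    observed = [v for v in values if v is not None]
    q1, _, q3 = quantiles(observed, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v is not None and (v < low or v > high) for v in values]


def fill_gaps_linear(values):
    """Fill None entries by linear interpolation between the nearest
    observed neighbours. Leading and trailing gaps copy the nearest
    observed value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("need at least one observed value")
    filled = list(values)
    for i in range(known[0]):                      # leading gap
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):    # trailing gap
        filled[i] = values[known[-1]]
    for left, right in zip(known, known[1:]):      # interior gaps
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def preprocess(values, k=1.5):
    """Treat flagged outliers as missing, then fill every gap."""
    flags = flag_outliers_iqr(values, k)
    masked = [None if flagged else v for v, flagged in zip(values, flags)]
    return fill_gaps_linear(masked)


if __name__ == "__main__":
    # Hypothetical hourly electric load readings with two gaps and one spike.
    load = [310.0, 305.0, None, 315.0, 2900.0, 320.0, 318.0, None, 312.0]
    print(flag_outliers_iqr(load))
    print(preprocess(load))
```

Treating flagged outliers as missing before interpolating keeps a spike from distorting the values used to fill neighbouring gaps. Whether that ordering suits a given load or weather dataset is a judgment call that depends on the data.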
Pages: 451-456
Page count: 6