A new approach for data editing and imputation

Authors
Sergio Delgado-Quintero
Juan-José Salazar-González
Affiliations
[1] Universidad de La Laguna, DEIOC
Keywords
Editing; Imputation; Error localization problem; Mathematical programming; Heuristics
DOI
Not available
Abstract
The editing-and-imputation problem consists of finding the errors in a record that does not satisfy a set of consistency rules (edits). Once the potentially erroneous fields have been localized, new values must be imputed to them. The output dataset should consist of valid records only and should preserve statistical properties similar to those of the input dataset. Statistical agencies usually do most of this work manually, which consumes a great deal of human resources. This paper presents a mathematical programming model that solves the problem to optimality on surveys with categorical values and particular edits. We also describe a heuristic approach for more complex surveys. The heuristic procedure combines the widely accepted hot-deck donor scheme with multivariate regression analysis. It has been implemented with a graphical user interface, runs on standard personal computers, and has been tested on real-world surveys. The results show that the automatic procedure performs satisfactorily.
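To make the two stages in the abstract concrete, the following is a minimal sketch, not the authors' method. It assumes a tiny hypothetical categorical survey (the fields, categories, and edits in `DOMAINS` and `EDITS` are invented for illustration). Error localization is done here by brute-force enumeration of the smallest set of fields whose change makes the record valid, standing in for the paper's mathematical programming model. Imputation is a plain nearest-donor hot-deck; the regression component of the paper's heuristic is not shown.

```python
from itertools import combinations, product

# Hypothetical categorical survey: field name -> allowed categories.
DOMAINS = {
    "age": ["child", "adult"],
    "marital": ["single", "married"],
    "employment": ["none", "employed"],
}

# Hypothetical edits: each returns True when a record is INCONSISTENT.
EDITS = [
    lambda r: r["age"] == "child" and r["marital"] == "married",
    lambda r: r["age"] == "child" and r["employment"] == "employed",
]


def is_valid(record):
    """A record is valid when it violates none of the edits."""
    return not any(edit(record) for edit in EDITS)


def localize_errors(record):
    """Return a smallest set of fields whose values can be changed so
    that the record becomes valid (exhaustive search, small cases only)."""
    fields = list(DOMAINS)
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if is_valid(candidate):
                    return set(subset)
    return None  # no assignment satisfies the edits


def hot_deck_impute(record, suspect_fields, donors):
    """Copy the suspect fields from the donor that agrees with the record
    on the most remaining fields, keeping only results that are valid."""
    keep = [f for f in DOMAINS if f not in suspect_fields]
    best, best_score = None, -1
    for donor in donors:
        candidate = {**record, **{f: donor[f] for f in suspect_fields}}
        if not is_valid(candidate):
            continue
        score = sum(record[f] == donor[f] for f in keep)
        if score > best_score:
            best, best_score = candidate, score
    return best


record = {"age": "child", "marital": "married", "employment": "employed"}
donors = [
    {"age": "adult", "marital": "single", "employment": "employed"},
    {"age": "adult", "marital": "married", "employment": "employed"},
]
suspects = localize_errors(record)
fixed = hot_deck_impute(record, suspects, donors)
```

In this toy case the record violates both edits, and changing `age` alone repairs it, so the search stops at a single field; the second donor is chosen because it matches the record on both untouched fields. The exhaustive search grows exponentially with the number of fields, which is one reason an optimization model or a heuristic is needed for real surveys.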