A new approach for data editing and imputation

Authors
Sergio Delgado-Quintero
Juan-José Salazar-González
Affiliation
[1] Universidad de La Laguna, DEIOC
Keywords
Editing; Imputation; Error localization problem; Mathematical programming; Heuristics
DOI
Not available
Abstract
The editing-and-imputation problem concerns finding the errors in a record that does not satisfy a set of consistency rules (edits). Once potential errors have been localized, new values must be imputed to the associated fields. The output dataset should consist of valid records and preserve statistical properties similar to those of the input dataset. Statistical agencies usually do most of this work manually, which consumes a great deal of human resources. This paper presents a mathematical programming model that solves the problem optimally on surveys with categorical values and particular edits. We also describe a heuristic approach for more complex surveys. The heuristic procedure combines the widely accepted hot-deck donor scheme with multivariate regression analysis. It has been implemented with a graphical user interface running on standard personal computers and has been tested on real-world surveys. This paper demonstrates the satisfactory performance of our automatic procedure.
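To make the two steps named in the abstract concrete, the following is a minimal sketch and not the authors' method. It assumes a toy survey with three categorical fields and two edits, all of which are hypothetical. Error localization is done by brute-force enumeration of the smallest set of fields to change, standing in for the paper's mathematical programming model, and imputation uses a simple nearest-donor hot-deck rule without the regression component of the paper's heuristic.

```python
from itertools import combinations, product

# Hypothetical categorical fields and their domains (not taken from the paper).
DOMAINS = {
    "age": ["child", "adult"],
    "marital": ["single", "married"],
    "employment": ["none", "employed"],
}

# Hypothetical edits: each function returns True when the record is consistent.
EDITS = [
    lambda r: not (r["age"] == "child" and r["marital"] == "married"),
    lambda r: not (r["age"] == "child" and r["employment"] == "employed"),
]


def is_valid(record):
    """A record is valid when it satisfies every edit."""
    return all(edit(record) for edit in EDITS)


def localize_errors(record):
    """Return a smallest set of fields whose values can be changed so that
    the record satisfies all edits (exhaustive search, toy sizes only)."""
    fields = list(DOMAINS)
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**record, **dict(zip(subset, values))}
                if is_valid(candidate):
                    return set(subset)
    return None  # no assignment of any fields yields a valid record


def impute_from_donor(record, suspect_fields, donors):
    """Hot-deck imputation: copy the suspect fields from the donor that
    agrees with the record on the most remaining fields, keeping only
    candidates that satisfy all edits."""
    kept = [f for f in DOMAINS if f not in suspect_fields]
    best, best_score = None, -1
    for donor in donors:
        candidate = {**record, **{f: donor[f] for f in suspect_fields}}
        if not is_valid(candidate):
            continue
        score = sum(donor[f] == record[f] for f in kept)
        if score > best_score:
            best, best_score = candidate, score
    return best


record = {"age": "child", "marital": "married", "employment": "employed"}
donors = [
    {"age": "adult", "marital": "married", "employment": "none"},
    {"age": "adult", "marital": "married", "employment": "employed"},
]
suspects = localize_errors(record)
repaired = impute_from_donor(record, suspects, donors)
```

In this toy case, changing the single field `age` is enough to satisfy both edits, so only that field is imputed from the closest donor. Exhaustive search grows exponentially with the number of fields, which is the kind of cost an optimization model or a heuristic is meant to avoid on real surveys.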