Statistical data integration in survey sampling: a review

Cited by: 43
Authors
Yang, Shu [1]
Kim, Jae Kwang [2]
Affiliations
[1] North Carolina State Univ, Dept Stat, Raleigh, NC USA
[2] Iowa State Univ, Dept Stat, Ames, IA 50011 USA
Funding
National Science Foundation (USA)
Keywords
Generalizability; Meta-analysis; Missing at random; Transportability; PROPENSITY SCORE; COMBINING INFORMATION; MULTIPLE SURVEYS; GENERALIZING EVIDENCE; ROBUST ESTIMATION; CAUSAL INFERENCE; MISSING DATA; PROBABILITY; CALIBRATION; IMPUTATION;
DOI
10.1007/s42081-020-00093-w
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Finite population inference is a central goal in survey sampling. Probability sampling is the main statistical approach to finite population inference. Challenges arise due to high cost and increasing non-response rates. Data integration provides a timely solution by leveraging multiple data sources to provide more robust and efficient inference than using any single data source alone. The technique for data integration varies depending on types of samples and available information to be combined. This article provides a systematic review of data integration techniques for combining probability samples, probability and non-probability samples, and probability and big data samples. We discuss a wide range of integration methods such as generalized least squares, calibration weighting, inverse probability weighting, mass imputation, and doubly robust methods. Finally, we highlight important questions for future research.
Pages: 625-650
Number of pages: 26