Scaling Data from Multiple Sources

Cited by: 0
Authors
Enamorado, Ted [1]
Lopez-Moctezuma, Gabriel [2]
Ratkovic, Marc [3]
Affiliations
[1] Washington Univ, Dept Polit Sci, St Louis, MO 63130 USA
[2] CALTECH, Div Humanities & Social Sci, Pasadena, CA 91125 USA
[3] Princeton Univ, Dept Polit, Princeton, NJ 08544 USA
Keywords
multidimensional scaling; principal component analysis; U.S. Senate; BAYESIAN FACTOR-ANALYSIS; MODELS; PREFERENCES; LIKELIHOOD; FRAMEWORK
DOI
10.1017/pan.2020.24
Chinese Library Classification number
D0 [Political Science, Political Theory]
Discipline classification code
0302; 030201
Abstract
We introduce a method for scaling two datasets from different sources. The proposed method estimates a latent factor common to both datasets as well as an idiosyncratic factor unique to each. In addition, it offers a flexible modeling strategy that permits the scaled locations to be a function of covariates, and efficient implementation allows for inference through resampling. A simulation study shows that our proposed method improves over existing alternatives in capturing the variation common to both datasets, as well as the latent factors specific to each. We apply our proposed method to vote and speech data from the 112th U.S. Senate. We recover a shared subspace that aligns with a standard ideological dimension running from liberals to conservatives, while recovering the words most associated with each senator's location. In addition, we estimate a word-specific subspace that ranges from national security to budget concerns, and a vote-specific subspace with Tea Party senators on one extreme and senior committee leaders on the other.
Pages: 212-235
Page count: 24
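
The abstract describes estimating one latent factor shared by two datasets plus an idiosyncratic factor for each. The sketch below illustrates that decomposition on simulated data only. It is not the authors' estimator: it swaps in a plain SVD of the stacked, standardized matrices, and every name, dimension, and weight in it (`simulate`, `top_factor`, `remove_factor`, the 0.5 and 0.3 scalings) is a hypothetical choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_a, p_b = 200, 30, 20  # units, columns in source A, columns in source B

# Hypothetical ground truth: one shared factor plus one factor per source.
shared = rng.normal(size=n)
own_a = rng.normal(size=n)
own_b = rng.normal(size=n)


def simulate(own, p):
    """Build an n x p matrix driven by the shared factor and one source-specific factor."""
    signal = np.outer(shared, rng.normal(size=p)) + 0.5 * np.outer(own, rng.normal(size=p))
    return signal + 0.3 * rng.normal(size=(n, p))


def standardize(X):
    """Center each column and scale it to unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def top_factor(X):
    """Return the first left singular vector of X (unit length, sign arbitrary)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, 0]


def remove_factor(X, f):
    """Project the direction f out of every column of X."""
    f = f / np.linalg.norm(f)
    return X - np.outer(f, f @ X)


def abs_corr(x, y):
    """Absolute correlation, since the sign of a singular vector is arbitrary."""
    return abs(np.corrcoef(x, y)[0, 1])


A_std = standardize(simulate(own_a, p_a))
B_std = standardize(simulate(own_b, p_b))

# Shared factor: leading direction of the two sources stacked side by side.
shared_hat = top_factor(np.hstack([A_std, B_std]))

# Idiosyncratic factors: leading direction of each source after removing the shared one.
own_a_hat = top_factor(remove_factor(A_std, shared_hat))
own_b_hat = top_factor(remove_factor(B_std, shared_hat))

print("shared:", abs_corr(shared_hat, shared))
print("own A: ", abs_corr(own_a_hat, own_a))
print("own B: ", abs_corr(own_b_hat, own_b))
```

The sequential "stack, then residualize" design is only the simplest way to separate common from source-specific variation; it ignores the covariate modeling and resampling-based inference that the abstract attributes to the proposed method.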