Understanding political polarization using language models: A dataset and method

Cited by: 3
Authors
Gode, Samiran [1]
Bare, Supreeth [1]
Raj, Bhiksha [1, 2]
Yoo, Hyungon [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Mohamed bin Zayed Univ Artificial Intelligence, Abu Dhabi, U Arab Emirates
Keywords
Background information - Health care education - Information information - Informed decision - Language model - Model-based method - Political systems - Political views - Social issues - Wikipedia
DOI
10.1002/aaai.12104
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Our paper aims to analyze political polarization in the US political system using language models, and thereby help voters make an informed decision. The availability of this information will help voters understand their candidates' views on the economy, healthcare, education, and other social issues. Our main contributions are a dataset extracted from Wikipedia that spans the past 120 years and a language model-based method that helps analyze how polarized a candidate is. Our data are divided into two parts, background information and political information about a candidate, since our hypothesis is that the political views of a candidate should be based on reason and be independent of factors such as birthplace, alma mater, and so forth. We further split these data into four phases chronologically, to help understand if and how the polarization among candidates changes. The data have been cleaned to remove biases. To understand the polarization, we begin by showing results from two classical language models, Word2Vec and Doc2Vec. We then use more powerful techniques such as the Longformer, a transformer-based encoder, to assimilate more information and find the nearest neighbors of each candidate based on their political views and their background. The code and data for the project will be made available.
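The nearest-neighbor step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not state which distance measure is used, so cosine similarity is an assumption here, and the candidate names and three-dimensional vectors are hypothetical stand-ins for the document embeddings that a model such as Doc2Vec or the Longformer would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(embeddings, name, k=2):
    """Return the k entries most similar to `name`, best match first."""
    query = embeddings[name]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in embeddings.items()
        if other != name  # a candidate is not its own neighbor
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real ones would come from an encoder
# applied to each candidate's political (or background) text.
toy_embeddings = {
    "candidate_a": [0.9, 0.1, 0.0],
    "candidate_b": [0.8, 0.2, 0.1],
    "candidate_c": [0.0, 0.1, 0.9],
    "candidate_d": [0.1, 0.0, 0.8],
}

neighbors = nearest_neighbors(toy_embeddings, "candidate_a", k=2)
for other, score in neighbors:
    print(f"{other}: {score:.3f}")
```

Running the same lookup once on embeddings of the political text and once on embeddings of the background text would give two neighbor lists per candidate, which is one way to compare the two parts of the data described above.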
Pages: 248-254
Number of pages: 7