A Word Embedding Model Learned from Political Tweets

Cited by: 0
Authors
Alnajran, Noufa N. [1 ]
Crockett, Keeley A. [1 ]
McLean, David [1 ]
Latham, Annabel [1 ]
Affiliations
[1] Manchester Metropolitan Univ, Dept Comp Math & Digital Technol, Manchester, Lancs, England
Keywords
Word Embedding; Language Modelling; Deep Learning; Social Network Analysis; Twitter Analysis;
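DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract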
Distributed word representations have recently contributed to significant improvements in many natural language processing (NLP) tasks. Distributional semantics has become one of the important trends in machine learning (ML) applications. Word embeddings are distributed representations of words that learn semantic relationships from a large corpus of text. In the social context, the distributed representation of a word is likely to differ from general-text word embeddings. This is partly due to the unique lexical semantic features and morphological structure of social media text such as tweets, which imply different word vector representations. In this paper, we collect and present a political social dataset that consists of over four million English tweets. An artificial neural network (NN) is trained to learn word co-occurrence and generate word vectors from the political corpus of tweets. The model is 136MB and includes word representations for a vocabulary of over 86K unique words and phrases. The learned model should contribute to the success of many ML and NLP applications in microblogging online social network (OSN) analysis, such as semantic similarity and cluster analysis tasks.
Pages: 177-183
Number of pages: 7