A Word Embedding Model Learned from Political Tweets

Cited by: 0

Authors:

Alnajran, Noufa N. [1]

Crockett, Keeley A. [1]

McLean, David [1]

Latham, Annabel [1]

Affiliation:

[1] Manchester Metropolitan Univ, Dept Comp Math & Digital Technol, Manchester, Lancs, England

Source:

PROCEEDINGS OF 2018 13TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND SYSTEMS (ICCES) | 2018

Keywords:

Word Embedding; Language Modelling; Deep Learning; Social Network Analysis; Twitter Analysis;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Subject classification code:

0812

Abstract:

Distributed word representations have recently contributed to significant improvements in many natural language processing (NLP) tasks, and distributional semantics has become one of the important trends in machine learning (ML) applications. Word embeddings are distributed representations of words that learn semantic relationships from a large corpus of text. In the social context, the distributed representation of a word is likely to differ from its embedding in general text. This is due in part to the unique lexical-semantic features and morphological structure of social media text such as tweets, which imply different word vector representations. In this paper, we collect and present a political social dataset that consists of over four million English tweets. An artificial neural network (NN) is trained to learn word co-occurrence and generate word vectors from the political corpus of tweets. The model is 136 MB and includes word representations for a vocabulary of over 86K unique words and phrases. The learned model is expected to contribute to many ML and NLP applications in the analysis of microblogging online social networks (OSN), such as semantic similarity and cluster analysis tasks.

Pages: 177-183

Page count: 7
