A Word Embedding Model Learned from Political Tweets

Cited by: 0
Authors:
Alnajran, Noufa N. [1 ]
Crockett, Keeley A. [1 ]
McLean, David [1 ]
Latham, Annabel [1 ]
Affiliations:
[1] Manchester Metropolitan Univ, Dept Comp Math & Digital Technol, Manchester, Lancs, England
Keywords:
Word Embedding; Language Modelling; Deep Learning; Social Network Analysis; Twitter Analysis;
DOI: not available
CLC Classification: TP [Automation Technology, Computer Technology]
Discipline Code: 0812
Abstract:
Distributed word representations have recently contributed to significant improvements in many natural language processing (NLP) tasks, and distributional semantics has become one of the important trends in machine learning (ML) applications. Word embeddings are distributed representations of words that capture semantic relationships learned from a large corpus of text. In a social media context, the distributed representation of a word is likely to differ from embeddings learned from general text. This is largely due to the unique lexical, semantic, and morphological features of social media text such as tweets, which imply different word vector representations. In this paper, we collect and present a political social media dataset consisting of over four million English tweets. An artificial neural network (NN) is trained to learn word co-occurrence statistics and generate word vectors from the political corpus of tweets. The resulting model is 136 MB and includes representations for a vocabulary of over 86K unique words and phrases. The learned model is expected to contribute to the success of many ML and NLP applications in Online Social Network (OSN) analysis of microblogs, such as semantic similarity and cluster analysis tasks.
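The abstract describes learning word vectors from word co-occurrence in a tweet corpus. The paper's own approach uses a neural network; as a minimal illustration of the same underlying idea (co-occurrence counts turned into dense vectors), the count-based sketch below builds a windowed co-occurrence matrix from tokenised tweets and factorises it with a truncated SVD. All function names, the toy tweets, and the SVD step are assumptions for illustration, not the authors' method.

```python
import numpy as np

def cooccurrence_matrix(tweets, window=2):
    """Build a symmetric word co-occurrence matrix from tokenised tweets.

    Each pair of words appearing within `window` positions of each other
    in the same tweet increments the corresponding matrix cell.
    """
    vocab = sorted({w for t in tweets for w in t})
    index = {w: i for i, w in enumerate(vocab)}
    M = np.zeros((len(vocab), len(vocab)))
    for tokens in tweets:
        for i, w in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    M[index[w], index[tokens[j]]] += 1.0
    return M, index

def embed(M, dim=2):
    """Dense word vectors: the top `dim` left singular vectors, scaled."""
    U, S, _ = np.linalg.svd(M)
    return U[:, :dim] * S[:dim]

# Toy corpus (hypothetical tokenised tweets, not the paper's dataset)
tweets = [["vote", "for", "change"], ["vote", "for", "reform"]]
M, index = cooccurrence_matrix(tweets)
vectors = embed(M, dim=2)  # one 2-d vector per vocabulary word
```

In practice, a skip-gram or CBOW neural model (as used in the paper) is trained instead of an explicit SVD, but both map frequently co-occurring words to nearby vectors.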
Pages: 177-183 (7 pages)