A self-attention hybrid emoji prediction model for code-mixed language: (Hinglish)

Cited by: 0
Authors
Gadde Satya Sai Naga Himabindu
Rajat Rao
Divyashikha Sethia
Affiliations
[1] Delhi Technological University, Department of Computer Engineering
Keywords
Emoji prediction; Hinglish; Code mixed; Deep learning; Hybrid model;
DOI: not available
Abstract
Emojis are an essential tool for communication, and resource-rich languages such as English already benefit from emoji prediction systems. However, there is limited research on emoji prediction for resource-poor and code-mixed languages such as Hinglish (Hindi + English), the fourth most used code-mixed language globally. This paper proposes a novel Hinglish Emoji Prediction (HEP) dataset created using Twitter as a corpus, and a hybrid emoji prediction model, BiLSTM Attention Random Forest (BARF), for the code-mixed Hinglish language. The proposed BARF model combines deep learning features with machine learning classification. It begins with a BiLSTM to capture context, then applies self-attention to extract significant text features, and finally uses a random forest to classify these features and predict an emoji. The self-attention mechanism aids learning because Hinglish, as a code-mixed language, lacks consistent grammatical rules. This combination of deep learning, machine learning, and attention is novel for emoji prediction in code-mixed Hinglish. Results on the HEP dataset indicate that the BARF model outperformed previous multilingual and baseline emoji prediction models, achieving an accuracy of 61.14%, precision of 0.66, recall of 0.59, and F1 score of 0.59.
Related Papers (50 total)
  • [41] Multilayer self-attention residual network for code search
    Hu, Haize
    Liu, Jianxun
    Zhang, Xiangping
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (09):
  • [42] A PM2.5 spatiotemporal prediction model based on mixed graph convolutional GRU and self-attention network
    Zhao, Guyu
    Yang, Xiaoyuan
    Shi, Jiansen
    He, Hongdou
    Wang, Qian
    ENVIRONMENTAL POLLUTION, 2025, 368
  • [43] Meta-Learning for Offensive Language Detection in Code-Mixed Texts
    Suresh, Gautham Vadakkekara
    Chakravarthi, Bharathi Raja
    McCrae, John P.
    FIRE 2021: PROCEEDINGS OF THE 13TH ANNUAL MEETING OF THE FORUM FOR INFORMATION RETRIEVAL EVALUATION, 2021, : 58 - 66
  • [44] Vehicle Interaction Behavior Prediction with Self-Attention
    Li, Linhui
    Sui, Xin
    Lian, Jing
    Yu, Fengning
    Zhou, Yafu
    SENSORS, 2022, 22 (02)
  • [45] Mechanics of Next Token Prediction with Self-Attention
    Li, Yingcong
    Huang, Yixiao
    Ildiz, M. Emrullah
    Rawat, Ankit Singh
    Oymak, Samet
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [46] Augmenting sentiment prediction capabilities for code-mixed tweets with multilingual transformers
    Hashmi, Ehtesham
    Yayilgan, Sule Yildirim
    Shaikh, Sarang
    SOCIAL NETWORK ANALYSIS AND MINING, 2024, 14 (01)
  • [47] Modality attention fusion model with hybrid multi-head self-attention for video understanding
    Zhuang, Xuqiang
    Liu, Fang'ai
    Hou, Jian
    Hao, Jianhua
    Cai, Xiaohong
    PLOS ONE, 2022, 17 (10):
  • [48] Pre-trained language model for code-mixed text in Indonesian, Javanese, and English using transformer
    Hidayatullah, Ahmad Fathan
    Apong, Rosyzie Anna
    Lai, Daphne Teck Ching
    Qazi, Atika
    SOCIAL NETWORK ANALYSIS AND MINING, 15 (01)
  • [49] CYBERNETIC MODEL OF SELF-ATTENTION PROCESSES
    CARVER, CS
    JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 1979, 37 (08) : 1251 - 1281
  • [50] A self-attention sequential model for long-term prediction of video streams
    Ge, Yunfeng
    Li, Hongyan
    Shi, Keyi
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2024, 51 (03): 88 - 102