A self-attention hybrid emoji prediction model for code-mixed language: (Hinglish)

Cited by: 0
Authors
Gadde Satya Sai Naga Himabindu
Rajat Rao
Divyashikha Sethia
Affiliations
[1] Delhi Technological University, Department of Computer Engineering
Keywords
Emoji prediction; Hinglish; Code mixed; Deep learning; Hybrid model;
DOI: not available
Abstract
Emojis are an essential tool for communication, and resource-rich languages such as English already benefit from emoji prediction systems. However, there is limited research on emoji prediction for resource-poor and code-mixed languages such as Hinglish (Hindi + English), the fourth most used code-mixed language globally. This paper proposes a novel Hinglish Emoji Prediction (HEP) dataset created using Twitter as a corpus, and a hybrid emoji prediction model, BiLSTM Attention Random Forest (BARF), for the code-mixed Hinglish language. The proposed BARF model combines deep learning features with machine learning classification. It begins with a BiLSTM to capture context, then applies self-attention to extract significant text features, and finally uses a random forest to classify these features and predict an emoji. The self-attention mechanism aids learning because Hinglish, as a code-mixed language, lacks consistent grammatical rules. This combination of deep learning, machine learning, and attention is novel for emoji prediction in code-mixed Hinglish. Results on the HEP dataset indicate that the BARF model outperformed previous multilingual and baseline emoji prediction models, achieving an accuracy of 61.14%, precision of 0.66, recall of 0.59, and F1 score of 0.59.
Related Papers (50 total)
  • [41] Multilayer self-attention residual network for code search
    Hu, Haize
    Liu, Jianxun
    Zhang, Xiangping
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (09):
  • [42] A PM2.5 spatiotemporal prediction model based on mixed graph convolutional GRU and self-attention network
    Zhao, Guyu
    Yang, Xiaoyuan
    Shi, Jiansen
    He, Hongdou
    Wang, Qian
    ENVIRONMENTAL POLLUTION, 2025, 368
  • [43] Meta-Learning for Offensive Language Detection in Code-Mixed Texts
    Suresh, Gautham Vadakkekara
    Chakravarthi, Bharathi Raja
    McCrae, John P.
    FIRE 2021: PROCEEDINGS OF THE 13TH ANNUAL MEETING OF THE FORUM FOR INFORMATION RETRIEVAL EVALUATION, 2021, : 58 - 66
  • [44] Vehicle Interaction Behavior Prediction with Self-Attention
    Li, Linhui
    Sui, Xin
    Lian, Jing
    Yu, Fengning
    Zhou, Yafu
    SENSORS, 2022, 22 (02)
  • [45] Mechanics of Next Token Prediction with Self-Attention
    Li, Yingcong
    Huang, Yixiao
    Ildiz, M. Emrullah
    Rawat, Ankit Singh
    Oymak, Samet
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [46] Augmenting sentiment prediction capabilities for code-mixed tweets with multilingual transformers
    Hashmi, Ehtesham
    Yayilgan, Sule Yildirim
    Shaikh, Sarang
    SOCIAL NETWORK ANALYSIS AND MINING, 2024, 14 (01)
  • [47] Modality attention fusion model with hybrid multi-head self-attention for video understanding
    Zhuang, Xuqiang
    Liu, Fang'ai
    Hou, Jian
    Hao, Jianhua
    Cai, Xiaohong
    PLOS ONE, 2022, 17 (10):
  • [48] Pre-trained language model for code-mixed text in Indonesian, Javanese, and English using transformer
    Hidayatullah, Ahmad Fathan
    Apong, Rosyzie Anna
    Lai, Daphne Teck Ching
    Qazi, Atika
    SOCIAL NETWORK ANALYSIS AND MINING, 15 (01)
  • [49] CYBERNETIC MODEL OF SELF-ATTENTION PROCESSES
    CARVER, CS
    JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 1979, 37 (08) : 1251 - 1281
  • [50] A self-attention sequential model for long-term prediction of video streams
    Ge, Yunfeng
    Li, Hongyan
    Shi, Keyi
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2024, 51 (03): 88 - 102