A Study on Performance Enhancement by Integrating Neural Topic Attention with Transformer-Based Language Model

Cited by: 1
Authors
Um, Taehum [1]
Kim, Namhyoung [1]
Affiliations
[1] Gachon Univ, Dept Appl Stat, 1342 Seongnam Daero, Seongnam 13120, South Korea
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, No. 17
Funding
National Research Foundation of Singapore;
Keywords
natural language processing; neural topic model; ELECTRA; ALBERT; multi-classification;
DOI
10.3390/app14177898
Chinese Library Classification
O6 [Chemistry];
Discipline Classification Code
0703;
Abstract
As an extension of the transformer architecture, the BERT model has introduced a new paradigm for natural language processing, achieving impressive results in various downstream tasks. However, high-performance BERT-based models, such as ELECTRA, ALBERT, and RoBERTa, suffer from limitations such as poor continuous learning capability and insufficient understanding of domain-specific documents. To address these issues, we propose the use of an attention mechanism to combine BERT-based models with neural topic models. Unlike traditional stochastic topic modeling, neural topic modeling employs artificial neural networks to learn topic representations. Furthermore, neural topic models can be integrated with other neural models and trained to identify latent variables in documents, thereby enabling BERT-based models to sufficiently comprehend the contexts of specific fields. We conducted experiments on three datasets (the Movie Review Dataset (MRD), 20Newsgroups, and YELP) to evaluate our model's performance. Compared to the vanilla model, the proposed model achieved an accuracy improvement of 1-2% for the ALBERT model in multiclassification tasks across all three datasets, while the ELECTRA model showed an accuracy improvement of less than 1%.
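This record does not include the paper's implementation, but the abstract's core idea, deriving an attention query from a neural topic model's latent topic vector and using it to attend over the token states of a BERT-style encoder such as ALBERT or ELECTRA, can be illustrated with a minimal PyTorch sketch. The module below is an assumption-laden illustration, not the authors' architecture: the class name TopicAttentionHead, the VAE-style bag-of-words topic encoder, the single-query attention, and all layer sizes are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicAttentionHead(nn.Module):
    """Hypothetical sketch: fuse transformer token states with a neural
    topic vector via attention, then classify. Names and sizes assumed."""

    def __init__(self, hidden=768, vocab=2000, n_topics=50, n_classes=20):
        super().__init__()
        # VAE-style neural topic encoder: bag-of-words -> latent topic vector
        self.bow_enc = nn.Sequential(nn.Linear(vocab, 256), nn.ReLU())
        self.mu = nn.Linear(256, n_topics)
        self.logvar = nn.Linear(256, n_topics)
        self.topic_proj = nn.Linear(n_topics, hidden)  # topic space -> model dim
        self.cls = nn.Linear(hidden, n_classes)

    def forward(self, token_states, bow):
        # token_states: (B, T, hidden) last hidden states of a BERT-style model
        # bow:          (B, vocab) bag-of-words counts for the same documents
        h = self.bow_enc(bow)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        theta = F.softmax(z, dim=-1)                           # topic proportions
        q = self.topic_proj(theta).unsqueeze(1)                # (B, 1, hidden) query
        scores = q @ token_states.transpose(1, 2) / token_states.size(-1) ** 0.5
        ctx = (F.softmax(scores, dim=-1) @ token_states).squeeze(1)
        return self.cls(ctx)                                   # (B, n_classes)

# Usage with random stand-ins for encoder outputs and document word counts:
model = TopicAttentionHead()
logits = model(torch.randn(4, 128, 768), torch.rand(4, 2000))  # -> (4, 20)
```

In a real pipeline, token_states would come from a pretrained ALBERT or ELECTRA encoder, and the topic encoder would also be trained with a reconstruction-plus-KL objective, which this sketch omits.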
Pages: 14
Related Papers
50 records in total
  • [41] PETR: Rethinking the Capability of Transformer-Based Language Model in Scene Text Recognition
    Wang, Yuxin
    Xie, Hongtao
    Fang, Shancheng
    Xing, Mengting
    Wang, Jing
    Zhu, Shenggao
    Zhang, Yongdong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5585 - 5598
  • [42] JavaBERT: Training a transformer-based model for the Java programming language
    De Sousa, Nelson Tavares
    Hasselbring, Wilhelm
    2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING WORKSHOPS (ASEW 2021), 2021, : 90 - 95
  • [43] Automatic assessment of divergent thinking in Chinese language with TransDis: A transformer-based language model approach
    Yang, Tianchen
    Zhang, Qifan
    Sun, Zhaoyang
    Hou, Yubo
    BEHAVIOR RESEARCH METHODS, 2024, 56 (06) : 5798 - 5819
  • [44] High entropy alloy property predictions using a transformer-based language model
    Kamnis, Spyros
    Delibasis, Konstantinos
    SCIENTIFIC REPORTS, 15 (1)
  • [45] DeePathNet: A Transformer-Based Deep Learning Model Integrating Multiomic Data with Cancer Pathways
    Cai, Zhaoxiang
    Poulos, Rebecca C.
    Aref, Adel
    Robinson, Phillip J.
    Reddel, Roger R.
    Zhong, Qing
    CANCER RESEARCH COMMUNICATIONS, 2024, 4 (12) : 3151 - 3164
  • [46] Topic Compositional Neural Language Model
    Wang, Wenlin
    Gan, Zhe
    Wang, Wenqi
    Shen, Dinghan
    Huang, Jiaji
    Ping, Wei
    Satheesh, Sanjeev
    Carin, Lawrence
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [47] High performance binding affinity prediction with a Transformer-based surrogate model
    Vasan, Archit
    Gokdemir, Ozan
    Brace, Alexander
    Ramanathan, Arvind
    Brettin, Thomas
    Stevens, Rick
    Vishwanath, Venkatram
    2024 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW 2024, 2024, : 571 - 580
  • [48] A transformer-based neural language model that synthesizes brain activation maps from free-form text queries
    Ngo, Gia H.
    Nguyen, Minh
    Chen, Nancy F.
    Sabuncu, Mert R.
    MEDICAL IMAGE ANALYSIS, 2022, 81
  • [49] Classification of hyperspectral and LiDAR data by transformer-based enhancement
    Pan, Jiechen
    Shuai, Xing
    Xu, Qing
    Dai, Mofan
    Zhang, Guoping
    Wang, Guo
    REMOTE SENSING LETTERS, 2024, 15 (10) : 1074 - 1084
  • [50] An Exploration of Length Generalization in Transformer-Based Speech Enhancement
    Zhang, Qiquan
    Zhu, Hongxu
    Qian, Xinyuan
    Ambikairajah, Eliathamby
    Li, Haizhou
    INTERSPEECH 2024, 2024, : 1725 - 1729