A Study on Performance Enhancement by Integrating Neural Topic Attention with Transformer-Based Language Model

Cited by: 1
Authors
Um, Taehum [1 ]
Kim, Namhyoung [1 ]
Affiliations
[1] Gachon Univ, Dept Appl Stat, 1342 Seongnam Daero, Seongnam 13120, South Korea
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, Issue 17
Funding
National Research Foundation, Singapore;
Keywords
natural language processing; neural topic model; ELECTRA; ALBERT; multi-classification;
DOI
10.3390/app14177898
Chinese Library Classification
O6 [Chemistry];
Discipline Code
0703;
Abstract
As an extension of the transformer architecture, the BERT model has introduced a new paradigm for natural language processing, achieving impressive results in various downstream tasks. However, high-performance BERT-based models such as ELECTRA, ALBERT, and RoBERTa suffer from limitations, including poor continuous learning capability and an insufficient understanding of domain-specific documents. To address these issues, we propose using an attention mechanism to combine BERT-based models with neural topic models. Unlike traditional stochastic topic modeling, neural topic modeling employs artificial neural networks to learn topic representations. Furthermore, neural topic models can be integrated with other neural models and trained to identify latent variables in documents, thereby enabling BERT-based models to better comprehend the contexts of specific fields. We conducted experiments on three datasets, the Movie Review Dataset (MRD), 20Newsgroups, and YELP, to evaluate the model's performance. Compared with the vanilla models, the proposed approach improved the accuracy of the ALBERT model by 1-2% on multiclass classification tasks across all three datasets, while the ELECTRA model showed an accuracy improvement of less than 1%.
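The abstract describes fusing a BERT-family encoder with a neural topic model through an attention mechanism. The PyTorch sketch below illustrates one plausible way to wire this up; it is not the authors' published architecture, and the backbone checkpoint, the bag-of-words topic encoder, the number of topics, and the concatenation-based classification head are all illustrative assumptions.

# Minimal sketch (assumptions noted above), not the paper's released implementation.
import torch
import torch.nn as nn
from transformers import AutoModel

class NeuralTopicEncoder(nn.Module):
    """Hypothetical bag-of-words encoder that outputs document-topic proportions."""
    def __init__(self, vocab_size: int, num_topics: int = 50, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(vocab_size, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_topics),
        )

    def forward(self, bow: torch.Tensor) -> torch.Tensor:
        # theta: (batch, num_topics), a soft topic mixture per document
        return torch.softmax(self.net(bow), dim=-1)

class TopicAttentionClassifier(nn.Module):
    """Uses the topic vector as an attention query over token states, then classifies."""
    def __init__(self, backbone: str, vocab_size: int, num_topics: int, num_classes: int):
        super().__init__()
        self.lm = AutoModel.from_pretrained(backbone)   # e.g. an ELECTRA or ALBERT checkpoint
        hidden = self.lm.config.hidden_size
        self.topic_model = NeuralTopicEncoder(vocab_size, num_topics)
        self.topic_proj = nn.Linear(num_topics, hidden)
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(hidden * 2, num_classes)

    def forward(self, input_ids, attention_mask, bow):
        tokens = self.lm(input_ids=input_ids,
                         attention_mask=attention_mask).last_hidden_state  # (B, T, H)
        theta = self.topic_model(bow)                    # (B, K) topic proportions
        query = self.topic_proj(theta).unsqueeze(1)      # (B, 1, H) topic-derived query
        topic_view, _ = self.attn(query, tokens, tokens,
                                  key_padding_mask=(attention_mask == 0))
        cls = tokens[:, 0]                               # first-token ([CLS]-style) representation
        return self.classifier(torch.cat([cls, topic_view.squeeze(1)], dim=-1))

In this sketch the document-topic proportions act as the attention query over the transformer's token states, so the pooled representation emphasizes topic-relevant tokens before classification.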
Pages: 14
Related Papers
50 records in total
  • [11] TransPolymer: a Transformer-based language model for polymer property predictions
    Xu, Changwen
    Wang, Yuyang
    Farimani, Amir Barati
    NPJ COMPUTATIONAL MATERIALS, 2023, 9 (01)
  • [12] Transformer-Based Single-Cell Language Model: A Survey
    Lan, Wei
    He, Guohang
    Liu, Mingyang
    Chen, Qingfeng
    Cao, Junyue
    Peng, Wei
    BIG DATA MINING AND ANALYTICS, 2024, 7 (04): 1169-1186
  • [14] Generating Qualitative Descriptions of Diagrams with a Transformer-Based Language Model
    Schorlemmer, Marco
    Ballout, Mohamad
    Kuehnberger, Kai-Uwe
    DIAGRAMMATIC REPRESENTATION AND INFERENCE, DIAGRAMS 2024, 2024, 14981: 61-75
  • [15] Deciphering "the language of nature": A transformer-based language model for deleterious mutations in proteins
    Jiang, Theodore T.
    Fang, Li
    Wang, Kai
    INNOVATION, 2023, 4 (05)
  • [16] Transformer-based Pouranic topic classification in Indian mythology
    Paul, Apurba
    Seal, Srijan
    Das, Dipankar
    SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2024, 49 (04)
  • [17] A transformer-based deep neural network model for SSVEP classification
    Chen, Jianbo
    Zhang, Yangsong
    Pan, Yudong
    Xu, Peng
    Guan, Cuntai
    NEURAL NETWORKS, 2023, 164: 521-534
  • [18] Transformer-Based Unified Neural Network for Quality Estimation and Transformer-Based Re-decoding Model for Machine Translation
    Chen, Cong
    Zong, Qinqin
    Luo, Qi
    Qiu, Bailian
    Li, Maoxi
    MACHINE TRANSLATION, CCMT 2020, 2020, 1328: 66-75
  • [19] REDAffectiveLM: leveraging affect enriched embedding and transformer-based neural language model for readers' emotion detection
    Kadan, Anoop
    Deepak, P.
    Gangan, Manjary P.
    Abraham, Sam Savitha
    Lajish, V. L.
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (12): 7495-7525
  • [20] Attention Calibration for Transformer-based Sequential Recommendation
    Zhou, Peilin
    Ye, Qichen
    Xie, Yueqi
    Gao, Jingqi
    Wang, Shoujin
    Kim, Jae Boum
    You, Chenyu
    Kim, Sunghun
    PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023, 2023: 3595-3605