Performance Evaluation of Pre-trained Models in Sarcasm Detection Task

Cited by: 2

Authors:

Wang, Haiyang [1]

Song, Xin [1]

Zhou, Bin [1]

Wang, Ye [1]

Gao, Liqun [1]

Jia, Yan [1]

Affiliations:

[1] National University of Defense Technology, Changsha, People's Republic of China

Source:

WEB INFORMATION SYSTEMS ENGINEERING - WISE 2021, PT II | 2021 / Vol. 13081

Keywords:

Sarcasm detection; Pre-trained models; Natural language processing

DOI:

10.1007/978-3-030-91560-5_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Sarcasm is widespread on social media platforms such as Twitter and Instagram. As a critical Natural Language Processing (NLP) task, sarcasm detection plays an important role in many areas of semantic analysis, such as stance detection and sentiment analysis. Recently, pre-trained models (PTMs) trained on large unlabelled corpora have shown excellent performance on various NLP tasks. PTMs learn universal language representations and spare researchers from training a model from scratch. The goal of our paper is to evaluate the performance of various PTMs on the sarcasm detection task. We evaluate and analyse several representative PTMs on four well-known sarcasm detection datasets. The experimental results indicate that RoBERTa outperforms the other PTMs and also beats the best baseline on three datasets. DistilBERT is the best choice for sarcasm detection when computing resources are limited, whereas XLNet may not be suitable for the task. In addition, we run a detailed grid search over four hyperparameters to investigate their impact on PTMs; the results show that the learning rate is the most important hyperparameter. Finally, we conduct an error analysis on several sarcastic sentences to explore the reasons for detection failures, which provides instructive ideas for future research.
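
The grid search mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract does not name the four hyperparameters or their ranges, so the names and values below are assumptions, and `evaluate` is a hypothetical placeholder standing in for fine-tuning a PTM and returning a validation score.

```python
from itertools import product

# Hypothetical search space: the abstract does not list the four
# hyperparameters or their ranges, so these names and values are assumptions.
GRID = {
    "learning_rate": [1e-5, 2e-5, 5e-5],
    "batch_size": [16, 32],
    "epochs": [2, 3],
    "max_seq_length": [64, 128],
}


def evaluate(config):
    """Placeholder scorer (NOT a real experiment).

    In practice this would fine-tune a pre-trained model with `config`
    and return a validation metric such as F1. Here it is a toy function
    that peaks at learning_rate == 2e-5 so the example runs end to end.
    """
    return -abs(config["learning_rate"] - 2e-5)


def grid_search(grid, score_fn):
    """Try every combination in `grid`; return (best_config, best_score)."""
    names = list(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[n] for n in names)):
        config = dict(zip(names, values))
        score = score_fn(config)
        if score > best_score:  # strict '>' keeps the first of any ties
            best_config, best_score = config, score
    return best_config, best_score


best_config, best_score = grid_search(GRID, evaluate)
print(best_config, best_score)
```

Exhaustive search grows multiplicatively with each added hyperparameter (here 3 x 2 x 2 x 2 = 24 runs), which is why the number of values per hyperparameter is usually kept small when each run means fine-tuning a model.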


Pages: 67-75

Page count: 9

