On the Poor Robustness of Transformer Models in Cross-Language Humor Recognition

Cited by: 0

Authors:

Tamayo, Roberto Labadie ^{[1]}

Ortega-Bueno, Reynier ^{[1]}

Rosso, Paolo ^{[1]}

Cisneros, Mariano Rodriguez ^{[2]}

Affiliations:

[1] Universitat Politècnica de València, Valencia, Spain

[2] Harbour Space Univ, Barcelona, Spain

Source:

PROCESAMIENTO DEL LENGUAJE NATURAL | 2023 / Issue 70

Keywords:

humor recognition; humor translation; cross-language humor; multilingual models; jokes

DOI:

10.26342/2023-70-6

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification code:

081104; 0812; 0835; 1405

Abstract:

Humor is a pervasive communicative device; nevertheless, carrying it from one language to another remains challenging for machines and even for humans. In this work, we investigate the problem of humor recognition from a cross-language and cross-domain perspective, focusing on English and Spanish. To this end, we rely on two strategies: the first uses multilingual transformer models to exploit the cross-language knowledge they have distilled, and the second introduces machine translation so that models learn and make predictions in a single language. Experiments showed that models struggle with the complexity of humor once it is translated, effectively tracking a degradation in humor perception when messages flow from one language to another. However, when multilingual models face a cross-language scenario, in which the fine-tuning and evaluation data are in different languages, humor translation helps to align the knowledge learned during fine-tuning. Accordingly, a mean increase of 11% in F1 score was observed when classifying English-written texts with models fine-tuned on a Spanish dataset. These results are encouraging and constitute a first step towards a computational cross-language analysis of humor.


Pages: 73-83

Page count: 11
