Multilingual Controllable Transformer-Based Lexical Simplification

Cited by: 0
Authors
Sheang, Kim Cheng [1]
Saggion, Horacio [1]
Affiliations
[1] Univ Pompeu Fabra, LaSTUS Grp, TALN Lab, DTIC, Barcelona, Spain
Source
Keywords
Multilingual Lexical Simplification; Controllable Lexical Simplification; Text Simplification; Multilinguality
DOI
10.26342/2023-71-9
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Text is by far the most ubiquitous source of knowledge and information and should be made easily accessible to as many people as possible; however, texts often contain complex words that hinder reading comprehension and accessibility. Suggesting simpler alternatives for complex words without compromising meaning would therefore help convey the information to a broader audience. This paper proposes mTLS, a multilingual controllable Transformer-based Lexical Simplification (LS) system fine-tuned from the T5 model. The novelty of this work lies in the use of language-specific prefixes, control tokens, and candidates extracted from pretrained masked language models to learn simpler alternatives for complex words. Evaluation results on three well-known LS datasets (LexMTurk, BenchLS, and NNSEval) show that our model outperforms previous state-of-the-art models such as LSBert and ConLS. Moreover, further evaluation on part of the recent TSAR-2022 multilingual LS shared-task dataset shows that our model performs competitively against the participating systems for English LS and even outperforms the GPT-3 model on several metrics. Our model also obtains performance gains for Spanish and Portuguese.
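The candidate-extraction step mentioned in the abstract — masking the complex word, letting a pretrained masked language model propose fillers, then filtering and ranking them — can be sketched as follows. This is an illustrative sketch, not the authors' exact pipeline: the `rank_candidates` helper and the hard-coded scores stand in for real fill-mask model probabilities, so no model is loaded here.

```python
# Sketch of masked-LM candidate extraction for lexical simplification:
# the complex word is replaced by a mask token, a masked language model
# scores possible fillers, and we keep the most probable well-formed
# alternatives. Scores below are hypothetical stand-ins for model output.

def rank_candidates(complex_word, scored_fillers, top_k=3):
    """Filter fill-mask candidates and keep the top-k alternatives.

    scored_fillers: list of (candidate, probability) pairs, as a
    fill-mask model would return at the complex word's position.
    """
    kept = []
    for cand, prob in scored_fillers:
        cand = cand.strip().lower()
        # Drop the complex word itself and subword/non-alphabetic fragments.
        if cand == complex_word.lower() or not cand.isalpha():
            continue
        kept.append((cand, prob))
    # Highest-probability candidates first.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [cand for cand, _ in kept[:top_k]]

# Hypothetical fill-mask output for:
# "The court will [MASK] the case."  (complex word: "adjudicate")
fillers = [("adjudicate", 0.35), ("decide", 0.30), ("hear", 0.15),
           ("##ing", 0.10), ("settle", 0.05)]
print(rank_candidates("adjudicate", fillers))  # → ['decide', 'hear', 'settle']
```

In the paper's setting, the surviving candidates are then appended to the model input alongside language prefixes and control tokens, letting the fine-tuned T5 model choose among them in context.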
Pages: 109-123
Page count: 15
Related papers
50 records in total
  • [1] Simplification of Arabic text: A hybrid approach integrating machine translation and transformer-based lexical model
    Al-Thanyyan, Suha S.
    Azmi, Aqil M.
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2023, 35 (08)
  • [2] Multilingual Transformer-Based Personality Traits Estimation
    Leonardi, Simone
    Monti, Diego
    Rizzo, Giuseppe
    Morisio, Maurizio
    INFORMATION, 2020, 11 (04)
  • [3] Practical Transformer-based Multilingual Text Classification
    Wang, Cindy
    Banko, Michele
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, NAACL-HLT 2021, 2021, : 121 - 129
  • [4] Transformer-based approach for symptom recognition and multilingual linking
    Vassileva, Sylvia
    Grazhdanski, Georgi
    Koychev, Ivan
    Boytcheva, Svetla
    DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2024, 2024
  • [5] Assessing the Syntactic Capabilities of Transformer-based Multilingual Language Models
    Perez-Mayos, Laura
    Taboas Garcia, Alba
    Mille, Simon
    Wanner, Leo
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3799 - 3812
  • [6] Controllable Text Simplification with Lexical Constraint Loss
    Nishihara, Daiki
    Kajiwara, Tomoyuki
    Arase, Yuki
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019): STUDENT RESEARCH WORKSHOP, 2019, : 260 - 266
  • [7] ALSI-Transformer: Transformer-Based Code Comment Generation With Aligned Lexical and Syntactic Information
    Park, Youngmi
    Park, Ahjeong
    Kim, Chulyun
    IEEE ACCESS, 2023, 11 : 39037 - 39047
  • [8] A Controllable Distributed Energy Resource Transformer-Based Grounding Scheme for Microgrids
    Li, Dingrui
    Ma, Yiwei
    Su, Yu
    Zhang, Chengwen
    Zhu, Lin
    Yin, He
    Wang, Fred
    Tolbert, Leon M.
    IEEE OPEN ACCESS JOURNAL OF POWER AND ENERGY, 2024, 11 : 165 - 177
  • [9] A Lexical-aware Non-autoregressive Transformer-based ASR Model
    Lin, Chong-En
    Chen, Kuan-Yu
    INTERSPEECH 2023, 2023, : 1434 - 1438
  • [10] Transforming Term Extraction: Transformer-Based Approaches to Multilingual Term Extraction Across Domains
    Lang, Christian
    Wachowiak, Lennart
    Heinisch, Barbara
    Gromann, Dagmar
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3607 - 3620