Domain Adaptation for Arabic Machine Translation: Financial Texts as a Case Study

Authors
Alghamdi, Emad A. [1, 2]
Zakraoui, Jezia [2 ]
Abanmy, Fares A. [2 ]
Affiliations
[1] King Abdulaziz Univ, Ctr Excellence AI & Data Sci, Jeddah 21589, Saudi Arabia
[2] ASAS AI Lab, Riyadh 13518, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 16
Keywords
machine translation; Arabic MT; domain adaptation; financial domain
DOI
10.3390/app14167088
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Neural machine translation (NMT) has shown impressive performance when trained on large-scale corpora. However, generic NMT systems have demonstrated poor performance on out-of-domain translation. To mitigate this issue, several domain adaptation methods have recently been proposed, which often lead to better translation quality than generic NMT systems. While there has been continuous progress in NMT for English and other European languages, domain adaptation in Arabic has received little attention in the literature. The current study, therefore, aims to explore the effectiveness of domain-specific adaptation for Arabic MT (AMT) in a yet unexplored domain: financial news articles. To this end, we developed a parallel corpus for Arabic-English (AR-EN) translation in the financial domain to benchmark different domain adaptation methods. We then fine-tuned several pre-trained NMT and large language models, including ChatGPT-3.5 Turbo, on our dataset. The results showed that fine-tuning pre-trained NMT models on a few well-aligned in-domain AR-EN segments led to noticeable improvement. The quality of ChatGPT translation was superior to that of other models based on automatic and human evaluations. To the best of our knowledge, this is the first work on fine-tuning ChatGPT towards financial domain transfer learning. To contribute to research in domain translation, we made our datasets and fine-tuned models available.
Pages: 15
Related papers
50 in total
  • [41] DaLC: Domain Adaptation Learning Curve Prediction for Neural Machine Translation
    Park, Cheonbok
    Kim, Hantae
    Calapodescu, Ioan
    Cho, Hyunchang
    Nikoulina, Vassilina
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1789 - 1807
  • [42] Improving Document-Level Neural Machine Translation with Domain Adaptation
    Ul Haq, Sami
    Rauf, Sadaf Abdul
    Shoukat, Arslan
    Noor-e-Hira
    NEURAL GENERATION AND TRANSLATION, 2020, : 225 - 231
  • [43] Non-Parametric Unsupervised Domain Adaptation for Neural Machine Translation
    Zheng, Xin
    Zhang, Zhirui
    Huang, Shujian
    Chen, Boxing
    Xie, Jun
    Luo, Weihua
    Chen, Jiajun
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 4234 - 4241
  • [44] Overcoming Catastrophic Forgetting During Domain Adaptation of Neural Machine Translation
    Thompson, Brian
    Gwinnup, Jeremy
    Khayrallah, Huda
    Duh, Kevin
    Koehn, Philipp
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 2062 - 2068
  • [45] Evaluating Arabic to English Machine Translation
    Hadla, Laith S.
    Hailat, Taghreed M.
    Al-Kabi, Mohammed N.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (11) : 68 - 73
  • [46] Machine translation for Arabic dialects (survey)
    Harrat, Salima
    Meftouh, Karima
    Smaili, Kamel
    INFORMATION PROCESSING & MANAGEMENT, 2019, 56 (02) : 262 - 273
  • [47] Machine translation between Hebrew and Arabic
    Shilon, Reshef
    Habash, Nizar
    Lavie, Alon
    Wintner, Shuly
    MACHINE TRANSLATION, 2012, 26 (1-2) : 177 - 195
  • [48] Challenges in Machine Translation into Arabic Language
    Khan, Lubna Farah
    IJAZ ARABI JOURNAL OF ARABIC LEARNING, 2020, 3 (02):
  • [49] An English-Arabic Bi-directional Machine Translation Tool in the Agriculture Domain
    Shaalan, Khaled
    Hendam, Ashraf
    Rafea, Ahmed
    INTELLIGENT INFORMATION PROCESSING V, 2010, 340 : 281 - +
  • [50] A machine translation system from Arabic sign language to Arabic
    Luqman, Hamzah
    Mahmoud, Sabri A.
    UNIVERSAL ACCESS IN THE INFORMATION SOCIETY, 2020, 19 (04) : 891 - 904