Domain Adaptation for Arabic Machine Translation: Financial Texts as a Case Study

Cited by: 0
Authors
Alghamdi, Emad A. [1, 2]
Zakraoui, Jezia [2]
Abanmy, Fares A. [2]
Affiliations
[1] King Abdulaziz Univ, Ctr Excellence AI & Data Sci, Jeddah 21589, Saudi Arabia
[2] ASAS AI Lab, Riyadh 13518, Saudi Arabia
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 16
Keywords
machine translation; Arabic MT; domain adaptation; financial domain
DOI
10.3390/app14167088
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Neural machine translation (NMT) has shown impressive performance when trained on large-scale corpora. However, generic NMT systems perform poorly on out-of-domain translation. To mitigate this issue, several domain adaptation methods have recently been proposed, which often lead to better translation quality than generic NMT systems. While there has been steady progress in NMT for English and other European languages, domain adaptation in Arabic has received little attention in the literature. The current study, therefore, aims to explore the effectiveness of domain-specific adaptation for Arabic MT (AMT) in an as-yet unexplored domain: financial news articles. To this end, we developed a parallel corpus for Arabic-English (AR-EN) translation in the financial domain to benchmark different domain adaptation methods. We then fine-tuned several pre-trained NMT models and large language models, including ChatGPT-3.5 Turbo, on our dataset. The results showed that fine-tuning pre-trained NMT models on a few well-aligned in-domain AR-EN segments led to noticeable improvement. The quality of ChatGPT translation was superior to that of the other models based on automatic and human evaluations. To the best of our knowledge, this is the first work on fine-tuning ChatGPT towards financial domain transfer learning. To contribute to research in domain translation, we made our datasets and fine-tuned models available.
Pages: 15