The Task of Post-Editing Machine Translation for the Low-Resource Language

Cited by: 4
Authors
Rakhimova, Diana [1, 2]
Karibayeva, Aidana [1, 2]
Turarbek, Assem [1]
Affiliations
[1] Al Farabi Kazakh Natl Univ, Dept Informat Syst, Alma Ata 050040, Kazakhstan
[2] Inst Informat & Comp Technol, Alma Ata 050010, Kazakhstan
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 02
Keywords
machine translation; post-editing machine translation; light post-editing; full post-editing; BRNN; transformer; English; Kazakh; Uzbek; Russian; HANDLING UNKNOWN WORDS; PRODUCT
DOI
10.3390/app14020486
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
In recent years, machine translation has made significant advancements; however, its effectiveness can vary widely depending on the language pair. Languages with limited resources, such as Kazakh, Uzbek, Kalmyk, Tatar, and others, often encounter challenges in achieving high-quality machine translations. Kazakh is an agglutinative language with complex morphology and is considered a low-resource language. This article addresses the task of post-editing machine translation for the Kazakh language. The research begins by discussing the history and evolution of machine translation and how it has developed to meet the unique needs of languages with limited resources. The research resulted in the development of a machine translation post-editing system. The system utilizes modern machine learning methods, starting with neural machine translation using the BRNN model in the initial post-editing stage. Subsequently, the transformer model is applied to further edit the text. Complex structural and grammatical forms are processed, and abbreviations are replaced. Practical experiments were conducted on various texts: news publications, legislative documents, IT-related texts, and others. This article serves as a valuable resource for researchers and practitioners in the field of machine translation, shedding light on effective post-editing strategies to enhance translation quality, particularly in scenarios involving languages with limited resources such as Kazakh and Uzbek. The obtained results were tested and evaluated using specialized metrics: BLEU, TER, and WER.
Pages: 19