Named Entity Recognition;
Back Translation;
Multilingual Machine Translation;
Quality Estimation;
DOI:
10.1007/978-981-97-5672-8_26
Chinese Library Classification (CLC) number:
TP18 [Artificial Intelligence Theory];
Discipline classification codes:
081104;
0812;
0835;
1405;
Abstract:
Prior research in word-level machine translation quality estimation (QE) has made significant strides in detecting superfluous and omitted translations. Nevertheless, these approaches rely heavily on extensive reference data and struggle to differentiate between superfluous translations, missing translations, and mistranslations, resulting in lower detection probabilities. To address this limitation, we propose an Automatic Reference-Free Fine-Grained Neural Machine Translation Error Detection method (ARFGED) that leverages Named Entity Recognition (NER) and back-translation. An NER tool is used to obtain initial error-type probabilities related to entity translation. Back-translation inference is then applied to the multilingual machine translation model to obtain fine-grained error types, achieving automatic, reference-free translation error detection. Subsequently, the combination of these two sets of error types is used to train a classifier that distinguishes more clearly between superfluous translations, omissions, and incorrect translations. Experimental results on the original dataset and our synthetic dataset demonstrate that the proposed method achieves significant improvements in F1 scores compared to supervised and contrastive conditioning methods.
Guerreiro, Nuno M.
Affiliations:
Unbabel Lisbon, Lisbon, Portugal
Inst Telecomunicacoes, Lisbon, Portugal
Univ Paris Saclay, MICS, Cent Supelec, Paris, France
Univ Lisbon, Inst Super Tecn, Lisbon, Portugal
Rei, Ricardo
Affiliations:
Unbabel Lisbon, Lisbon, Portugal
INESC ID, Lisbon, Portugal
Univ Lisbon, Inst Super Tecn, Lisbon, Portugal
van Stigt, Daan
Affiliations:
Unbabel Lisbon, Lisbon, Portugal
Coheur, Luisa
Affiliations:
INESC ID, Lisbon, Portugal
Univ Lisbon, Inst Super Tecn, Lisbon, Portugal
Colombo, Pierre
Affiliations:
Univ Paris Saclay, MICS, Cent Supelec, Paris, France
Martins, Andre F. T.
Affiliations:
Unbabel Lisbon, Lisbon, Portugal
Inst Telecomunicacoes, Lisbon, Portugal
Univ Lisbon, Inst Super Tecn, Lisbon, Portugal
Affiliations:
Zhengzhou Business Univ, Foreign Languages Sch, Gongyi 450012, Henan, Peoples R China