Insight into Multiple References in an MT Evaluation Metric

Cited by: 1

Authors:

Qin, Ying [1]

Specia, Lucia [2]

Affiliations:

[1] Beijing Foreign Studies Univ, Beijing 100089, Peoples R China

[2] Univ Sheffield, Sheffield S10 2TN, S Yorkshire, England

Source:

CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA (CCL 2015) | 2015 / Vol. 9427

Keywords:

Machine translation evaluation; METEOR metric; Multiple references;

DOI:

10.1007/978-3-319-25816-4_11

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Current evaluation metrics in machine translation (MT) make poor use of multiple reference translations. In this paper we focus on the METEOR metric to gain in-depth insights into how multiple references can best be exploited. Results on five score selection strategies reveal that it is not always wise to choose the best reference (the one closest to the MT output) to generate the candidate score. We also propose two weighting approaches that take into account the information recurring among references. The modified METEOR scores significantly increase the correlation with human judgments of accuracy and fluency at the system level.
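The abstract does not list the five score selection strategies, so the following is only an illustrative sketch of what "selecting a score across multiple references" can mean. It assumes three generic strategies (max, min, mean) and uses a toy unigram F1 as a stand-in for METEOR; the function names `unigram_f1` and `score_with_references` are hypothetical and are not taken from the paper or from any METEOR implementation.

```python
from collections import Counter
from statistics import mean


def unigram_f1(candidate: str, reference: str) -> float:
    """Toy segment-level score: unigram F1. A stand-in, NOT the real METEOR."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Clipped unigram overlap between candidate and reference.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def score_with_references(candidate: str, references: list[str],
                          strategy: str = "max") -> float:
    """Score the candidate against each reference, then pick one value.

    The strategies here (max, min, mean) are assumptions for illustration;
    the paper's own five strategies are not given in the abstract.
    """
    if not references:
        raise ValueError("at least one reference is required")
    scores = [unigram_f1(candidate, ref) for ref in references]
    if strategy == "max":   # "best" reference, i.e. closest to the MT output
        return max(scores)
    if strategy == "min":   # most distant reference
        return min(scores)
    if strategy == "mean":  # average over all references
        return mean(scores)
    raise ValueError(f"unknown strategy: {strategy}")
```

For example, `score_with_references("the cat sat", ["the cat sat", "a dog ran"], "mean")` averages a perfect match and a zero-overlap reference, whereas the "max" strategy would report only the perfect match.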

Pages: 131-140

Page count: 10
