A benchmark dataset and evaluation methodology for Chinese zero pronoun translation

Cited by: 1

Authors:

Xu, Mingzhou ^{[1]}

Wang, Longyue ^{[2]}

Liu, Siyou ^{[3]}

Wong, Derek F. ^{[1]}

Shi, Shuming ^{[2]}

Tu, Zhaopeng ^{[2]}

Affiliations:

[1] Univ Macau, Dept Comp & Informat Sci, Taipa, Macao, Peoples R China

[2] Tencent, AI Lab, Shenzhen, Peoples R China

[3] Macao Polytech Inst, Sch Languages & Translat, Taipa, Macao, Peoples R China

Source:

LANGUAGE RESOURCES AND EVALUATION | 2023 / Vol. 57 / Issue 03

Keywords:

Zero pronoun; Machine translation; Benchmark dataset; Evaluation metric; Discourse

DOI:

10.1007/s10579-023-09660-5

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

Zero pronouns (ZPs) have attracted increasing interest in the machine translation (MT) community because they are both important and difficult to translate. However, previous studies generally evaluate ZP translation quality with BLEU scores on MT test sets, a measure that is not expressive or sensitive enough for accurate assessment. To bridge the data and evaluation gaps, we propose a benchmark test set and an evaluation metric for targeted evaluation of Chinese ZP translation. The human-annotated test set covers five challenging genres, which reveal different characteristics of ZPs for comprehensive evaluation. We systematically revisit advanced models on ZP translation and identify current challenges for future exploration. We release data, code, and trained models, which we hope will promote research in this field.

Pages: 1263-1293

Page count: 31
