Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data

Cited by: 0
Authors
Lin, Zi [1, 2, 3]
Duan, Yuguang [3]
Zhao, Yuanyuan [1, 2, 5]
Sun, Weiwei [1, 2, 4]
Wan, Xiaojun [1, 2]
Affiliations
[1] Peking Univ, Inst Comp Sci & Technol, Beijing, Peoples R China
[2] Peking Univ, MOE Key Lab Computat Linguist, Beijing, Peoples R China
[3] Peking Univ, Dept Chinese Language & Literature, Beijing, Peoples R China
[4] Peking Univ, Ctr Chinese Linguist, Beijing, Peoples R China
[5] Peking Univ, Acad Adv Interdisciplinary Studies, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper studies semantic parsing for interlanguage (L2), taking semantic role labeling (SRL) as a case task and learner Chinese as a case language. We first manually annotate the semantic roles for a set of learner texts to derive a gold standard for automatic SRL. Based on the new data, we then evaluate three off-the-shelf SRL systems, i.e., the PCFGLA-parser-based, neural-parser-based and neural-syntax-agnostic systems, to gauge how successful SRL for learner Chinese can be. We find two non-obvious facts: 1) the L1-sentence-trained systems perform rather badly on the L2 data; 2) the performance drop from the L1 data to the L2 data of the two parser-based systems is much smaller, indicating the importance of syntactic parsing in SRL for interlanguages. Finally, the paper introduces a new agreement-based model to explore the semantic coherency information in the large-scale L2-L1 parallel data. We then show that such information is very effective in enhancing SRL for learner texts. Our model achieves an F-score of 72.06, which is a 2.02 point improvement over the best baseline.
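As an illustration of how an SRL F-score such as the one reported above is typically computed, the following is a minimal sketch. It assumes exact-match scoring over labeled arguments; this record does not describe the paper's actual evaluation script, and the `Arg` tuple layout, the function name `srl_f_score`, and the toy data are all hypothetical.

```python
from typing import Set, Tuple

# Hypothetical argument representation (not taken from the paper):
# (predicate index, argument start token, argument end token, role label)
Arg = Tuple[int, int, int, str]


def srl_f_score(gold: Set[Arg], pred: Set[Arg]) -> float:
    """Return the F-score (0-100) of predicted arguments against gold arguments.

    An argument counts as correct only if its predicate, span, and role
    label all match a gold argument exactly (an assumed scoring convention).
    """
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 100 * 2 * precision * recall / (precision + recall)


# Toy example: one of two predicted arguments matches one of three gold arguments.
gold = {(1, 0, 0, "A0"), (1, 2, 3, "A1"), (1, 4, 4, "AM-TMP")}
pred = {(1, 0, 0, "A0"), (1, 2, 3, "A2")}
score = srl_f_score(gold, pred)
```

In the toy example, precision is 1/2 and recall is 1/3, so the score is their harmonic mean scaled to the 0-100 range.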
Pages: 3793-3802
Number of pages: 10