The Chinese Discourse TreeBank: a Chinese corpus annotated with discourse relations

Authors
Yuping Zhou
Nianwen Xue
Affiliation
[1] Brandeis University
Source
Language Resources and Evaluation, 2015, 49 (02)
Keywords
Discourse TreeBank; Discourse relations; Chinese; Explicit and implicit discourse connectives
DOI
Not available
Abstract
The paper presents the Chinese Discourse TreeBank, a corpus annotated with Penn Discourse TreeBank-style discourse relations, each of which takes the form of a predicate with two arguments. We first characterize the syntactic and statistical distributions of Chinese discourse connectives, as well as the role of Chinese punctuation marks in discourse annotation, and then describe how we design our annotation strategy and procedure based on this characterization. The Chinese-specific features of our annotation strategy include annotating explicit and implicit discourse relations in a single pass, defining the argument labels on semantic rather than syntactic grounds, and annotating the semantic type of implicit discourse relations directly. We also introduce a flat, 11-valued semantic type classification scheme for discourse relations. Finally, we demonstrate the feasibility of our approach with evaluation results.
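As a rough illustration of the "predicate taking two arguments" view described in the abstract, a relation could be modeled as a record holding two argument spans, an optional connective, and a semantic type. This is only a sketch under stated assumptions: the class name, field names, example sentences, and the placeholder type labels `TYPE_A` and `TYPE_B` are hypothetical and are not taken from the corpus or from the paper's 11-valued scheme.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class DiscourseRelation:
    """Hypothetical record for one PDTB-style relation (not the corpus format)."""
    arg1: str                  # text span of the first argument
    arg2: str                  # text span of the second argument
    connective: Optional[str]  # overt connective, or None for an implicit relation
    semantic_type: str         # placeholder label; the real scheme has 11 values

    @property
    def is_explicit(self) -> bool:
        # A relation is explicit when an overt connective signals it.
        return self.connective is not None


def count_by_kind(relations: List[DiscourseRelation]) -> Dict[str, int]:
    """Count explicit and implicit relations in a list."""
    counts = {"explicit": 0, "implicit": 0}
    for relation in relations:
        key = "explicit" if relation.is_explicit else "implicit"
        counts[key] += 1
    return counts


# Invented examples, glossed in English for readability.
relations = [
    DiscourseRelation("it rained", "the match was cancelled", "因为", "TYPE_A"),
    DiscourseRelation("prices rose", "demand fell", None, "TYPE_B"),
]
print(count_by_kind(relations))
```

Keeping explicit and implicit relations in one record type, distinguished only by whether the connective is present, mirrors the idea of annotating both kinds in a single pass, although the actual annotation format may differ.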
Pages: 397-431
Page count: 34
Related papers
  • [1] The Chinese Discourse TreeBank: a Chinese corpus annotated with discourse relations
    Zhou, Yuping
    Xue, Nianwen
    LANGUAGE RESOURCES AND EVALUATION, 2015, 49 (02) : 397 - 431
  • [2] The CUHK Discourse TreeBank for Chinese: Annotating Explicit Discourse Connectives for the Chinese TreeBank
    Zhou, Lanjun
    Li, Binyang
    Wei, Zhongyu
    Wong, Kam-Fai
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 942 - 949
  • [3] Probability Distribution of Discourse Relations Based on a Chinese RST-annotated Corpus
    Yue, Ming
    Liu, Haitao
    JOURNAL OF QUANTITATIVE LINGUISTICS, 2011, 18 (02) : 107 - 121
  • [4] Building a Macro Chinese Discourse Treebank
    Chu, Xiaomin
    Jiang, Feng
    Xu, Sheng
    Zhu, Qiaoming
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 1920 - 1924
  • [5] Function Multiword Expressions Annotated with Discourse Relations in the Romanian Reference Treebank
    Mititelu, Verginica Barbu
    Voicu, Tudor
    PROCEEDINGS OF THE SIXTH INTERNATIONAL CONFERENCE COMPUTATIONAL LINGUISTICS IN BULGARIA, CLIB 2024, 2024, : 90 - 97
  • [6] Persian Discourse Treebank and coreference corpus
    Mirzaei, Azadeh
    Safari, Pegah
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 4049 - 4055
  • [7] Towards a corpus of student texts annotated in discourse relations
    Bras, Myriam
    Vieu, Laure
    Joret, Maelle
    Pepin-Boutin, Audrey
    Poujade, Clamenca
    Roze, Charlotte
    LANGUE FRANCAISE, 2021, (211): : 115 - 129
  • [8] Explicit and Implicit Discourse Relations in the Prague Discourse Treebank
    Zikanova, Sarka
    Mirovsky, Jiri
    Synkova, Pavlina
    TEXT, SPEECH, AND DIALOGUE (TSD 2019), 2019, 11697 : 236 - 248
  • [9] Corpus for the Legal Information Processing System (CLIPS): A Chinese legal corpus annotated with discourse information
    Wang, Hong
    Ge, Yunfeng
    2017 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2017, : 18 - 22
  • [10] A Study of Recognizing Implicit Discourse Relations in the Penn Discourse Treebank
    Liu, Chu
    Chen, Jin-xiu
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND SOFTWARE ENGINEERING (AISE 2014), 2014, : 582 - 587