A Coreference Corpus and Resolution System for Dutch

Cited by: 0
Authors
Hendrickx, Iris [1]
Bouma, Gosse [2]
Coppens, Frederik [3]
Daelemans, Walter [1]
Hoste, Veronique [1]
Kloosterman, Geert [2]
Mineur, Anne-Marie [2]
Van der Vloet, Joeri [3]
Verschelde, Jean-Luc [3]
Affiliations
[1] Univ Antwerp, CNTS, Prinsstr 13, B-2000 Antwerp, Belgium
[2] Univ Groningen, Informat Sci, Groningen, Netherlands
[3] Language & Comp NV, B-9051 Sint Denijs Westrem, Belgium
Keywords
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Subject classification codes
030303 ; 0501 ; 050102 ;
Abstract
We present the main outcomes of the COREA project: a corpus annotated with coreferential relations and a coreference resolution system for Dutch. We discuss the annotation of the corpus: the types of annotated relations, the guidelines, the annotation tool, and inter-annotator agreement. We also show a visualization of the annotated relations. The standard approach to evaluating a coreference resolution system is to compare the predictions of the system to a hand-annotated gold standard test set (cross-validation). A more practically oriented evaluation is to test the usefulness of coreference relation information in an NLP application. We present results of both types of evaluation. We run experiments with an Information Extraction module for the medical domain, and measure the performance of this module with and without coreference relation information. In a separate experiment, we also evaluate the effect of coreference information produced by a simple rule-based coreference module in a Question Answering application.
Pages: 144 - 149
Number of pages: 6
Related papers
  • [1] Corpus for Coreference Resolution on Scientific Papers
    Chaimongkol, Panot
    Aizawa, Akiko
    Tateisi, Yuka
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 3187 - 3190
  • [2] Coreference Resolution and Meaning Representation in a Legislative Corpus
    Pothong, Surawat
    Facundes, Nuttanart
    16TH INTERNATIONAL JOINT SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING (ISAI-NLP 2021), 2021,
  • [3] WikiCREM: A Large Unsupervised Corpus for Coreference Resolution
    Kocijan, Vid
    Camburu, Oana-Maria
    Cretu, Ana-Maria
    Yordanov, Yordan
    Blunsom, Phil
    Lukasiewicz, Thomas
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 4303 - 4312
  • [4] Semantic and syntactic features for Dutch coreference resolution
    Hendrickx, Iris
    Hoste, Veronique
    Daelemans, Walter
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, 2008, 4919 : 351 - +
  • [5] Constructing a cross-document event coreference corpus for Dutch
    De Langhe, Loic
    De Clercq, Orphee
    Hoste, Veronique
    LANGUAGE RESOURCES AND EVALUATION, 2023, 57 (02) : 819 - 848
  • [6] Constructing a cross-document event coreference corpus for Dutch
    Loic De Langhe
    Orphée De Clercq
    Veronique Hoste
    Language Resources and Evaluation, 2023, 57 : 819 - 848
  • [7] MuDoCo: Corpus for Multidomain Coreference Resolution and Referring Expression Generation
    Martin, Scott
    Poddar, Shivani
    Upasani, Kartikeya
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 104 - 111
  • [8] The Extended DIRNDL Corpus as a Resource for Automatic Coreference and Bridging Resolution
    Bjoerkelund, Anders
    Eckart, Kerstin
    Riester, Arndt
    Schauffler, Nadja
    Schweitzer, Katrin
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 3222 - 3228
  • [9] Neural Coreference Resolution for Dutch Parliamentary Documents with the DutchParliament Dataset
    van Heusden, Ruben
    Kamps, Jaap
    Marx, Maarten
    DATA, 2023, 8 (02)
  • [10] Transforming Dutch: Debiasing Dutch Coreference Resolution Systems for Non-binary Pronouns
    van Boven, Goya
    Du, Yupei
    Nguyen, Dong
    PROCEEDINGS OF THE 2024 ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, ACM FACCT 2024, 2024, : 2470 - 2483