Annotation of specialized corpora using a comprehensive entity and relation scheme

Cited by: 0
Authors
Deleger, Louise [1]
Ligozat, Anne-Laure [1,2]
Grouin, Cyril [1]
Zweigenbaum, Pierre [1]
Neveol, Aurelie [1]
Affiliations
[1] CNRS, UPR 3251, LIMSI, F-91403 Orsay, France
[2] ENSIIE, F-91000 Evry, France
Keywords
Annotation; Clinical Texts; Natural Language Processing; INFORMATION; CORPUS
DOI
Not available
Chinese Library Classification
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
Annotated corpora are essential resources for many applications in Natural Language Processing. They provide insight into the linguistic and semantic characteristics of the genre and domain covered, and can be used for the training and evaluation of automatic tools. In the biomedical domain, annotated corpora of English texts have become available for several genres and subfields. However, very few similar resources are available for languages other than English. In this paper, we present an effort to produce a high-quality corpus of clinical documents in French, annotated with a comprehensive scheme of entities and relations. We present the annotation scheme as well as the results of a pilot annotation study covering 35 clinical documents in a variety of subfields and genres. We show that high inter-annotator agreement can be achieved using a complex annotation scheme.
Pages: 1267-1274
Page count: 8
Related papers
50 in total
  • [1] SpatialML: Annotation Scheme, Corpora, and Tools
    Mani, Inderjeet
    Hitzeman, Janet
    Richer, Justin
    Harris, Dave
    Quimby, Rob
    Wellner, Ben
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 410 - 415
  • [2] Unsupervised Relation Extraction in Specialized Corpora Using Sequence Mining
    Gabor, Kata
    Zargayouna, Haifa
    Tellier, Isabelle
    Buscaldi, Davide
    Charnois, Thierry
    ADVANCES IN INTELLIGENT DATA ANALYSIS XV, 2016, 9897 : 237 - 248
  • [3] Annotation of Financial Entities Using A Comprehensive Scheme in Turkish
    Adali, Kubra
    Tantug, A. Cuneyd
    2022 30TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU, 2022,
  • [4] Annotation Scheme for Named Entity Recognition and Relation Extraction Tasks in the Domain of People with Dementia
    Suravee, Sumaiya
    Stoev, Teodor
    Schindler, David
    Hochgraeber, Iris
    Pinkert, Christiane
    Holle, Bernhard
    Halek, Margareta
    Krueger, Frank
    Yordanova, Kristina
    2022 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATIONS WORKSHOPS AND OTHER AFFILIATED EVENTS (PERCOM WORKSHOPS), 2022,
  • [5] Towards an annotation scheme for complex laughter in speech corpora
    Truong, Khiet P.
    Trouvain, Juergen
    Jansen, Michel-Pierre
    INTERSPEECH 2019, 2019, : 529 - 533
  • [6] Semi-structured Document Annotation Using Entity and Relation Types
    Kundu, Arpita
    Ghosh, Subhasish
    Bhattacharya, Indrajit
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III, 2021, 12977 : 52 - 68
  • [7] Issues underlying a common Sign Language Corpora annotation scheme
    Balvet, Antonio
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : A15 - A18
  • [8] A novel entity joint annotation relation extraction model
    Meng Xu
    Dechang Pi
    Jianjun Cao
    Shuilian Yuan
    Applied Intelligence, 2022, 52 : 12754 - 12770
  • [9] Typed Entity and Relation Annotation on Computer Science Papers
    Tateisi, Yuka
    Ohta, Tomoko
    Miyao, Yusuke
    Pyysalo, Sampo
    Aizawa, Akiko
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 3836 - 3843
  • [10] A novel entity joint annotation relation extraction model
    Xu, Meng
    Pi, Dechang
    Cao, Jianjun
    Yuan, Shuilian
    APPLIED INTELLIGENCE, 2022, 52 (11) : 12754 - 12770