An Annotated Corpus of Direct Speech

Cited by: 0
Authors
Lee, John [1 ]
Yeung, Chak Yan [1 ]
Affiliations
[1] City Univ Hong Kong, Halliday Ctr Intelligent Applicat Language Studie, Dept Linguist & Translat, Hong Kong, Peoples R China
Keywords
direct speech; coreference; corpus annotation;
DOI
Not available
Chinese Library Classification (CLC) number
H [Language, Writing];
Discipline classification code
05 ;
Abstract
We propose a scheme for annotating direct speech in literary texts, based on the Text Encoding Initiative (TEI) and the coreference annotation guidelines from the Message Understanding Conference (MUC). The scheme encodes the speakers and listeners of utterances in a text, as well as the quotative verbs that report the utterances. We measure inter-annotator agreement on this annotation task. We then present statistics on a manually annotated corpus that consists of books from the New Testament. Finally, we visualize the corpus as a conversational network.
Pages: 1059-1063
Page count: 5