Late Latin Charter Treebank: contents and annotation

Cited by: 2
Author
Korkiakangas, Timo [1]
Affiliation
[1] Univ Helsinki, POB A215, Unioninkatu 40, Helsinki 00014, Finland
Keywords
charter; Early Middle Ages; Italy; Latin; philology; treebank
DOI
10.3366/cor.2021.0217
Chinese Library Classification number
H0 [Linguistics]
Subject classification codes
030303; 0501; 050102
Abstract
This paper describes the construction and annotation of the Late Latin Charter Treebank, a set of three dependency treebanks (LLCT1, LLCT2 and LLCT3) which together contain 1,261 Early Medieval Latin documentary texts (i.e., original charters) written in Italy between AD 714 and 1000 (about 594,000 tokens). The paper focusses on matters which a linguistically or philologically inclined user of LLCT needs to know: the criteria on which the charters were selected, the special characteristics of the annotation types utilised, and the geographical and chronological distribution of the data. In addition to normal queries on forms, lemmas, morphology and syntax, complex philological research settings are enabled by the textual annotation layer of LLCT, which indicates abbreviated and damaged words, as well as the formulaic and non-formulaic passages of each charter.
Pages: 191-203
Number of pages: 13