Late Latin Charter Treebank: contents and annotation

Author
Korkiakangas, Timo [1]
Affiliation
[1] Univ Helsinki, POB A215,Unioninkatu 40, Helsinki 00014, Finland
Keywords
charter; Early Middle Ages; Italy; Latin; philology; treebank
DOI
10.3366/cor.2021.0217
Chinese Library Classification
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
This paper describes the construction and annotation of the Late Latin Charter Treebank, a set of three dependency treebanks (LLCT1, LLCT2 and LLCT3) which together contain 1,261 Early Medieval Latin documentary texts (i.e., original charters) written in Italy between AD 714 and 1000 (about 594,000 tokens). The paper focusses on matters which a linguistically or philologically inclined user of LLCT needs to know: the criteria on which the charters were selected, the special characteristics of the annotation types utilised, and the geographical and chronological distribution of the data. In addition to normal queries on forms, lemmas, morphology and syntax, complex philological research settings are enabled by the textual annotation layer of LLCT, which indicates abbreviated and damaged words, as well as the formulaic and non-formulaic passages of each charter.
Pages: 191-203
Page count: 13