The annotation guidelines of the Latin Dependency Treebank and Index Thomisticus Treebank: The treatment of some specific syntactic constructions in Latin

Cited by: 0
Authors
Bamman, David [1 ]
Passarotti, Marco [2 ]
Busa, Roberto [2 ]
Crane, Gregory [1 ]
Affiliations
[1] Tufts Univ, Perseus Project, Medford, MA 02155 USA
[2] Univ Cattolica Sacro Cuore, I-20123 Milan, Italy
Funding
U.S. National Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
The paper describes the treatment of some specific syntactic constructions in two treebanks of Latin according to a common set of annotation guidelines. Both projects work within the theoretical framework of Dependency Grammar, which has been demonstrated to be an especially appropriate framework for the representation of languages with a moderately free word order, where the linear order of constituents is broken up with elements of other constituents. The two projects are the first of their kind for Latin, so no prior established guidelines for syntactic annotation are available to rely on. The general model for the adopted style of representation is that used by the Prague Dependency Treebank, with departures arising from the Latin grammar of Pinkster, specifically in the traditional grammatical categories of the ablative absolute, the accusative + infinitive, and gerunds/gerundives. Sharing common annotation guidelines allows us to compare the datasets of the two treebanks for tasks such as mutually checking annotation consistency, diachronically studying specific syntactic constructions, and training statistical dependency parsers.
Pages: 71-76
Number of pages: 6
Related papers
9 in total
  • [1] Syntactic Annotation Guidelines for the Quranic Arabic Dependency Treebank
    Dukes, Kais
    Atwell, Eric
    Sharaf, Abdul-Baquee M.
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 1822 - 1827
  • [2] The Index Thomisticus Treebank Project: Annotation, Parsing and Valency Lexicon
    McGillivray, Barbara
    Passarotti, Marco
    Ruffolo, Paolo
    TRAITEMENT AUTOMATIQUE DES LANGUES, 2009, 50 (02): : 103 - 127
  • [3] Late Latin Charter Treebank: contents and annotation
    Korkiakangas, Timo
    CORPORA, 2021, 16 (02) : 191 - 203
  • [4] Lemmatization and morphological analysis for the Latin Dependency Treebank
    Celano, Giuseppe G. A.
    STUDI E SAGGI LINGUISTICI, 2020, 58 (01): : 21 - 38
  • [5] Improvements in Parsing the Index Thomisticus Treebank. Revision, Combination and a Feature Model for Medieval Latin
    Passarotti, Marco
    Dell'Orletta, Felice
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 1964 - 1971
  • [6] Syntactic Annotation in the I3rab Dependency Treebank
    Halabi, Dana
    Awajan, Arafat
    Fayyoumi, Ebaa
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2021, 18 (3A) : 381 - 392
  • [7] Constructions in Latvian Treebank: the Impact of Annotation Decisions on the Dependency Parsing Performance
    Pretkalnina, Lauma
    Rituma, Laura
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, BALTIC HLT 2014, 2014, 268 : 219 - 226
  • [8] Leaving Behind the Less-Resourced Status. The Case of Latin through the Experience of the Index Thomisticus Treebank
    Passarotti, Marco
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : G27 - G32
  • [9] Croatian Dependency Treebank 2.0: New Annotation Guidelines for Improved Parsing
    Agic, Zeljko
    Berovic, Dasa
    Merkler, Danijela
    Tadic, Marko
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014, : 2313 - 2319