Error-Annotated Corpus of Latvian

Cited by: 7
Authors
Deksne, Daiga [1]
Skadina, Inguna [1]
Affiliations
[1] Tilde SIA, Riga, Latvia
Keywords
Error classification; corpus annotation; error annotated corpus; grammar checking; Latvian language
DOI
10.3233/978-1-61499-442-8-163
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper reports on the development of the annotated Latvian language error corpus designed for grammar checker development and evaluation. We describe the error classification system introduced for this purpose, the annotation process, and guidelines. Two corpora (the corpus of student papers and the balanced text corpus) consisting of a total of 20,877 sentences have been created and annotated. A general characterisation of the corpora and a summary of the annotation results are presented.
Pages: 163-166
Number of pages: 4