Annotating an Arabic Learner Corpus for Error

Authors
Abuhakema, Ghazi [1]
Faraj, Reem [1]
Feldman, Anna [1]
Fitzpatrick, Eileen [1]
Affiliations
[1] Montclair State Univ, Montclair, NJ 07043 USA
Keywords
DOI
Not available
Chinese Library Classification
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
This paper describes an ongoing project in which we are collecting a learner corpus of Arabic, developing a tagset for error annotation and performing Computer-aided Error Analysis (CEA) on the data. We adapted the French Interlanguage Database FRIDA tagset (Granger, 2003a) to the data. We chose FRIDA in order to follow a known standard and to see whether the changes needed to move from a French to an Arabic tagset would give us a measure of the distance between the two languages with respect to learner difficulty. The current collection of texts, which is constantly growing, contains intermediate and advanced-level student writings. We describe the need for such corpora, the learner data we have collected and the tagset we have developed. We also describe the error frequency distribution of both proficiency levels and the ongoing work.
Pages: 1347-1350
Number of pages: 4