Using electronic texts for an annotated corpus building

Cited by: 9
Authors
Galicia-Haro, SN [1]
Affiliations
[1] Inst Politecn Nacl, Computat Res Ctr, Nat Language & Text Proc Lab, Mexico City 07738, DF, Mexico
DOI
10.1109/ENC.2003.1232870
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Nowadays, collections of texts with annotations on several levels are useful resources. They are employed for diverse tasks in theoretical research and natural language applications. The most important collections are dedicated to English. However, huge efforts are required to develop the corresponding resources for other languages. In this work, we present the initial steps for the compilation of an annotated Mexican corpus using electronic texts obtained from the Web.
Pages: 26-32
Number of pages: 7