Using electronic texts for an annotated corpus building

Cited by: 9
Authors
Galicia-Haro, SN [1]
Affiliations
[1] Inst Politecn Nacl, Computat Res Ctr, Nat Language & Text Proc Lab, Mexico City 07738, DF, Mexico
DOI
10.1109/ENC.2003.1232870
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Nowadays, collections of texts with annotations on several levels are useful resources. They are employed for diverse tasks in theoretical research and natural language applications. The most important collections are dedicated to English. However, huge efforts are required to develop the corresponding resource for other languages. In this work, we present the initial steps for the compilation of an annotated Mexican corpus using electronic texts obtained from the WEB.
Pages: 26-32
Page count: 7
Related papers
50 in total
  • [41] The Electronic Historical Latvian Dictionary Based on the Corpus of Early Written Latvian Texts
    Andronova, Everita
    Silina-Pinke, Renate
    Trumpa, Anta
    Vanags, Peteris
    ACTA BALTICO-SLAVICA, 2016, 40 : 1 - 37
  • [42] Developing a cardiovascular disease risk factor annotated corpus of Chinese electronic medical records
    Su, Jia
    He, Bin
    Guan, Yi
    Jiang, Jingchi
    Yang, Jinfeng
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2017, 17
  • [43] Corpus of Hermetic texts
    Ferrari, Franco
    ATHENAEUM-STUDI PERIODICI DI LETTERATURA E STORIA DELL ANTICHITA, 2008, 96 (01) : 409 - 411
  • [44] Electronic texts are computations are electronic texts
    Hrachovec, H
    JOURNAL OF PHILOSOPHY OF EDUCATION, 2000, 34 (01) : 169 - 181
  • [45] Building a Bio-Event Annotated Corpus for the Acquisition of Semantic Frames from Biomedical Corpora
    Thompson, Paul
    Cotter, Philip
    Ananiadou, Sophia
    McNaught, John
    Montemagni, Simonetta
    Trabucco, Andrea
    Venturi, Giulia
    SIXTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, LREC 2008, 2008, : 2159 - 2166
  • [46] LDA Topic Modeling for pramana Texts: A Case Study in Sanskrit NLP Corpus Building
    Neill, Tyler
    PROCEEDINGS OF THE 6TH INTERNATIONAL SANSKRIT COMPUTATIONAL LINGUISTICS SYMPOSIUM (ISCLS 2019), 2019, : 53 - 68
  • [47] Building an English-Chinese Parallel Corpus Annotated with Sub-sentential Translation Techniques
    Zhai, Yuming
    Liu, Lufei
    Zhong, Xinyi
    Illouz, Gabriel
    Vilnat, Anne
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 4024 - 4033
  • [48] Building and Evaluating an Annotated Corpus for Automated Recognition of Chat-Based Social Engineering Attacks
    Tsinganos, Nikolaos
    Mavridis, Ioannis
    APPLIED SCIENCES-BASEL, 2021, 11 (22):
  • [49] An Analysis of Sindhi Annotated Corpus using Supervised Machine Learning Methods
    Ali, Mazhar
    Wagan, Asim Imdad
    MEHRAN UNIVERSITY RESEARCH JOURNAL OF ENGINEERING AND TECHNOLOGY, 2019, 38 (01) : 185 - 196
  • [50] Building a Sentiment Corpus using a Gamified Framework
    Tiam-Lee, Thomas James
    See, Solomon
    2014 INTERNATIONAL CONFERENCE ON HUMANOID, NANOTECHNOLOGY, INFORMATION TECHNOLOGY, COMMUNICATION AND CONTROL, ENVIRONMENT AND MANAGEMENT (HNICEM), 2014,