Building The Sense-Tagged Multilingual Parallel Corpus

Cited by: 0
Authors
Wang, Shan [1]
Bond, Francis [1]
Affiliations
[1] Nanyang Technol Univ, Div Linguist & Multilingual Studies, Singapore, Singapore
Source
LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION | 2014
Keywords
sense-tagging; multilingual corpus; parallel corpus
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
Sense-annotated parallel corpora play a crucial role in natural language processing. This paper introduces our progress in creating such a corpus for Asian languages (Chinese, Japanese and Indonesian) using English as a pivot; it is the first such corpus for these languages. Two sets of tools have been developed for sequential and targeted tagging, and both are easy to set up for any new language. This paper also briefly presents the general guidelines for the project. The current results of the monolingual sense-tagging and multilingual linking are illustrated; they indicate differences among genres and language pairs. All the tools, guidelines and the manually annotated corpus will be freely available at http://compling.ntu.edu.sg/ntumc.
Pages: 2403-2409
Page count: 7