Building The Sense-Tagged Multilingual Parallel Corpus

Authors
Wang, Shan [1]
Bond, Francis [1]
Affiliations
[1] Nanyang Technological University, Division of Linguistics and Multilingual Studies, Singapore
Keywords
sense-tagging; multilingual corpus; parallel corpus
DOI
Not available
Chinese Library Classification (CLC) number
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
Sense-annotated parallel corpora play a crucial role in natural language processing. This paper reports our progress in creating such a corpus for three Asian languages (Chinese, Japanese and Indonesian), using English as a pivot; it is the first such corpus for these languages. Two sets of tools have been developed, one for sequential tagging and one for targeted tagging, and both are easy to set up for new languages. The paper also briefly presents the general guidelines for the project. The current results of the monolingual sense-tagging and multilingual linking are illustrated; they show differences among genres and language pairs. All the tools, guidelines and the manually annotated corpus will be freely available at http://compling.ntu.edu.sg/ntumc.
Pages: 2403-2409
Number of pages: 7