Zuo Zhuan Ancient Chinese Dataset for Word Sense Disambiguation

Cited by: 0
Authors
Pan, Xiaomeng [1 ]
Wang, Hongfei [1 ]
Oka, Teruaki [1 ]
Komachi, Mamoru [1 ]
Affiliations
[1] Tokyo Metropolitan Univ, Tokyo, Japan
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Word Sense Disambiguation (WSD) is a core task in Natural Language Processing (NLP). Ancient Chinese has rarely been used in WSD tasks, however, because no public dataset for ancient Chinese WSD exists. Creating such a dataset is considered a significant challenge: determining the most appropriate sense in a context is difficult and time-consuming owing to the differences in usage between ancient and modern Chinese. To address this problem, we annotate part of the Pre-Qin (before 221 BC) text Zuo Zhuan using a copyright-free dictionary to create a public sense-tagged dataset. We then apply a simple k-Nearest Neighbors (k-NN) method using a pre-trained language model to the dataset. Our code and dataset will be available on GitHub.
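As a rough illustration of the k-NN idea mentioned in the abstract, the sketch below labels a target word with the majority sense among its most similar sense-tagged examples. It is not the authors' implementation: the function names `cosine` and `knn_sense`, the sense labels, and the toy three-dimensional vectors (standing in for contextual embeddings from a pre-trained language model) are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_sense(query_vec, labeled, k=3):
    """Return the majority sense among the k labeled vectors closest to query_vec.

    labeled is a list of (vector, sense_id) pairs; both the data layout and
    the sense ids are hypothetical, not taken from the dataset.
    """
    ranked = sorted(labeled, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    votes = Counter(sense for _, sense in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy vectors standing in for embeddings of a target word in tagged sentences.
train = [
    ([0.9, 0.1, 0.0], "sense_A"),
    ([0.8, 0.2, 0.1], "sense_A"),
    ([0.1, 0.9, 0.2], "sense_B"),
    ([0.0, 0.8, 0.3], "sense_B"),
]

# Embedding of the same word in a new, untagged sentence (also a toy value).
predicted = knn_sense([0.85, 0.15, 0.05], train, k=3)
print(predicted)
```

In a realistic setting the vectors would presumably be the language model's hidden states for the target character in context, and k would be tuned on held-out data; neither detail is specified in this record.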
Pages: 129-135
Page count: 7