Improve Word Mover's Distance with Part-of-Speech Tagging

Authors
Chen, Xiaojun [1]
Bai, Li [2]
Wang, Dakui [1]
Shi, Jinqiao [1]
Affiliations
[1] Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Inst Informat Engn, Sch Cyber Secur, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Word Mover's Distance (WMD) is a document distance metric that is free of hyperparameters, has an intelligible interpretation, and achieves unprecedented accuracy on document classification. WMD is built on word embeddings and focuses largely on semantic rather than syntactic relationships, which limits how well it can measure document distance. To strengthen the role of syntactic information, we propose a new method, WMD with Part-of-Speech (PWMD), that integrates part-of-speech (POS) information into the original WMD model. POS is a kind of syntactic information and provides additional valuable features when combined with WMD in a document distance metric. PWMD offers two strategies for combining POS tags with WMD: "word level" and "document level". Comparative experiments show that PWMD yields better document distances than WMD.
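As a rough illustration of the idea in the abstract, the sketch below computes a relaxed variant of WMD (a lower bound in which each word moves all of its weight to its cheapest counterpart, instead of solving the full transport problem) and adds a penalty when two words carry different POS tags. The toy embeddings, the `POS` dictionary, the `pos_penalty` parameter, and the additive form of the penalty are all assumptions made for illustration. The abstract does not specify how PWMD combines POS with WMD, so this should not be read as the authors' method.

```python
import math

# Toy 2-D "embeddings" and POS tags; all values are made up for illustration.
EMBEDDINGS = {
    "obama": (1.0, 0.0),
    "president": (0.9, 0.1),
    "speaks": (0.0, 1.0),
    "greets": (0.1, 0.9),
}
POS = {"obama": "NOUN", "president": "NOUN", "speaks": "VERB", "greets": "VERB"}


def nbow(tokens):
    """Normalized bag-of-words: word -> relative frequency in the document."""
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def word_cost(w1, w2, pos_penalty=0.0):
    """Euclidean embedding distance, plus a hypothetical penalty
    (an assumption, not taken from the paper) when POS tags differ."""
    cost = math.dist(EMBEDDINGS[w1], EMBEDDINGS[w2])
    if POS[w1] != POS[w2]:
        cost += pos_penalty
    return cost


def relaxed_wmd(doc_a, doc_b, pos_penalty=0.0):
    """Relaxed WMD: each word sends all of its mass to its cheapest match
    in the other document; the larger of the two directions is returned."""
    a, b = nbow(doc_a), nbow(doc_b)
    a_to_b = sum(m * min(word_cost(w, v, pos_penalty) for v in b)
                 for w, m in a.items())
    b_to_a = sum(m * min(word_cost(v, w, pos_penalty) for w in a)
                 for v, m in b.items())
    return max(a_to_b, b_to_a)
```

With `pos_penalty=0.0` the function reduces to the plain relaxed distance; a positive value makes matches across different POS tags more expensive, which is one simple way to let syntactic information influence a "word level" cost.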
Pages: 3722-3728
Number of pages: 7