Improve Word Mover's Distance with Part-of-Speech Tagging

Cited by: 0
Authors
Chen, Xiaojun [1]
Bai, Li [2]
Wang, Dakui [1]
Shi, Jinqiao [1]
Affiliations
[1] Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Inst Informat Engn, Sch Cyber Secur, Beijing, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Word Mover's Distance (WMD) is a document distance metric that is free of hyperparameters, has an intelligible interpretation, and achieves unprecedented accuracy on document classification. WMD is built on word embeddings and focuses largely on semantic rather than syntactic relationships, which limits how well it measures document distance. To strengthen the role of syntactic information, we propose a new method, WMD with Part-of-Speech (PWMD), which integrates part-of-speech (POS) information into the original WMD model. POS is a kind of syntactic information that supplies additional useful features when combined with WMD in a document distance metric. PWMD provides two strategies for combining POS tags: "word level" and "document level". Comparative experiments show that PWMD yields better document distances than WMD.
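As a rough illustration of the embedding-based distance that the abstract builds on, the sketch below computes the relaxed WMD lower bound, in which each word moves all of its mass to the nearest word in the other document. This is a simplification of full WMD, which solves an optimal transport problem. The toy two-dimensional embeddings and the names `EMB`, `nbow`, and `relaxed_wmd` are assumptions made for this example. The abstract does not give PWMD's formulation, so the POS combination is not modeled here.

```python
import math
from collections import Counter

# Toy 2-d word embeddings, invented for illustration only.
EMB = {
    "obama": (1.0, 0.0),
    "president": (0.9, 0.1),
    "speaks": (0.0, 1.0),
    "greets": (0.1, 0.9),
}


def nbow(tokens):
    """Normalized bag-of-words: map each word to its relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def relaxed_wmd(doc_a, doc_b, emb=EMB):
    """Relaxed WMD: a lower bound on WMD, not the exact transport cost."""

    def one_sided(src, dst):
        # Each source word sends its whole mass to its nearest target word.
        targets = set(dst)
        return sum(
            mass * min(math.dist(emb[word], emb[t]) for t in targets)
            for word, mass in nbow(src).items()
        )

    # Take the tighter of the two one-sided relaxations.
    return max(one_sided(doc_a, doc_b), one_sided(doc_b, doc_a))


if __name__ == "__main__":
    print(relaxed_wmd(["obama", "speaks"], ["president", "greets"]))
```

Under these assumptions, two documents with no words in common still receive a small distance when their words lie close together in embedding space, which is the semantic behaviour the abstract attributes to WMD.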
Pages: 3722-3728
Page count: 7