Improve Word Mover's Distance with Part-of-Speech Tagging

Cited by: 0
Authors
Chen, Xiaojun [1]
Bai, Li [2]
Wang, Dakui [1]
Shi, Jinqiao [1]
Affiliations
[1] Inst Informat Engn, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Inst Informat Engn, Sch Cyber Secur, Beijing, Peoples R China
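Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract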
Word Mover's Distance (WMD) is a hyperparameter-free, interpretable document distance metric with unprecedented accuracy on document classification. WMD builds on word embeddings and focuses largely on semantic rather than syntactic relationships, which limits how well it measures document distance. To strengthen the role of syntactic information, we propose WMD with Part-of-Speech (PWMD), a method that integrates part-of-speech (POS) information into the original WMD model. POS is a kind of syntactic information, and combining it with WMD provides additional features for the document distance metric. PWMD offers two strategies for incorporating POS tags: "word level" and "document level". Comparative experiments show that PWMD yields better document distances than WMD.
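To make the idea of adding POS information to a word-movement distance concrete, here is a minimal sketch. It is not the PWMD formulation from the paper, whose equations are not given in this record. It swaps in the relaxed (nearest-neighbour) lower bound of WMD so that it runs with the standard library alone, and it assumes one possible "word level" combination: a fixed additive penalty when two words carry different POS tags. The embeddings, tags, the `pos_penalty` parameter, and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy 2-D embeddings and POS tags (hypothetical values, for illustration only).
EMB = {
    "obama": (1.0, 0.0), "president": (0.9, 0.1),
    "speaks": (0.0, 1.0), "greets": (0.1, 0.9),
    "media": (0.5, 0.5), "press": (0.55, 0.45),
}
POS = {
    "obama": "NOUN", "president": "NOUN", "media": "NOUN", "press": "NOUN",
    "speaks": "VERB", "greets": "VERB",
}

def nbow(tokens):
    """Normalized bag-of-words: each word's share of the document's mass."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}

def word_cost(a, b, pos_penalty=0.0):
    """Euclidean embedding distance, plus an assumed penalty on POS mismatch."""
    base = math.dist(EMB[a], EMB[b])
    return base + (pos_penalty if POS[a] != POS[b] else 0.0)

def one_sided(src, dst, pos_penalty):
    """Each source word sends all of its mass to its cheapest target word."""
    return sum(
        weight * min(word_cost(a, b, pos_penalty) for b in dst)
        for a, weight in src.items()
    )

def relaxed_wmd(doc_a, doc_b, pos_penalty=0.0):
    """Relaxed lower bound on WMD: the larger of the two one-sided costs."""
    a, b = nbow(doc_a), nbow(doc_b)
    return max(one_sided(a, b, pos_penalty), one_sided(b, a, pos_penalty))

if __name__ == "__main__":
    d1 = ["obama", "speaks", "media"]
    d2 = ["president", "greets", "press"]
    print(relaxed_wmd(d1, d2))                   # embedding distance only
    print(relaxed_wmd(d1, d2, pos_penalty=1.0))  # with the assumed POS penalty
```

With `pos_penalty=0.0` the function reduces to the plain relaxed bound. A positive penalty makes it more expensive to move a word onto a word with a different tag, which is one simple way syntactic information could influence the distance.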
Pages: 3722-3728
Number of pages: 7