A study of damp-heat syndrome classification using Word2vec and TF-IDF

Authors
Zhu, Wei [1]
Zhang, Wei [1]
Li, Guo-Zheng [1]
He, Chong [1]
Zhang, Lei [2]
Affiliations
[1] Tongji Univ, Dept Control Sci & Engn, Shanghai 201804, Peoples R China
[2] China Acad Chinese Med Sci, Inst Basic Res Clin Med, Beijing 100700, Peoples R China
Keywords
Clinical record analysis; Word2vec; TF-IDF; TCM; Damp-heat syndrome classification
DOI
Not available
Chinese Library Classification
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
As public concern about health grows, assessing a person's health from medical records is becoming an emerging need. Most previous disease-analysis studies were conducted on structured datasets, which usually ignore the relationships between symptoms and are expensive to obtain. In this paper, we propose a novel model based on Word2vec and Term Frequency-Inverse Document Frequency (TF-IDF) that detects damp-heat syndrome directly from unstructured records. First, we use the ICTCLAS system, combined with a corpus collected in the field of Traditional Chinese Medicine (TCM), to segment the clinical records into words. Second, we train word vectors with the Word2vec tool. Then, we construct a record representation vector from the word vectors and TF-IDF; we call this representation method Word2vec+TF-IDF. To verify its effectiveness, we compare it with other text representation methods under four different classifiers. The experiments were conducted on a dataset collected from more than 10 Chinese medicine hospitals. The results show that our model performs better than state-of-the-art methods such as LSA and Doc2vec.
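The abstract does not give the exact formula used to combine word vectors with TF-IDF. One common reading is a TF-IDF-weighted average of the word vectors in each record, and the sketch below illustrates that reading only. Everything in it is an assumption made for illustration: the function names `tfidf_weights` and `record_vector`, the smoothed IDF formula, the toy two-dimensional word vectors (standing in for trained Word2vec output), and the English placeholder tokens (standing in for ICTCLAS-segmented Chinese words).

```python
import math
from collections import Counter


def tfidf_weights(records):
    """Return one {word: tf-idf weight} dict per segmented record.

    Uses a smoothed IDF, log((1 + N) / (1 + df)) + 1, so every weight
    is strictly positive. This smoothing is an illustrative choice.
    """
    n_records = len(records)
    # Document frequency: number of records in which each word appears.
    doc_freq = Counter(word for rec in records for word in set(rec))
    all_weights = []
    for rec in records:
        counts = Counter(rec)
        weights = {}
        for word, count in counts.items():
            tf = count / len(rec)
            idf = math.log((1 + n_records) / (1 + doc_freq[word])) + 1.0
            weights[word] = tf * idf
        all_weights.append(weights)
    return all_weights


def record_vector(weights, word_vectors, dim):
    """TF-IDF-weighted average of the word vectors of one record.

    Words without a vector are skipped; a record with no known words
    maps to the zero vector.
    """
    vec = [0.0] * dim
    total_weight = 0.0
    for word, weight in weights.items():
        if word not in word_vectors:
            continue
        for i, component in enumerate(word_vectors[word]):
            vec[i] += weight * component
        total_weight += weight
    if total_weight == 0.0:
        return vec
    return [v / total_weight for v in vec]


# Hypothetical, already-segmented records (placeholders for ICTCLAS output).
records = [
    ["thirst", "yellow_coating", "fatigue"],
    ["fatigue", "cough"],
    ["yellow_coating", "thirst", "thirst"],
]

# Hypothetical 2-d word vectors standing in for trained Word2vec embeddings.
word_vectors = {
    "thirst": [1.0, 0.0],
    "yellow_coating": [0.0, 1.0],
    "fatigue": [1.0, 1.0],
    "cough": [-1.0, 0.0],
}

vectors = [record_vector(w, word_vectors, 2) for w in tfidf_weights(records)]
```

Each entry of `vectors` is a fixed-length representation of one record and could be passed to any standard classifier, which matches the comparison setup the abstract describes. In practice the toy dictionary would be replaced by vectors trained on the segmented clinical corpus.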
Pages: 1415-1420
Number of pages: 6