Turkish Document Classification Based on Word2Vec and SVM Classifier

Cited by: 0
Author
Sahin, Gurkan [1 ]
Affiliation
[1] Yildiz Tekn Univ, Bilgisayar Muhendisligi Bolumu, Istanbul, Turkey
Keywords
document categorization; SVM; word2vec;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this study, Turkish texts belonging to different categories were classified using word2vec word vectors. First, vectors were extracted for the words in all the texts; each text was then represented by the mean of the vectors of the words it contains. The texts were classified with an SVM, and an F-measure of 0.92 was obtained across seven categories. The experiments showed that word2vec is more successful than tf-idf-based classification for Turkish document classification.
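The mean-vector representation described in the abstract can be sketched as follows. This is a minimal illustration, not the author's implementation: the `TOY_VECTORS` table, its 3-dimensional values, the `mean_vector` helper, and the choice to skip out-of-vocabulary words are all assumptions made for the example. In practice the vectors would come from a trained word2vec model, and the resulting document vectors would be passed to an SVM classifier.

```python
from typing import Dict, List, Sequence

# Hypothetical 3-dimensional word vectors, invented for illustration only.
# A real setup would look these up in a trained word2vec model.
TOY_VECTORS: Dict[str, List[float]] = {
    "futbol": [1.0, 0.0, 0.0],
    "maç": [0.5, 0.5, 0.0],
    "ekonomi": [0.0, 1.0, 0.0],
    "borsa": [0.0, 0.5, 0.5],
}


def mean_vector(
    tokens: Sequence[str],
    vectors: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Represent a document as the element-wise mean of its word vectors.

    Words missing from the vocabulary are skipped (an assumption of this
    sketch); a document with no known words maps to the zero vector.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


# "bilinmeyen" is not in the toy vocabulary and is ignored.
sports_doc = ["futbol", "maç", "bilinmeyen"]
sports_vec = mean_vector(sports_doc, TOY_VECTORS, dim=3)
print(sports_vec)
```

Each document is reduced to one fixed-length vector regardless of its length, which is what allows a standard classifier such as an SVM to be trained on the result.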
Pages: 4
Related papers
50 in total
  • [21] Multi-Label Chinese Question Classification Based on Word2vec
    Fan, Zhengyu
    Su, Lei
    Liu, Xi
    Wang, Shuaiyang
    2017 4TH INTERNATIONAL CONFERENCE ON SYSTEMS AND INFORMATICS (ICSAI), 2017, : 546 - 550
  • [22] Malware Classification Based on Multilayer Perception and Word2Vec for IoT Security
    Qiao, Yanchen
    Zhang, Weizhe
    Du, Xiaojiang
    Guizani, Mohsen
    ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2022, 22 (01)
  • [23] Word Semantic Similarity Calculation Based on Word2vec
    Jin, Xiaolin
    Zhang, Shuwu
    Liu, Jie
    2018 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND INFORMATION SCIENCES (ICCAIS), 2018, : 12 - 16
  • [24] Chinese Sentiment Classification Using Extended Word2Vec
    张胜
    张鑫
    程佳军
    王晖
    Journal of Donghua University(English Edition), 2016, 33 (05) : 823 - 826
  • [25] Word Clustering based on Word2vec and Semantic Similarity
    Luo Jie
    Wang Qinglin
    Li Yuan
    2014 33RD CHINESE CONTROL CONFERENCE (CCC), 2014, : 517 - 521
  • [26] Study on Tibetan Word Vector based on Word2vec
    Yang, Ning
    Li, Guanyu
    Ding, Hailan
    Gong, Chunwei
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187
  • [27] An Word2vec based on Chinese Medical Knowledge
    Zhu, Jiayi
    Ni, Pin
    Li, Yuming
    Peng, Junkun
    Dai, Zhenjin
    Le, Gangmin
    Bai, Xuming
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 6263 - 6265
  • [28] WEIGHTED WORD2VEC BASED ON THE DISTANCE OF WORDS
    Chang, Chia-Yang
    Lee, Shie-Jue
    Lai, Chih-Chin
    PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS (ICMLC), VOL 2, 2017, : 563 - 568
  • [29] ECG analysis based on Word2Vec model
    Oliinyk, Yurii
    Tereschenko, Andrii
    Baklan, Igor
    Beraudo, Elisa
    IDDM 2021: INFORMATICS & DATA-DRIVEN MEDICINE: PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON INFORMATICS & DATA-DRIVEN MEDICINE (IDDM 2021), 2021, 3038 : 213 - 222
  • [30] Feature Extension for Chinese Short Text Classification Based on LDA and Word2vec
    Sun, Fanke
    Chen, Heping
    PROCEEDINGS OF THE 2018 13TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA 2018), 2018, : 1189 - 1194