Topic recommendation using Doc2Vec

Authors
Karvelis, Petros [1]
Gavrilis, Dimitris [2]
Georgoulas, George [3]
Stylios, Chrysostomos [1]
Affiliations
[1] Technol Educ Inst Epirus, Lab Knowledge & Intelligent Comp, Dept Comp Engn, Arta, Greece
[2] Univ Patras, Dept Elect Engn & Comp Technol, Patras, Greece
[3] Lulea Univ Technol, Control Engn Grp, Dept Comp Sci Elect & Space Engn, Lulea, Sweden
Keywords
Recommender system; multilabel classification; word2vec; doc2vec; bag of words
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ever-increasing amount of electronic content stored in digital libraries requires significant cataloguing effort and has led to self-deposit solutions in which authors submit and publish their own digital records. Even in self-deposit, going through the abstract and assigning subject terms or keywords is a time-consuming and expensive process, yet it is crucial for the metadata quality of the record, which in turn affects retrieval. Therefore, an automatic, or even semi-automatic, process that can recommend topics for a new entry is of great practical value. A system that addresses this problem relies on two components: one that efficiently represents the relevant information of the new document, and one that recommends an appropriate set of topics based on that representation. In this work, different candidate solutions for both components are investigated and compared. For the first stage, both distributed Document to Vector (doc2vec) and conventional Bag of Words (BoW) representations are employed, while for the second stage two different transformation approaches from the field of multi-label classification are compared. For the comparison, a collection of Ph.D. abstracts (approximately 19,000 documents) from the MIT Libraries DSpace repository is used; the results suggest that different combinations can provide high-quality solutions.
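The two-stage structure described in the abstract can be illustrated with a minimal Python sketch using only the standard library. It is not the authors' implementation. The abstract does not name the two transformation approaches that were compared, so binary relevance and label powerset are used here only as two commonly cited examples of multi-label transformations. The toy abstracts, topic labels, and function names are all hypothetical.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Stage 1 (BoW variant): term counts for `text` over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def binary_relevance(label_sets, all_labels):
    """Stage 2, option A (assumed): one independent 0/1 target vector per label."""
    return {
        label: [int(label in labels) for labels in label_sets]
        for label in all_labels
    }


def label_powerset(label_sets):
    """Stage 2, option B (assumed): each distinct label combination becomes one class."""
    class_ids = {}
    targets = []
    for labels in label_sets:
        key = tuple(sorted(labels))
        # Assign the next free class id the first time a combination is seen.
        targets.append(class_ids.setdefault(key, len(class_ids)))
    return targets, class_ids


# Hypothetical toy data standing in for thesis abstracts and their subject terms.
abstracts = [
    "neural networks for image recognition",
    "control of robotic systems",
    "neural control of robotic arms",
]
topics = [{"machine learning"}, {"robotics"}, {"machine learning", "robotics"}]

vocabulary = sorted({word for a in abstracts for word in a.lower().split()})
features = [bag_of_words(a, vocabulary) for a in abstracts]

all_labels = sorted(set().union(*topics))
br_targets = binary_relevance(topics, all_labels)
lp_targets, lp_classes = label_powerset(topics)
```

Either set of targets could then be paired with the feature vectors and passed to any single-label classifier. A doc2vec representation would replace `bag_of_words` with a learned dense vector per document.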
Pages: 6
Related papers
50 in total
  • [21] Deep Learning Based Classification Using Academic Studies in Doc2Vec Model
    Safali, Yasar
    Nergiz, Gozde
    Avaroglu, Erdinc
    Dogan, Emre
    2019 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND DATA PROCESSING (IDAP 2019), 2019,
  • [22] Research on detection methods based on Doc2vec abnormal comments
    Chang, Wenbing
    Xu, Zhenzhong
    Zhou, Shenghan
    Cao, Wen
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2018, 86 : 656 - 662
  • [23] Chinese Text Keyword Extraction Based on Doc2vec And TextRank
    Wang, Wei
    Li, Xiangshun
    Yu, Sheng
    PROCEEDINGS OF THE 32ND 2020 CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2020), 2020, : 369 - 373
  • [24] Compressed Firmware Classification Based on Extra Trees and Doc2Vec
    Qiu, Jing
    Geng, Xiaoxu
    Sun, Guanglu
    SCIENTIFIC PROGRAMMING, 2021, 2021
  • [25] Sentiment Analysis on Chinese Hotel Reviews with Doc2Vec and Classifiers
    Shuai, Qianjun
    Huang, Yamei
    Jin, Libiao
    Pang, Long
    PROCEEDINGS OF 2018 IEEE 3RD ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC 2018), 2018, : 1171 - 1174
  • [26] Automated Scoring of Interview Videos using Doc2Vec Multimodal Feature Extraction Paradigm
    Chen, Lei
    Feng, Gary
    Leong, Chee Wee
    Lehman, Blair
    Martin-Raugh, Michelle
    Kell, Harrison
    Lee, Chong Min
    Yoon, Su-Youn
    ICMI'16: PROCEEDINGS OF THE 18TH ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2016, : 161 - 168
  • [27] Long-term Performance of a Generic Intrusion Detection Method Using Doc2vec
    Mimura, Mamoru
    Tanaka, Hidema
    2017 FIFTH INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR), 2017, : 456 - 462
  • [28] Automated Functional Dependency Detection Between Test Cases Using Doc2Vec and Clustering
    Tahvili, Sahar
    Hatvani, Leo
    Felderer, Michael
    Afzal, Wasif
    Bohlin, Markus
    2019 IEEE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE TESTING (AITEST), 2019, : 19 - 26
  • [29] Retrieval of Semantically Similar Philippine Supreme Court Case Decisions using Doc2Vec
    Barco Ranera, Lorenz Timothy
    Solano, Geoffrey A.
    Oco, Nathaniel
    2019 INTERNATIONAL SYMPOSIUM ON MULTIMEDIA AND COMMUNICATION TECHNOLOGY (ISMAC), 2019,
  • [30] Determining the Similarity of Chinese Patents Using Doc2Vec
    张海超
    赵良伟
    情报工程, 2018, 4 (02) : 64 - 72