Topic recommendation using Doc2Vec

Cited by: 0
Authors
Karvelis, Petros [1]
Gavrilis, Dimitris [2]
Georgoulas, George [3]
Stylios, Chrysostomos [1]
Affiliations
[1] Technol Educ Inst Epirus, Lab Knowledge & Intelligent Comp, Dept Comp Engn, Arta, Greece
[2] Univ Patras, Dept Elect Engn & Comp Technol, Patras, Greece
[3] Lulea Univ Technol, Control Engn Grp, Dept Comp Sci Elect & Space Engn, Lulea, Sweden
Keywords
Recommender system; multilabel classification; word2vec; doc2vec; bag of words
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ever-increasing amount of electronic content stored in digital libraries requires significant cataloguing effort and has led to self-deposit solutions in which authors submit and publish their own digital records. Even in self-deposit, going through the abstract and assigning subject terms or keywords is a time-consuming and expensive process, yet it is crucial for the metadata quality of the record, which affects retrieval. Therefore, an automatic, or even semi-automatic, process that can recommend topics for a new entry is of great practical value. A system that addresses this need relies on two components: one for efficiently representing the relevant information of the new document, and one for recommending an appropriate set of topics based on that representation. In this work, different candidate solutions for both components are investigated and compared. For the first component, both distributed Document to Vector (doc2vec) and conventional Bag of Words (BoW) representations are employed, while for the second, two different transformation approaches from the field of multi-label classification are compared. The comparison uses a collection of Ph.D. abstracts (approximately 19,000 documents) from the MIT Libraries DSpace repository and suggests that different combinations can provide high-quality solutions.
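The two-component design described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the abstract does not name the two transformation approaches, so binary relevance (one binary classifier per topic) is used as one well-known problem-transformation method; a plain bag-of-words count vector stands in for the representation stage (a doc2vec embedding would slot into the same place); a nearest-centroid rule with cosine similarity stands in for the per-topic classifier; and the toy documents, topic names, and function names (`bow`, `fit_binary_relevance`, `recommend`) are hypothetical.

```python
from collections import Counter
import math


def bow(text):
    """Component 1 (assumed form): bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summed(vectors):
    """Sum of count vectors; cosine ignores scale, so no division is needed."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def fit_binary_relevance(docs, label_sets):
    """Component 2 (assumed form): one positive/negative centroid pair per topic."""
    vectors = [bow(d) for d in docs]
    model = {}
    for label in sorted(set().union(*label_sets)):
        pos = [v for v, ls in zip(vectors, label_sets) if label in ls]
        neg = [v for v, ls in zip(vectors, label_sets) if label not in ls]
        model[label] = (summed(pos), summed(neg))
    return model


def recommend(model, text):
    """Recommend every topic whose positive centroid is closer than its negative one."""
    v = bow(text)
    return [label for label, (pos, neg) in model.items()
            if cosine(v, pos) > cosine(v, neg)]


# Hypothetical toy corpus; the paper uses Ph.D. abstracts from DSpace instead.
docs = [
    "neural network training for image recognition",
    "deep neural network for speech recognition",
    "fluid flow simulation in turbulent pipes",
    "neural network model of turbulent fluid flow",
]
label_sets = [
    {"machine learning"},
    {"machine learning"},
    {"fluid dynamics"},
    {"machine learning", "fluid dynamics"},
]

model = fit_binary_relevance(docs, label_sets)
print(recommend(model, "neural network model of turbulent flow"))
```

Because each topic is decided independently, a new abstract can receive zero, one, or several topics, which is the property that makes a multi-label formulation a natural fit for subject-term recommendation.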
Pages: 6