Design and Implementation of Word2Vec Parallel Algorithm Based on HPC

Cited by: 0
Authors
Yi, Xianyong [1]
Zheng, Rongge [1]
Wang, Aoyu [1]
Qin, Hao [1]
Chen, Yufeng [1]
Affiliation
[1] Shandong University, School of Mechanical, Electrical & Information Engineering, Weihai, People's Republic of China
Keywords
HPC; Word2Vec; Parallel Algorithm; Natural Language Processing
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Word2Vec (Word to Vector) processes natural language by mapping words to vectors and measuring their relatedness with cosine similarity. However, with the explosive growth of information, the serial algorithm of the original Word2Vec can no longer meet the demands of training on corpus text, and its comparatively low processing efficiency has become a bottleneck. High Performance Computing (HPC) specializes in improving computational efficiency, so parallelizing the Word2Vec algorithm can greatly improve the efficiency of training on corpus texts. After analyzing the characteristics of the Word2Vec algorithm in detail, we design and implement a parallel Word2Vec algorithm and use it to train corpus text on HPC. Furthermore, corpus texts of different sizes are collected and trained with both the serial and the parallel Word2Vec algorithms, and the speed-up ratio is calculated from the two sets of results. The experimental results show that the parallel Word2Vec algorithm running on HPC achieves a high speed-up ratio over the serial algorithm.
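The two quantities the abstract relies on, cosine similarity between word vectors and the speed-up ratio of a parallel run over a serial run, can be illustrated with a minimal Python sketch. The function names, the example vectors, and the timings below are assumptions made for illustration only. They are not taken from the paper and do not reproduce its implementation or its measurements.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def speedup_ratio(serial_seconds, parallel_seconds):
    """Return serial run time divided by parallel run time."""
    if serial_seconds < 0 or parallel_seconds <= 0:
        raise ValueError("run times must be positive")
    return serial_seconds / parallel_seconds


# Hypothetical word vectors (illustrative values, not trained embeddings).
vec_a = [1.0, 2.0, 3.0]
vec_b = [2.0, 4.0, 6.0]   # same direction as vec_a
vec_c = [3.0, 0.0, -1.0]  # orthogonal to vec_a

sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)

# Hypothetical wall-clock timings in seconds (not results from the paper).
ratio = speedup_ratio(120.0, 30.0)
```

With these made-up inputs, vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and a run that drops from 120 s to 30 s has a speed-up ratio of 4.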
Pages: 585-590
Number of pages: 6