Towards unsupervised keyphrase extraction via an autoregressive approach

Cited by: 1
Authors
Li, Tuohang [1]
Hu, Liang [1]
Li, Hongtu [1]
Sun, Chengyu [1]
Li, Shuai [1]
Chi, Ling [1]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, 2699 Qianjin St, Changchun 130012, Jilin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Keyphrase extraction; Autoregressive structure; Optimizer; Unsupervised model; Coverage decay optimizer;
DOI
10.1016/j.knosys.2023.110664
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Keyphrase extraction captures the core information of documents and is an upstream task for advanced information retrieval systems, particularly in the academic realm. Current unsupervised methods are primarily built on a score-and-rank framework and are consistently unable to capture mutual information between extracted keyphrases; this is especially true of graph-based models. Utilizing the autoregressive structure that is typically used in sequence-to-sequence text generation models, we propose a plug-and-play optimizer named C-Decay that can be integrated into any graph-based unsupervised keyphrase extraction model for a stable performance boost. C-Decay mitigates the bias toward certain semantically or lexically dominant tokens by directly optimizing the original score distribution output by graph-based models. The architecture of C-Decay includes the keyphrase pool, the gain vector, and the decay factor: the keyphrase pool realizes the autoregressive structure, while the gain vector and the decay factor serve as the optimization operators. Herein, we examine three graph-based models integrated with C-Decay, with experiments conducted on four datasets: KDD, SemEval, Nguyen, and Krapivin. Moreover, we show that C-Decay can improve accuracy and F-measure by an average of approximately 50% and 20%, respectively. © 2023 Elsevier B.V. All rights reserved.
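The abstract does not give the C-Decay formulas, so the following is only a minimal sketch of the general idea of autoregressive re-ranking on top of a score-and-rank model. It assumes a hypothetical function `autoregressive_rerank`, a single scalar `decay` parameter, and a simple token-overlap penalty; none of these names or rules come from the paper, and they should not be read as the actual gain vector or decay factor.

```python
from typing import Dict, List, Set


def autoregressive_rerank(
    base_scores: Dict[str, float],
    top_k: int = 3,
    decay: float = 0.5,
) -> List[str]:
    """Greedy selection that conditions each pick on the phrases already chosen.

    Hypothetical illustration: `decay` and the token-overlap rule are
    assumptions, not the C-Decay gain vector or decay factor.
    """
    pool: List[str] = []        # keyphrases selected so far
    covered: Set[str] = set()   # tokens already present in the pool
    remaining = dict(base_scores)

    while remaining and len(pool) < top_k:
        def adjusted(phrase: str) -> float:
            # Shrink the base score once per token the pool already covers.
            overlap = sum(1 for tok in phrase.lower().split() if tok in covered)
            return remaining[phrase] * (decay ** overlap)

        best = max(remaining, key=adjusted)
        pool.append(best)
        covered.update(best.lower().split())
        del remaining[best]
    return pool


# Made-up scores standing in for the output of a graph-based model.
scores = {
    "keyphrase extraction": 0.9,
    "keyphrase": 0.8,
    "graph model": 0.6,
    "extraction": 0.5,
}
print(autoregressive_rerank(scores, top_k=3, decay=0.5))
print(autoregressive_rerank(scores, top_k=3, decay=1.0))
```

With `decay=1.0` the loop reduces to plain score-and-rank, which selects two phrases sharing the token "keyphrase"; with `decay=0.5` the already covered token is penalized and "graph model" moves ahead of "keyphrase".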
Pages: 10
Related papers
50 in total
  • [41] TOP-Rank: A Novel Unsupervised Approach for Topic Prediction Using Keyphrase Extraction for Urdu Documents
    Amin, Ahmad
    Rana, Toqir A.
    Mian, Natash Ali
    Iqbal, Muhammad Waseem
    Khalid, Abbas
    Alyas, Tahir
    Tubishat, Mohammad
    IEEE ACCESS, 2020, 8 (08) : 212675 - 212686
  • [42] KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
    Muhammad Aman
    Said Jadid Abdulkadir
    Izzatdin Abdul Aziz
    Hitham Alhussian
    Israr Ullah
    Multimedia Tools and Applications, 2021, 80 : 12469 - 12506
  • [43] KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
    Aman, Muhammad
    Abdulkadir, Said Jadid
    Aziz, Izzatdin Abdul
    Alhussian, Hitham
    Ullah, Israr
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (08) : 12469 - 12506
  • [44] Keyphrases Concentrated Area Identification from Academic Articles as Feature of Keyphrase Extraction: A New Unsupervised Approach
    Miah, Mohammad Badrul Alam
    Awang, Suryanti
    Azad, Md Saiful
    Rahman, Md Mustafizur
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (01) : 788 - 796
  • [45] Unsupervised Topic-Oriented Keyphrase Extraction and Its Application to Croatian
    Saratlija, Josip
    Snajder, Jan
    Basic, Bojana Dalbelo
    TEXT, SPEECH AND DIALOGUE, TSD 2011, 2011, 6836 : 340 - 347
  • [46] A Two-Level Keyphrase Extraction Approach
    Ali, Chedi Bechikh
    Wang, Rui
    Haddad, Hatem
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 390 - 401
  • [47] A LDA-based approach to keyphrase extraction
    Department of Automation, University of Science and Technology of China, Hefei
    230026, China
    Unknown
    230031, China
    Zhongnan Daxue Xuebao (Ziran Kexue Ban), 6 (2142-2148):
  • [48] A SUPERVISED LEARNING APPROACH FOR AUTOMATIC KEYPHRASE EXTRACTION
    Abulaish, Muhammad
    Anwar, Tarique
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2012, 8 (11) : 7579 - 7601
  • [49] PromptORE - A Novel Approach Towards Fully Unsupervised Relation Extraction
    Genest, Pierre-Yves
    Portier, Pierre-Edouard
    Egyed-Zsigmond, Elod
    Goix, Laurent-Walter
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022, 2022 : 561 - 571
  • [50] Unsupervised Deep Keyphrase Generation
    Shen, Xianjie
    Wang, Yinghan
    Meng, Rui
    Shang, Jingbo
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022 : 11303 - 11311