Domain-independent automatic keyphrase indexing with small training sets

Cited by: 55
Authors
Medelyan, Ena [1 ]
Witten, Ian H. [1 ]
Affiliation
[1] Univ Waikato, Dept Comp Sci, Hamilton 3240, New Zealand
DOI
10.1002/asi.20790
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Keyphrases are widely used in both physical and digital libraries as a brief, but precise, summary of documents. They help organize material based on content, provide thematic access, represent search results, and assist with navigation. Manual assignment is expensive because trained human indexers must reach an understanding of the document and select appropriate descriptors according to defined cataloging rules. We propose a new method that enhances automatic keyphrase extraction by using semantic information about terms and phrases gleaned from a domain-specific thesaurus. The key advantage of the new approach is that it performs well with very little training data. We evaluate it on a large set of manually indexed documents in the domain of agriculture, compare its consistency with a group of six professional indexers, and explore its performance on smaller collections of documents in other domains and of French and Spanish documents.
Pages: 1026-1040
Page count: 15