Domain-independent automatic keyphrase indexing with small training sets

Cited by: 55
Authors
Medelyan, Ena [1]
Witten, Ian H. [1]
Affiliation
[1] Univ Waikato, Dept Comp Sci, Hamilton 3240, New Zealand
DOI
10.1002/asi.20790
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Keyphrases are widely used in both physical and digital libraries as a brief, but precise, summary of documents. They help organize material based on content, provide thematic access, represent search results, and assist with navigation. Manual assignment is expensive because trained human indexers must reach an understanding of the document and select appropriate descriptors according to defined cataloging rules. We propose a new method that enhances automatic keyphrase extraction by using semantic information about terms and phrases gleaned from a domain-specific thesaurus. The key advantage of the new approach is that it performs well with very little training data. We evaluate it on a large set of manually indexed documents in the domain of agriculture, compare its consistency with a group of six professional indexers, and explore its performance on smaller collections of documents in other domains and of French and Spanish documents.
Pages: 1026-1040
Page count: 15