Improved self-indexing inverted files for full-text retrieval

Cited by: 0
Authors
College of Computer Science, South-Central University for Nationalities, Wuhan 430074, China [1]
Unknown [2]
Affiliations
Source
J. Comput. Inf. Syst. | 2009, Vol. 2, 1017-1024
Keywords
Indexing (of information) - Information retrieval;
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Self-indexing is a promising way to improve the time and space efficiency of retrieval by compressing index files. An improved self-indexing inverted file, called IFSI, is proposed for full-text information retrieval. IFSI comprises two index levels: a first-level index containing the subset of documents that are likely to be returned as top results, and a second-level index containing the remaining documents. Using an efficient coding scheme, IFSI can build a skip index over each compressed posting list with little or no storage overhead. IFSI also supports efficient incremental updates by allocating free space at the tail of each posting list using a statistics-based approach. Detailed simulation results and comparisons with other schemes show that IFSI can both greatly reduce decompression time and allow extremely fast query processing. © 2009 Binary Information Press March, 2009.
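The abstract does not give IFSI's coding scheme, so the following is only a generic sketch of the underlying idea of a skip index over a compressed posting list. It assumes variable-byte gap coding and a separate skip table (which, unlike the scheme described above, does cost extra storage). The names `SkippedPostingList`, `vbyte_encode`, `vbyte_decode`, and the `block_size` parameter are hypothetical and chosen for illustration.

```python
from bisect import bisect_right


def vbyte_encode(n):
    """Encode a non-negative integer with 7 data bits per byte."""
    out = bytearray()
    while n >= 128:
        out.append(n & 0x7F)   # low 7 bits; more bytes follow
        n >>= 7
    out.append(n | 0x80)       # high bit marks the final byte
    return bytes(out)


def vbyte_decode(buf, pos):
    """Decode one integer starting at buf[pos]; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        if b & 0x80:           # final byte of this integer
            return value | ((b & 0x7F) << shift), pos
        value |= b << shift
        shift += 7


class SkippedPostingList:
    """Gap-compressed posting list with one skip entry per block (illustrative)."""

    def __init__(self, doc_ids, block_size=4):
        self.skip_docs = []      # first document id of each block
        self.skip_offsets = []   # byte offset where each block's gaps start
        data = bytearray()
        prev = 0
        for i, doc in enumerate(doc_ids):   # doc_ids must be sorted ascending
            if i % block_size == 0:
                # The block's first id lives in the skip table, not in the data.
                self.skip_docs.append(doc)
                self.skip_offsets.append(len(data))
            else:
                data += vbyte_encode(doc - prev)
            prev = doc
        self.data = bytes(data)

    def _block(self, k):
        """Decode only block k, yielding its document ids."""
        doc = self.skip_docs[k]
        yield doc
        pos = self.skip_offsets[k]
        last = k + 1 == len(self.skip_offsets)
        end = len(self.data) if last else self.skip_offsets[k + 1]
        while pos < end:
            gap, pos = vbyte_decode(self.data, pos)
            doc += gap
            yield doc

    def contains(self, target):
        """Test membership by decompressing a single block."""
        k = bisect_right(self.skip_docs, target) - 1
        if k < 0:
            return False
        return any(doc == target for doc in self._block(k))


plist = SkippedPostingList([3, 8, 15, 200, 201, 950, 4000, 4096, 70000])
print(plist.contains(950))   # True
print(plist.contains(951))   # False
```

Because a lookup binary-searches the skip table and then decodes one block, the decompression work per lookup is bounded by the block size rather than by the length of the posting list, which is the kind of saving in decompression time the abstract refers to.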
Related papers
50 in total
  • [1] Self-indexing inverted files for fast text retrieval
    Moffat, A
    Zobel, J
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1996, 14 (04) : 349 - 379
  • [2] Automated indexing for full-text information retrieval
    Berrios, DC
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2000 : 71 - 75
  • [3] A novel full-text indexing model for Chinese text retrieval
    Zhou, SG
    Hu, YF
    Hu, JT
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, 2001, 2113 : 370 - 379
  • [4] ANALYSIS OF SELF-INDEXING, DISK FILES
    WATERS, SJ
    COMPUTER JOURNAL, 1975, 18 (03) : 200 - 205
  • [5] A COMPARISON OF INDEXING AND FULL-TEXT FOR THE RETRIEVAL OF CLINICAL MEDICAL LITERATURE
    SIEVERT, M
    MCKININ, EJ
    SLOUGH, M
    PROCEEDINGS OF THE ASIS ANNUAL MEETING, 1988, 25 : 143 - 146
  • [6] Using Syllables As Indexing Terms in Full-Text Information Retrieval
    Kettunen, Kimmo
    Mcnamee, Paul
    Baskaya, Feza
    HUMAN LANGUAGE TECHNOLOGIES - THE BALTIC PERSPECTIVE, 2010, 219 : 225 - 232
  • [7] An Improved Indexing Algorithm for Chinese Abbreviations and Proper Nouns Retrieval of Full-text Search Engine
    Xu, Tiansheng
    Zhang, Yiming
    Qin, Aiming
    2015 3RD INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL SCIENCE, HUMANITIES, AND MANAGEMENT, ASSHM 2015, 2015 : 1610 - 1614
  • [8] Improved compressed indexes for full-text document retrieval
    Belazzougui, Djamal
    Navarro, Gonzalo
    Valenzuela, Daniel
    JOURNAL OF DISCRETE ALGORITHMS, 2013, 18 : 3 - 13
  • [9] Inverted files versus signature files for text indexing
    Zobel, J
    Moffat, A
    Ramamohanarao, K
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 1998, 23 (04) : 453 - 490
  • [10] An efficient synchronous indexing technique for full-text retrieval in distributed databases
    Hassen, Fadoua
    Amel, Grissa Touzi
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS, 2017, 112 : 811 - 821