ADDING COMPRESSION TO A FULL-TEXT RETRIEVAL-SYSTEM

Cited by: 53
Authors
ZOBEL, J [1 ]
MOFFAT, A [1 ]
Affiliation
[1] UNIV MELBOURNE,DEPT COMP SCI,PARKVILLE,VIC 3052,AUSTRALIA
Source
SOFTWARE-PRACTICE & EXPERIENCE | 1995 / Vol. 25 / No. 08
Keywords
FULL-TEXT RETRIEVAL; DATA COMPRESSION; TEXT COMPRESSION; HUFFMAN CODING; WORD-BASED MODEL
DOI
10.1002/spe.4380250804
Chinese Library Classification number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
We describe the implementation of a data compression scheme as an integral and transparent layer within a full-text retrieval system. Using a semi-static word-based compression model, the space needed to store the text is under 30 per cent of the original requirement. The model is used in conjunction with canonical Huffman coding and together these two paradigms provide fast decompression. Experiments with 500 Mb of newspaper articles show that in full-text retrieval environments compression not only saves space, it can also yield faster query processing - a win-win situation.
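To make the two ideas named in the abstract concrete, here is a minimal Python sketch of a semi-static word-based model combined with canonical Huffman coding. It is an illustration under stated assumptions, not the authors' implementation: words are obtained by plain whitespace splitting, codes are kept as strings of "0" and "1" rather than packed bits, the decoder uses a simple dictionary lookup instead of the table-driven decoding that makes canonical codes fast in practice, and all function names are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def code_lengths(freqs):
    """Return {word: Huffman code length} for a word-frequency table."""
    if len(freqs) == 1:
        return {word: 1 for word in freqs}
    tie = count()  # tie-breaker so the heap never has to compare the lists
    heap = [(f, next(tie), [word]) for word, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, words1 = heapq.heappop(heap)
        f2, _, words2 = heapq.heappop(heap)
        # Every word under the merged node moves one level deeper.
        for word in words1 + words2:
            lengths[word] += 1
        heapq.heappush(heap, (f1 + f2, next(tie), words1 + words2))
    return lengths


def canonical_codes(lengths):
    """Assign canonical codes: sort by (length, word) and count upward."""
    codes = {}
    code = 0
    prev_len = 0
    for word, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len  # pad with zeros when the length grows
        codes[word] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


def encode(words, codes):
    """Second pass of the semi-static scheme: emit one code per word."""
    return "".join(codes[word] for word in words)


def decode(bits, codes):
    """Simplified decoder: grow a bit string until it matches a code."""
    inverse = {c: w for w, c in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Usage: the first pass gathers statistics, the second pass encodes.
words = "the cat and the dog and the bird".split()
lengths = code_lengths(Counter(words))
codes = canonical_codes(lengths)
bits = encode(words, codes)
restored = decode(bits, codes)
```

Because a canonical code is fully determined by the code lengths and the sorted word order, a decoder only needs the lexicon and one length per word, not the Huffman tree itself.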
Pages: 891-903
Page count: 13