RE-STORE: A system for compressing, browsing, and searching large documents

Cited by: 9
Authors
Moffat, A [1]
Wan, R [1]
Affiliation
[1] Univ Melbourne, Dept Comp Sci & Software Engn, Parkville, Vic 3010, Australia
Keywords
DOI
10.1109/SPIRE.2001.989752
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We describe a software system for managing text files of up to several hundred megabytes that combines a number of useful facilities. First, the text is stored compressed using a variant of the RE-PAIR mechanism described by Larsson and Moffat, with space savings comparable to those obtained by other widely used general-purpose compression systems. Second, we provide, as a byproduct of the compression process, a phrase-based browsing tool that allows users to explore the contents of the source text in a natural and useful manner. And third, once a set of desired phrases has been determined through the use of the browsing tool, the compressed text can be searched to determine locations at which those phrases appear, without decompressing the whole of the stored text, and without use of an additional index. That is, we show how the RE-PAIR compression regime can be extended to allow phrase-based browsing and fast interactive searching.
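The pair-replacement idea behind RE-PAIR can be illustrated with a minimal sketch. This is not the RE-STORE implementation or the variant the abstract refers to; it is a naive, quadratic-time version of the general scheme (repeatedly replace the most frequent adjacent pair with a new symbol), and the function names `repair_compress`, `expand`, and `decompress` are hypothetical. It assumes the input tokens are plain strings, so that the tuple-valued rule symbols cannot collide with them.

```python
from collections import Counter


def repair_compress(tokens):
    """Naive pair replacement: returns (reduced sequence, phrase rules).

    Assumption: tokens are strings; new symbols are tuples ("R", n).
    """
    seq = list(tokens)
    rules = {}  # new symbol -> (left symbol, right symbol)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # no adjacent pair repeats, so stop
        sym = ("R", next_id)
        next_id += 1
        rules[sym] = pair
        # Replace occurrences of the pair left to right, without overlap.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(sym)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbol, rules):
    """Expand one symbol back into the original tokens it stands for."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


def decompress(seq, rules):
    """Rebuild the original token list from the reduced sequence."""
    out = []
    for symbol in seq:
        out.extend(expand(symbol, rules))
    return out


text = "to be or not to be that is the question to be".split()
seq, rules = repair_compress(text)
phrases = [" ".join(expand(sym, rules)) for sym in rules]
```

Each rule expands to a phrase that recurs in the source, which is the kind of by-product a phrase-based browsing tool could present; a phrase search could then look for the corresponding symbols in the reduced sequence rather than in the expanded text.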
Pages: 162-174
Number of pages: 13