Two-Stage Hashing for Fast Document Retrieval

Authors
Li, Hao [1]
Liu, Wei [2]
Ji, Heng [1]
Affiliations
[1] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This work achieves sublinear-time Nearest Neighbor Search (NNS) in massive-scale document collections. The primary contribution is a two-stage unsupervised hashing framework that integrates two state-of-the-art hashing algorithms, Locality Sensitive Hashing (LSH) and Iterative Quantization (ITQ). LSH prunes the neighbor candidates, while ITQ provides an efficient and effective reranking of the neighbor pool captured by LSH. Furthermore, the proposed framework capitalizes on both term and topic similarity among documents, leading to precise document retrieval. Experimental results show that the hashing-based document retrieval approach closely approximates the conventional Information Retrieval (IR) method in retrieving semantically similar documents, while achieving a speedup of over one order of magnitude in query time.
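The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes sign random projection as the LSH family, exact-bucket lookup for candidate pruning, a plain PCA-plus-rotation form of ITQ, and a single feature matrix for both stages (the abstract mentions term and topic similarity, but does not say here how each feature type is assigned to a stage). All function names and parameters (`lsh_codes`, `build_buckets`, `train_itq`, `itq_codes`, `search`, `n_bits`, `n_iter`) are hypothetical.

```python
import numpy as np


def lsh_codes(X, planes):
    """Stage 1 codes: one bit per random hyperplane (sign random projection)."""
    return (X @ planes > 0).astype(np.uint8)


def build_buckets(codes):
    """Group document indices by their full LSH code (exact-bucket lookup)."""
    buckets = {}
    for i, code in enumerate(codes):
        buckets.setdefault(code.tobytes(), []).append(i)
    return buckets


def train_itq(X, n_bits, n_iter=50, seed=0):
    """Learn PCA projection W and orthogonal rotation R for ITQ codes."""
    rng = np.random.default_rng(seed)
    mean = X.mean(axis=0)
    Xc = X - mean
    # PCA via SVD of the centered data; keep the top n_bits directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_bits].T
    V = Xc @ W
    # Start from a random orthogonal rotation.
    R, _ = np.linalg.qr(rng.standard_normal((n_bits, n_bits)))
    for _ in range(n_iter):
        # Fix R, update the binary codes B.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B, update R by solving the orthogonal Procrustes problem.
        U, _, Wt = np.linalg.svd(B.T @ V)
        R = Wt.T @ U.T
    return mean, W, R


def itq_codes(X, model):
    """Stage 2 codes: sign of the rotated PCA projection."""
    mean, W, R = model
    return ((X - mean) @ W @ R >= 0).astype(np.uint8)


def search(query, planes, buckets, doc_codes, model, k=None):
    """Prune with the LSH bucket, then rerank candidates by ITQ Hamming distance."""
    key = lsh_codes(query[None, :], planes)[0].tobytes()
    candidates = buckets.get(key, [])
    if not candidates:
        return []
    q_code = itq_codes(query[None, :], model)[0]
    dists = (doc_codes[candidates] != q_code).sum(axis=1)
    order = np.argsort(dists, kind="stable")
    ranked = [(candidates[i], int(dists[i])) for i in order]
    return ranked[:k]


# Toy usage on random non-negative vectors standing in for document features.
rng = np.random.default_rng(0)
docs = rng.random((200, 32))
planes = rng.standard_normal((32, 4))       # 4 LSH bits, so at most 16 buckets
buckets = build_buckets(lsh_codes(docs, planes))
model = train_itq(docs, n_bits=16)
doc_codes = itq_codes(docs, model)
top = search(docs[7], planes, buckets, doc_codes, model, k=5)
```

In this sketch the query only touches the documents in its own LSH bucket, so the reranking cost depends on the bucket size rather than on the collection size. A practical system would likely use several hash tables or multi-probe lookup to reduce missed neighbors; that choice is outside what the abstract specifies.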
Pages: 495-500
Page count: 6