Semi-Supervised Hashing for Large-Scale Search

Cited by: 615
Authors
Wang, Jun [1]
Kumar, Sanjiv [2]
Chang, Shih-Fu [3]
Affiliations
[1] IBM TJ Watson Res Ctr, Business Analyt & Math Sci Dept, Yorktown Hts, NY 10598 USA
[2] Google Res, New York, NY 10011 USA
[3] Columbia Univ, Dept Elect & Comp Engn, New York, NY 10027 USA
Funding
National Science Foundation (USA);
Keywords
Hashing; nearest neighbor search; binary codes; semi-supervised hashing; pairwise labels; sequential hashing; SCENE;
DOI
10.1109/TPAMI.2012.48
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Hashing-based approximate nearest neighbor (ANN) search in huge databases has become popular due to its computational and memory efficiency. The popular hashing methods, e.g., Locality Sensitive Hashing and Spectral Hashing, construct hash functions based on random or principal projections. The resulting hashes are either not very accurate or are inefficient. Moreover, these methods are designed for a given metric similarity. On the contrary, semantic similarity is usually given in terms of pairwise labels of samples. There exist supervised hashing methods that can handle such semantic similarity, but they are prone to overfitting when labeled data are small or noisy. In this work, we propose a semi-supervised hashing (SSH) framework that minimizes empirical error over the labeled set and an information theoretic regularizer over both labeled and unlabeled sets. Based on this framework, we present three different semi-supervised hashing methods, including orthogonal hashing, nonorthogonal hashing, and sequential hashing. Particularly, the sequential hashing method generates robust codes in which each hash function is designed to correct the errors made by the previous ones. We further show that the sequential learning paradigm can be extended to unsupervised domains where no labeled pairs are available. Extensive experiments on four large datasets (up to 80 million samples) demonstrate the superior performance of the proposed SSH methods over state-of-the-art supervised and unsupervised hashing techniques.
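The abstract does not give the formulas, so the following is only a minimal sketch of how an orthogonal semi-supervised hashing step could look, under stated assumptions: the empirical-error term is taken as agreement between linear projections and the pairwise labels, the regularizer is approximated by projection variance over all data, and the two are combined in one symmetric matrix whose leading eigenvectors serve as projection directions. The function names, the `eta` weight, and its default value are hypothetical and not taken from the record.

```python
import numpy as np


def fit_projections(X, labeled_idx, S, n_bits, eta=1.0):
    """Learn n_bits orthogonal projection directions (illustrative sketch).

    X           : (n_samples, n_features) array, labeled and unlabeled rows.
    labeled_idx : indices of the rows of X that carry pairwise labels.
    S           : (n_labeled, n_labeled) symmetric matrix, +1 for similar
                  pairs, -1 for dissimilar pairs, 0 where no label exists.
    eta         : assumed weight of the variance term (hypothetical default).
    """
    mean = X.mean(axis=0)
    Xc = X - mean                      # center so bits split the data near 50/50
    Xl = Xc[labeled_idx]
    supervised = Xl.T @ S @ Xl         # rewards agreement with pairwise labels
    variance = Xc.T @ Xc               # rewards high-variance projections
    M = supervised + eta * variance
    M = (M + M.T) / 2                  # guard against numerical asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)
    top = np.argsort(eigvals)[::-1][:n_bits]
    return mean, eigvecs[:, top]       # columns are orthonormal directions


def encode(X, mean, W):
    """Binary codes: one bit per projection, set where the projection is positive."""
    return ((X - mean) @ W > 0).astype(np.uint8)


def hamming_distance(a, b):
    """Number of differing bits between two codes."""
    return int(np.count_nonzero(a != b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 16))     # synthetic data, for illustration only
    labeled_idx = np.arange(10)
    S = np.zeros((10, 10))
    S[0, 1] = S[1, 0] = 1              # samples 0 and 1 labeled similar
    S[2, 3] = S[3, 2] = -1             # samples 2 and 3 labeled dissimilar
    mean, W = fit_projections(X, labeled_idx, S, n_bits=8)
    codes = encode(X, mean, W)
    print(codes.shape, hamming_distance(codes[0], codes[1]))
```

Search would then rank database items by Hamming distance to the query code. The nonorthogonal and sequential variants named in the abstract are not covered by this sketch.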
Pages: 2393-2406
Page count: 14
Related papers
50 in total
  • [1] Semi-supervised Hashing with Semantic Confidence for Large Scale Visual Search
    Pan, Yingwei
    Yao, Ting
    Li, Houqiang
    Ngo, Chong-Wah
    Mei, Tao
    SIGIR 2015: PROCEEDINGS OF THE 38TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, 2015, : 53 - 62
  • [2] Large-Scale Remote Sensing Image Retrieval Based on Semi-Supervised Adversarial Hashing
    Tang, Xu
    Liu, Chao
    Ma, Jingjing
    Zhang, Xiangrong
    Liu, Fang
    Jiao, Licheng
    REMOTE SENSING, 2019, 11 (17)
  • [3] Scalable Supervised Discrete Hashing for Large-Scale Search
    Luo, Xin
    Wu, Ye
    Xu, Xin-Shun
    WEB CONFERENCE 2018: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW2018), 2018, : 1603 - 1612
  • [4] SEMI-SUPERVISED GRAPH CONVOLUTIONAL HASHING NETWORK FOR LARGE-SCALE CROSS-MODAL RETRIEVAL
    Shen, Zhanjian
    Zhai, Deming
    Liu, Xianming
    Jiang, Junjun
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 2366 - 2370
  • [5] queryCategorizr: A Large-Scale Semi-Supervised System for Categorization of Web Search Queries
    Grbovic, Mihajlo
    Djuric, Nemanja
    Radosavljevic, Vladan
    Bhamidipati, Narayan
    Hawker, Jordan
    Johnson, Caleb
    WWW'15 COMPANION: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2015, : 199 - 202
  • [6] SSDH: Semi-Supervised Deep Hashing for Large Scale Image Retrieval
    Zhang, Jian
    Peng, Yuxin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (01) : 212 - 225
  • [7] Semi-supervised spectral hashing for fast similarity search
    Yao, Chengwei
    Bu, Jiajun
    Wu, Chenxia
    Chen, Gencai
    NEUROCOMPUTING, 2013, 101 : 52 - 58
  • [8] Semi-Supervised Metric Learning-Based Anchor Graph Hashing for Large-Scale Image Retrieval
    Hu, Haifeng
    Wang, Kun
    Lv, Chenggang
    Wu, Jiansheng
    Yang, Zhen
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (02) : 739 - 754
  • [9] Nonnegative Spectral Clustering for Large-Scale Semi-supervised Learning
    Hu, Weibo
    Chen, Chuan
    Ye, Fanghua
    Zheng, Zibin
    Ling, Guohui
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, 2019, 11448 : 287 - 291
  • [10] Transductive Centroid Projection for Semi-supervised Large-Scale Recognition
    Liu, Yu
    Song, Guanglu
    Shao, Jing
    Jin, Xiao
    Wang, Xiaogang
    COMPUTER VISION - ECCV 2018, PT V, 2018, 11209 : 72 - 89