Large-scale structural learning and predicting via hashing approximation

Authors
Dandan Chen
Yingjie Tian
Affiliations
[1] University of Chinese Academy of Sciences, School of Mathematical Sciences
[2] Chinese Academy of Sciences, Research Center on Fictitious Economy and Data Science, Key Laboratory of Big Data Mining and Knowledge Management
Keywords
Nonparallel support vector machine; Structural information; Locality-sensitive hashing; Minwise hashing
DOI
Not available
Abstract
By combining structural information with the nonparallel support vector machine, the structural nonparallel support vector machine (SNPSVM) can fully exploit prior knowledge to directly improve generalization capacity. However, the scalability issue of how to train SNPSVM efficiently on very high-dimensional data has not been studied. In this paper, we integrate linear SNPSVM with the b-bit minwise hashing scheme to speed up the training phase for large-scale, high-dimensional statistical learning, and we then speed up the prediction phase via locality-sensitive hashing. For one-against-one multi-class classification problems, we put forward a two-stage strategy: in the first stage, a series of hash-based classifiers is built to approximate the exact results and filter the hypothesis space; in the second stage, the classification is refined by solving a multi-class SNPSVM on the remaining classes. The proposed method can handle large-scale classification problems with a huge number of features. Experimental results on two large-scale datasets (news20 and webspam) demonstrate the efficiency of structural learning via b-bit minwise hashing. Experimental results on the ImageNet-BOF dataset and several large-scale UCI datasets show that the proposed hash-based prediction can be more than two orders of magnitude faster than the exact classifier with minor losses in quality.
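As a rough illustration of the b-bit minwise hashing step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes binary features represented as a non-empty set of nonzero indices, and it substitutes simple linear hash functions modulo a prime for true random permutations. The names `make_hash_params`, `bbit_minwise`, and `expand` are hypothetical helpers introduced only for this example.

```python
import random

# Assumption: a Mersenne prime (2^31 - 1) larger than any feature index.
PRIME = 2_147_483_647


def make_hash_params(k, seed=0):
    """Draw k (a, c) pairs defining hash functions h(i) = (a * i + c) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def bbit_minwise(indices, params, b):
    """Return k codes, each the lowest b bits of one min-hash of the index set.

    `indices` must be a non-empty collection of non-negative integers.
    """
    mask = (1 << b) - 1
    return [min((a * i + c) % PRIME for i in indices) & mask for a, c in params]


def expand(codes, b):
    """Map k b-bit codes to the positions of the ones in a k * 2^b binary vector.

    The expanded vector has exactly one nonzero per hash function, so it can be
    fed to a linear classifier in place of the original high-dimensional input.
    """
    width = 1 << b
    return [j * width + code for j, code in enumerate(codes)]


params = make_hash_params(k=8, seed=42)
doc = {3, 17, 1024, 500_000}           # nonzero feature indices of one sample
codes = bbit_minwise(doc, params, b=2)  # 8 codes, each in the range 0..3
features = expand(codes, b=2)           # 8 positions in a vector of length 32
```

With `k` hash functions and `b` bits each, every sample is reduced to a sparse binary vector of length `k * 2^b`, regardless of the original dimensionality. The values of `k` and `b` used here are arbitrary and chosen only to keep the example small.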
Pages: 2889-2903
Number of pages: 14