Large-scale structural learning and predicting via hashing approximation

Cited by: 0
Authors
Dandan Chen
Yingjie Tian
Affiliations
[1] University of Chinese Academy of Sciences, School of Mathematical Sciences
[2] Chinese Academy of Sciences, Research Center on Fictitious Economy and Data Science, Key Laboratory of Big Data Mining and Knowledge Management
Keywords
Nonparallel support vector machine; Structural information; Locality-sensitive hashing; Minwise hashing
DOI
Not available
Abstract
By combining structural information with the nonparallel support vector machine, the structural nonparallel support vector machine (SNPSVM) can fully exploit prior knowledge to directly improve the algorithm’s generalization capacity. However, the scalability issue of how to train SNPSVM efficiently on very high-dimensional data has not been studied. In this paper, we integrate linear SNPSVM with the b-bit minwise hashing scheme to speed up the training phase for large-scale, high-dimensional statistical learning, and then address the problem of speeding up its prediction phase via locality-sensitive hashing. For one-against-one multi-class classification problems, a two-stage strategy is put forward: in the first stage, a series of hash-based classifiers is built to approximate the exact results and filter the hypothesis space; in the second stage, the classification is refined by solving a multi-class SNPSVM on the remaining classes. The proposed method can handle large-scale classification problems with a huge number of features. Experimental results on two large-scale datasets (news20 and webspam) demonstrate the efficiency of structural learning via b-bit minwise hashing. Experimental results on the ImageNet-BOF dataset and several large-scale UCI datasets show that the proposed hash-based prediction can be more than two orders of magnitude faster than the exact classifier with minor losses in quality.
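The abstract names b-bit minwise hashing as the way a high-dimensional binary input is compressed before a linear model is trained. The following is a minimal sketch of that general technique, not the paper's implementation: the function names (`make_hash_params`, `bbit_minhash`, `expand`), the use of universal hashing in place of true random permutations, and the constant `PRIME` are all assumptions made for illustration.

```python
import random

# Assumption: a Mersenne prime larger than any feature index we expect to see.
PRIME = 2_147_483_647


def make_hash_params(k, seed=0):
    """Draw k (a, c) pairs for hash functions h(x) = (a * x + c) mod PRIME.

    These stand in for k random permutations of the feature space.
    """
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def bbit_minhash(indices, params, b):
    """Return k codes, each the lowest b bits of one minwise hash value.

    `indices` is the non-empty set of nonzero feature indices of a binary vector.
    """
    mask = (1 << b) - 1
    codes = []
    for a, c in params:
        min_value = min((a * i + c) % PRIME for i in indices)
        codes.append(min_value & mask)
    return codes


def expand(codes, b):
    """Map the k codes to the positions of the ones in a (k * 2**b)-dim binary vector.

    Block j holds a one-hot encoding of code j, so a linear classifier can be
    trained directly on the expanded representation.
    """
    width = 1 << b
    return [j * width + code for j, code in enumerate(codes)]


if __name__ == "__main__":
    params = make_hash_params(k=8, seed=1)
    doc = {3, 17, 42, 1000003}  # hypothetical nonzero feature indices
    codes = bbit_minhash(doc, params, b=2)
    print(codes)
    print(expand(codes, b=2))
```

With k hash functions and b bits per hash, each example is stored in k * b bits and expands to a vector with exactly k nonzero entries, regardless of the original dimensionality. That fixed, small footprint is what makes training on data with a huge number of features tractable in this family of methods.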
Pages: 2889-2903
Page count: 14