Efficient Dimensionality Reduction for Sparse Binary Data

Cited by: 0
Authors
Pratap, Rameshwar
Kulkarni, Raghav [1]
Sohony, Ishan [2]
Affiliations
[1] CMI, Chennai, Tamil Nadu, India
[2] SUNY Stony Brook, Stony Brook, NY 11794 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2018
Keywords
Dimensionality Reduction; Sketching; Binary Data; Similarity Search; Locality Sensitive Hashing
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a dimensionality reduction (sketching) algorithm for high-dimensional, sparse, binary data. The algorithm produces a single sketch that simultaneously preserves multiple similarity measures, including Hamming distance, inner product, and Jaccard similarity [12]. In contrast to the "local projection" strategy used by most earlier algorithms [6], [4], [7], our approach exploits sparsity and combines two strategies: (1) partitioning the dimensions into several buckets, and (2) obtaining "global linear summaries" within those buckets. Our algorithm is faster than the existing state of the art, and it preserves the binary format of the data after dimensionality reduction, which makes the sketch space-efficient. It can also be easily adapted to streaming and incremental learning frameworks. We give a rigorous theoretical analysis of the dimensionality reduction bounds and complement it with extensive experiments. The algorithm is simple and easy to implement in practice.
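The two strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not say what form the per-bucket "linear summary" takes, so the code below assumes it is the parity (sum modulo 2) of the bits that fall into each bucket, which keeps the output binary. The function names, the random bucket assignment, and the parameter values are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import random


def make_bucket_map(dim, num_buckets, seed=0):
    """Assign each of the `dim` input dimensions to one bucket at random."""
    rng = random.Random(seed)
    return [rng.randrange(num_buckets) for _ in range(dim)]


def sketch(nonzero_indices, bucket_map, num_buckets):
    """Compress a sparse binary vector given by the indices of its 1s.

    Assumption: each sketch bit is the parity (sum mod 2) of the input
    bits mapped to that bucket, so the sketch is again a binary vector.
    Only the nonzero entries are visited, so the cost scales with sparsity.
    """
    out = [0] * num_buckets
    for i in nonzero_indices:
        out[bucket_map[i]] ^= 1
    return out


def hamming(a, b):
    """Hamming distance between two equal-length binary lists."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    dim, num_buckets = 10_000, 512
    bucket_map = make_bucket_map(dim, num_buckets, seed=42)

    # Two sparse binary vectors, stored as sets of nonzero positions.
    u = {3, 17, 256, 4096, 9001}
    v = {3, 17, 300, 4096}

    su = sketch(u, bucket_map, num_buckets)
    sv = sketch(v, bucket_map, num_buckets)

    print("original Hamming distance:", len(u ^ v))
    print("sketch Hamming distance:  ", hamming(su, sv))
```

Because parity is linear over GF(2), the XOR of two sketches equals the sketch of the XOR of the inputs, so the sketch Hamming distance never exceeds the original one under this assumption. It falls short only when an odd-sized collision of differing positions is avoided, that is, when differing positions pair up inside a bucket, which is rare for sparse inputs and enough buckets.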
Pages: 152-157
Page count: 6