Large Scale Sentiment Analysis with Locality Sensitive BitHash

Cited by: 2
Authors
Zhang, Wenhao [1]
Ji, Jianqiu [1]
Zhu, Jun [1]
Xu, Hua [1]
Zhang, Bo [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl TNLIST Lab, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
Keywords
Sentiment analysis; Locality Sensitive Hashing; Large scale;
DOI
10.1007/978-3-319-28940-3_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As social media data grows rapidly, sentiment analysis plays an increasingly important role in classifying users' opinions, attitudes and feelings expressed in text. However, most studies have focused on the effectiveness of sentiment analysis while ignoring storage efficiency when processing large-scale, high-dimensional text data. In this paper, we combine machine-learning-based sentiment analysis with our proposed Locality Sensitive One-Bit Min-Hash (BitHash) method. BitHash compresses each data sample into a compact binary hash code while preserving the pairwise similarity of the original data. The binary code can replace the original data as a compressed yet informative representation for subsequent processing; for example, it can be naturally integrated with a classifier such as an SVM. Using the compact hash code significantly reduces the storage space required. Experiments on a popular open benchmark dataset show that, as the hash code length increases, the classification accuracy of the proposed method approaches that of the state-of-the-art method while requiring significantly less storage space.
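The abstract does not spell out how BitHash is constructed, so the following is only a rough illustration of the general idea it describes: compressing a document into a short binary code that still reflects pairwise similarity. It uses generic one-bit minwise hashing (keeping the lowest bit of each min-hash value), which is an assumption and not necessarily the authors' method. All names here (`tokens_to_ids`, `make_hash_params`, `one_bit_minhash`, `estimate_jaccard`) are hypothetical.

```python
import random
import zlib

# Assumption: a large Mersenne prime as the modulus for the random
# linear hash functions h(t) = (a * t + b) mod p.
MERSENNE_PRIME = (1 << 61) - 1


def tokens_to_ids(tokens):
    """Map tokens to deterministic integer ids (the set-of-words view of a document)."""
    return {zlib.crc32(tok.encode("utf-8")) for tok in tokens}


def make_hash_params(num_bits, seed=0):
    """Draw one (a, b) pair per output bit."""
    rng = random.Random(seed)
    return [
        (rng.randrange(1, MERSENNE_PRIME), rng.randrange(0, MERSENNE_PRIME))
        for _ in range(num_bits)
    ]


def one_bit_minhash(token_ids, params):
    """Return one bit per hash function: the lowest bit of the minimum hash value.

    Raises ValueError for an empty set, since the minimum is undefined.
    """
    bits = []
    for a, b in params:
        min_value = min((a * t + b) % MERSENNE_PRIME for t in token_ids)
        bits.append(min_value & 1)
    return bits


def estimate_jaccard(code_x, code_y):
    """Estimate Jaccard similarity from two equal-length one-bit codes.

    Assumes sparse data, where P(bits match) is roughly (1 + J) / 2,
    so J is roughly 2 * match_rate - 1. The estimate is noisy for short
    codes and can fall below zero.
    """
    matches = sum(1 for x, y in zip(code_x, code_y) if x == y)
    return 2.0 * matches / len(code_x) - 1.0


params = make_hash_params(num_bits=64, seed=1)
code_a = one_bit_minhash(tokens_to_ids("great movie loved it".split()), params)
code_b = one_bit_minhash(tokens_to_ids("great movie hated it".split()), params)
print(estimate_jaccard(code_a, code_b))
```

In a pipeline like the one the abstract outlines, each document's bit list could be stored in place of its high-dimensional feature vector and passed to a classifier as features. Longer codes cost more storage but give a less noisy similarity estimate, which matches the trade-off between code length and accuracy that the abstract reports.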
Pages: 29-40
Number of pages: 12