Large Scale Sentiment Analysis with Locality Sensitive BitHash

Cited by: 2
Authors
Zhang, Wenhao [1]
Ji, Jianqiu [1]
Zhu, Jun [1]
Xu, Hua [1]
Zhang, Bo [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl TNLIST Lab, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
Keywords
Sentiment analysis; Locality Sensitive Hashing; Large scale;
DOI
10.1007/978-3-319-28940-3_3
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
As social media data grows rapidly, sentiment analysis plays an increasingly important role in classifying users' opinions, attitudes, and feelings expressed in text. However, most studies have focused on the effectiveness of sentiment analysis while ignoring storage efficiency when processing large-scale, high-dimensional text data. In this paper, we combine machine-learning-based sentiment analysis with our proposed Locality Sensitive One-Bit Min-Hash (BitHash) method. BitHash compresses each data sample into a compact binary hash code while preserving the pairwise similarity of the original data. The binary code can replace the original data as a compressed yet informative representation for subsequent processing; for example, it can be naturally integrated with a classifier such as an SVM. Using the compact hash code significantly reduces storage space. Experiments on a popular open benchmark dataset show that, as the hash code length increases, the classification accuracy of our proposed method approaches that of the state-of-the-art method while requiring significantly less storage space.
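The abstract does not spell out how BitHash is constructed, so the following is only a minimal sketch of generic one-bit minwise hashing, not the authors' method. It assumes each document is a set of tokens, simulates independent hash functions by salting BLAKE2b with the bit position, and keeps the lowest bit of each minimum hash value. The function names `one_bit_minhash` and `estimate_jaccard` are hypothetical. Under the assumption that the lowest bits of two different minima behave like independent fair coins, two codes agree at a given position with probability of about (1 + J) / 2, where J is the Jaccard similarity of the two token sets, which is why the codes roughly preserve pairwise similarity.

```python
import hashlib


def one_bit_minhash(tokens, num_bits):
    """Compress a set of string tokens into a list of `num_bits` bits (0 or 1)."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    encoded = [t.encode("utf-8") for t in tokens]
    bits = []
    for i in range(num_bits):
        # A distinct salt per bit position stands in for an independent hash function.
        salt = i.to_bytes(8, "little")
        min_value = min(
            int.from_bytes(
                hashlib.blake2b(e, digest_size=8, salt=salt).digest(), "big"
            )
            for e in encoded
        )
        # Keep only the lowest bit of the minimum hash value.
        bits.append(min_value & 1)
    return bits


def estimate_jaccard(code_a, code_b):
    """Estimate Jaccard similarity from the fraction of agreeing bits."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    agreement = sum(x == y for x, y in zip(code_a, code_b)) / len(code_a)
    # Invert P(agree) ~ (1 + J) / 2 and clamp noise-induced negatives to 0.
    return max(0.0, 2.0 * agreement - 1.0)


# Illustrative inputs (made up for this sketch, not from the paper's dataset).
review_a = set("the plot was great and the acting was great".split())
review_b = set("the plot was dull but the acting was great".split())
code_a = one_bit_minhash(review_a, num_bits=256)
code_b = one_bit_minhash(review_b, num_bits=256)
print(estimate_jaccard(code_a, code_b))
```

A code of 1024 bits occupies 128 bytes once packed, regardless of the vocabulary size, and the 0/1 vector can be passed to a linear classifier as an ordinary feature vector. Longer codes lower the variance of the similarity estimate at the cost of more storage, which matches the accuracy versus code length trade-off the abstract describes.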
Pages: 29-40
Number of pages: 12
Related papers
50 items in total
  • [31] A method using locality-sensitive hashing for large-scale content-based image retrieval
    Wang Weihong
    Wang Song
    CCDC 2009: 21ST CHINESE CONTROL AND DECISION CONFERENCE, VOLS 1-6, PROCEEDINGS, 2009, : 1816 - 1820
  • [32] DISTRIBUTED LOCALITY AND LARGE-SCALE NEUROCOGNITIVE NETWORKS
    MESULAM, MM
    BEHAVIORAL AND BRAIN SCIENCES, 1994, 17 (01) : 74 - 76
  • [33] Locality Sensitive Discriminant Analysis for Speaker Verification
    Cai, Danwei
    Cai, Weicheng
    Ni, Zhidong
    Li, Ming
    2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016
  • [34] A large-scale sentiment analysis of tweets pertaining to the 2020 US presidential election
    Ali, Rao Hamza
    Pinto, Gabriela
    Lawrie, Evelyn
    Linstead, Erik J.
    JOURNAL OF BIG DATA, 2022, 9 (01)
  • [35] Large-Scale Joint Topic, Sentiment & User Preference Analysis for Online Reviews
    Yu, Xinli
    Chen, Zheng
    Yang, Wei-Shih
    Hu, Xiaohua
    Yan, Erjia
    Li, Guangrong
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 847 - 856
  • [36] Sentiment Analysis by Exploring Large Scale Web-based Chinese Short Text
    Liu, Ziyu
    Qi, Yonggang
    Ma, Zhanyu
    Yang, Jie
    INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND APPLICATION ENGINEERING (CSAE), 2017, 190 : 930 - 939
  • [37] Distantly Supervised Lifelong Learning for Large-Scale Social Media Sentiment Analysis
    Xia, Rui
    Jiang, Jie
    He, Huihui
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2017, 8 (04) : 480 - 491
  • [38] BANGLABOOK: A Large-scale Bangla Dataset for Sentiment Analysis from Book Reviews
    Kabir, Mohsinul
    Bin Mahfuz, Obayed
    Raiyan, Syed Rifat
    Mahmud, Hasan
    Hasan, Md Kamrul
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 1237 - 1247
  • [39] A novel approach to generate a large scale of supervised data for short text sentiment analysis
    Sun, Xiao
    He, Jiajin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (9-10) : 5439 - 5459
  • [40] A novel approach to generate a large scale of supervised data for short text sentiment analysis
    Xiao Sun
    Jiajin He
    Multimedia Tools and Applications, 2020, 79 : 5439 - 5459