Large Scale Sentiment Analysis with Locality Sensitive BitHash

Citations: 2
Authors
Zhang, Wenhao [1]
Ji, Jianqiu [1]
Zhu, Jun [1]
Xu, Hua [1]
Zhang, Bo [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl TNLIST Lab, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
Keywords
Sentiment analysis; Locality Sensitive Hashing; Large scale;
DOI
10.1007/978-3-319-28940-3_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
As social media data grows rapidly, sentiment analysis plays an increasingly important role in classifying users' opinions, attitudes and feelings expressed in text. However, most studies have focused on the effectiveness of sentiment analysis while ignoring storage efficiency when processing large-scale, high-dimensional text data. In this paper, we combine machine-learning-based sentiment analysis with our proposed Locality Sensitive One-Bit Min-Hash (BitHash) method. BitHash compresses each data sample into a compact binary hash code while preserving the pairwise similarity of the original data. The binary code can be used as a compressed and informative representation in place of the original data for subsequent processing; for example, it can be naturally integrated with a classifier such as an SVM. Using the compact hash code significantly reduces the storage space. Experiments on a popular open benchmark dataset show that, as the hash code length increases, the classification accuracy of our proposed method approaches that of the state-of-the-art method while requiring significantly less storage space.
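To make the idea of a one-bit min-hash code more concrete, the following is a minimal sketch of the general technique (keeping only the lowest bit of each min-hash value), not the authors' BitHash implementation, whose details are not given in this record. The function names (`text_to_ids`, `make_hash_params`, `one_bit_minhash`, `estimate_jaccard`), the CRC32 token hashing, and the linear hash family are all assumptions chosen for illustration.

```python
import random
import zlib

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus (assumption)


def text_to_ids(text):
    """Map whitespace tokens to integer feature ids (hypothetical helper)."""
    return {zlib.crc32(tok.encode("utf-8")) % PRIME for tok in text.lower().split()}


def make_hash_params(num_bits, seed=0):
    """Draw one (a, b) pair per output bit for h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(num_bits)]


def one_bit_minhash(feature_ids, params):
    """Return one bit per hash function: the lowest bit of the min-hash value.

    Assumes feature_ids is a non-empty collection of integers.
    """
    bits = []
    for a, b in params:
        min_val = min((a * f + b) % PRIME for f in feature_ids)
        bits.append(min_val & 1)
    return bits


def estimate_jaccard(code_x, code_y):
    """Estimate Jaccard similarity from two equal-length one-bit codes.

    Two sets share a min-hash with probability J; otherwise their lowest bits
    agree about half the time, so P(agree) is roughly J + (1 - J) / 2.
    Inverting gives J is roughly 2 * P(agree) - 1, clipped at 0.
    """
    agree = sum(x == y for x, y in zip(code_x, code_y)) / len(code_x)
    return max(0.0, 2.0 * agree - 1.0)


# Usage: compress two short texts to 256-bit codes and compare them.
params = make_hash_params(num_bits=256, seed=42)
code_a = one_bit_minhash(text_to_ids("the movie was great and fun"), params)
code_b = one_bit_minhash(text_to_ids("the movie was great and long"), params)
print(len(code_a), estimate_jaccard(code_a, code_b))
```

Each document is stored as `num_bits` bits regardless of vocabulary size, which is where the storage saving comes from. The resulting 0/1 vectors could then be passed to a linear classifier as features; how the paper itself feeds codes to the SVM is not specified in this record.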
Pages: 29-40
Page count: 12