Dimensionality Reduction for Sentiment Analysis using Pre-processing Techniques

Cited by: 0
Authors
Mhatre, Mayuri [1]
Phondekar, Dakshata [1]
Kadam, Pranali [1]
Chawathe, Anushka [1]
Ghag, Kranti [1]
Affiliations
[1] SAKEC, Informat Technol Dept, Bombay, Maharashtra, India
Keywords
Sentiment Analysis; Pre-processing; Slangs Handling; Stopwords Removal; Lemmatization;
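DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract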
Sentiment analysis is the study of people's opinions, sentiments, attitudes and emotions as expressed in written language, but the process is time-consuming, inconsistent and costly in a business context. Pre-processing the data helps to ease this difficulty. Pre-processing is the process of cleaning and preparing text for analysis. The existing pre-processing techniques are Handling Expressive Lengthening, Emoticons Handling, HTML Tags Removal, Punctuation Handling, Slangs Handling, Stopwords Removal, Stemming and Lemmatization. In this paper, the effect of various pre-processing techniques and their combinations was analyzed on the Kaggle dataset Bag of Words Meets Bags of Popcorn. Every possible combination of pre-processing techniques was evaluated to find the one giving the highest accuracy. A Random Forest classifier was used to predict sentiment, as it is known to give good accuracy, and the results were evaluated using 10-fold cross-validation. Accuracy increased from unprocessed data to pre-processed data. It was concluded that using pre-processing techniques gives higher accuracy than the traditional approach, i.e. no pre-processing.
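
The idea of trying every combination of pre-processing techniques can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the tiny stopword list, the regular expressions, and the choice of only four of the eight techniques are all assumptions made here for brevity. Instead of training a Random Forest with 10-fold cross-validation as the paper does, the sketch only reports vocabulary size, used here as a rough stand-in for feature dimensionality.

```python
import re
from itertools import combinations

# Tiny illustrative stopword list (an assumption; real lists are far longer).
STOPWORDS = {"a", "an", "the", "is", "was", "this", "and", "of", "to"}

def remove_html_tags(text):
    """Replace anything that looks like an HTML tag with a space."""
    return re.sub(r"<[^>]+>", " ", text)

def handle_lengthening(text):
    """Collapse runs of 3+ identical characters to 2 ('soooo' -> 'soo')."""
    return re.sub(r"(.)\1{2,}", r"\1\1", text)

def handle_punctuation(text):
    """Replace characters that are neither word characters nor whitespace."""
    return re.sub(r"[^\w\s]", " ", text)

def remove_stopwords(text):
    """Drop tokens found in STOPWORDS (case-insensitive)."""
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)

# Insertion order matters: tags must be removed before punctuation is stripped.
TECHNIQUES = {
    "html": remove_html_tags,
    "lengthening": handle_lengthening,
    "punctuation": handle_punctuation,
    "stopwords": remove_stopwords,
}

def all_combinations(names):
    """Yield every subset of names, from () (no pre-processing) to all of them."""
    for r in range(len(names) + 1):
        yield from combinations(names, r)

def apply_pipeline(text, combo):
    """Apply the selected techniques in order, then normalise whitespace."""
    for name in combo:
        text = TECHNIQUES[name](text)
    return " ".join(text.split())

def vocabulary_size(docs):
    """Number of distinct whitespace-separated tokens across all documents."""
    return len({w for d in docs for w in d.split()})

# Hypothetical two-review corpus, for demonstration only.
docs = ["<br />This movie was soooo good!!!", "The plot is good, the acting is bad."]
for combo in all_combinations(list(TECHNIQUES)):
    processed = [apply_pipeline(d, combo) for d in docs]
    print(combo, vocabulary_size(processed))
```

To approximate the evaluation described in the abstract, the `vocabulary_size` call would be replaced by a function that vectorises the processed reviews and returns the cross-validated accuracy of a classifier; the combination with the highest score would then be selected.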
Pages: 16-21
Page count: 6