The more "similar" the happier: Augmenting text using similarity scoring with neural embeddings for happiness classification

Cited by: 0

Authors:

Kuan Shyang Yong

Jasy Suet Yan Liew

Affiliation:

[1] Universiti Sains Malaysia, School of Computer Sciences

Source:

Journal of Intelligent Information Systems | 2023 / Vol. 60

Keywords:

Happiness classification; Text augmentation; Sentiment analysis; Deep learning; Similarity scoring; Distant supervision;

DOI:

Not available

Abstract:

Measuring the happiness of populations of interest via Twitter offers social scientists an alternative way to gauge the level of happiness in and across different nations, but machine learning models are needed to scale happiness classification to millions of tweets. A well-performing happiness classifier requires a fair amount of training data with minimal noise. Our study introduces a similarity-based text augmentation method that efficiently expands the data for the emotion “happiness” in an existing emotion corpus (EmoTweet-28): the most similar positive examples are selected from happiness tweets collected using distant supervision (DS) and added to an augmented corpus as training data. Six neural embeddings, in addition to the baseline bag-of-words (BoW) representation, were explored to compute the cosine similarity score between 100,000 DS tweets and 1,024 gold standard happiness tweets in EmoTweet-28 (ET). Our results show that the augmented training set obtained from the USE embedding with a similarity threshold of 0.7, when used to train a BiLSTM, produced the best model for predicting whether a tweet contains expressions of happiness or not (F1 score = 0.599). However, most augmented training sets obtained from the InferSent-GloVe embedding produced BiLSTM classifiers with more consistent F1 scores above the base classifier in the fixed increment experiments. We show that our proposed text augmentation strategy can improve or maintain classification performance with small but cleaner increment sets, as opposed to adding DS tweets randomly as training data.
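The selection step described in the abstract can be illustrated with a minimal sketch. It assumes that every tweet has already been turned into a fixed-length vector by some embedding (BoW, USE, InferSent-GloVe, or any other). The function names `cosine_similarity_matrix` and `select_similar` are hypothetical, and scoring each DS tweet by its maximum similarity to any gold tweet is an assumption: the abstract does not state how the per-tweet score is aggregated.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Return the (len(a), len(b)) matrix of cosine similarities between rows."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_norm = np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = np.linalg.norm(b, axis=1, keepdims=True)
    # Guard against zero vectors (e.g. an empty tweet) to avoid division by zero.
    a_unit = a / np.where(a_norm == 0, 1.0, a_norm)
    b_unit = b / np.where(b_norm == 0, 1.0, b_norm)
    return a_unit @ b_unit.T


def select_similar(ds_vectors, gold_vectors, threshold=0.7):
    """Pick DS rows whose best match among the gold rows reaches the threshold.

    Assumption (not from the abstract): a DS tweet is scored by its
    maximum cosine similarity to any gold standard tweet.
    Returns (indices of selected DS rows, best score per DS row).
    """
    sims = cosine_similarity_matrix(ds_vectors, gold_vectors)
    best = sims.max(axis=1)
    return np.flatnonzero(best >= threshold), best
```

With 100,000 DS vectors and 1,024 gold vectors the full similarity matrix holds roughly 10^8 entries, so a real run would probably process the DS tweets in batches. The 0.7 default mirrors the threshold reported for the best USE-based model; other thresholds can be passed through `threshold`.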

Pages: 631-653

Number of pages: 22

Citation:

Yong, Kuan Shyang; Liew, Jasy Suet Yan. The more "similar" the happier: Augmenting text using similarity scoring with neural embeddings for happiness classification. Journal of Intelligent Information Systems, 2023, 60(3): 631-653.