Twitter alloy steel disambiguation and user relevance via one-class and two-class news titles classifiers

Cited by: 7
Authors
Zola, Paola [1 ]
Cortez, Paulo [2 ]
Brentari, Eugenio [3 ]
Affiliations
[1] IIT CNR, Via G Moruzzi 1, I-56124 Pisa, Italy
[2] Univ Minho, ALGORITMI Ctr, Dept Informat Syst, P-4804533 Guimaraes, Portugal
[3] Univ Brescia, Dept Econ & Management, Brescia, Italy
Source
NEURAL COMPUTING & APPLICATIONS | 2021 / Volume 33 / Issue 04
Keywords
Text classification; User relevance; Machine learning; Social media analytics; MICROBLOGGING DATA; SENTIMENT; IMPACT;
DOI
10.1007/s00521-020-04991-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses the nontrivial task of Twitter financial disambiguation (TFD), which is relevant to filter financial domain tweets (e.g., alloy steel or coffee prices) when no unique identifiers (e.g., cashtags) are adopted. To automate TFD, we propose a transfer learning approach that uses freely labeled news titles to train diverse one-class and two-class classification methods. These include different text handling transforms, adaptations of statistical measures and modern machine learning methods, including support vector machines (SVM), deep autoencoders and multilayer perceptrons. As a case study, we analyzed the domain of alloy steel prices, collecting a recent Twitter dataset. Overall, the best results were achieved by a two-class SVM fed with TFD statistical measures and topic model features, obtaining an 80% and 71% discrimination level when tested with 11,081 and 3000 manually labeled tweets. The best one-class performance (78% and 69% for the same test tweets) was obtained by a term frequency-inverse document frequency classifier (TF-IDFC). These models were further used to generate a Financial User Relevance rank (FUR) score, aiming to filter relevant users. The SVM and TF-IDFC FUR models obtained a predictive user discrimination level of 80% and 75% when tested with a manually labeled test sample of 418 users. These results confirm the proposed joint TFD-FUR approach as a valuable tool for the selection of Twitter texts and users for financial social media analytics (e.g., sentiment analysis, detection of influential users).
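The transfer setup described in the abstract (train on news titles, apply to tweets, then aggregate per user) can be illustrated with a minimal sketch. This is not the paper's SVM or TF-IDFC method: a simple nearest-centroid classifier over TF-IDF vectors stands in for them, the user score is a plain share of relevant tweets rather than the paper's FUR definition, and all titles, tweets, and function names below are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(docs):
    """Smoothed inverse document frequency over all training titles."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(text, idf):
    """TF-IDF vector as a dict; tokens unseen in training are ignored."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def centroid(vectors):
    """Mean of a list of sparse vectors."""
    total = {}
    for vec in vectors:
        for tok, w in vec.items():
            total[tok] = total.get(tok, 0.0) + w
    return {tok: w / len(vectors) for tok, w in total.items()}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Toy training data (invented): news titles labeled by their source section.
RELEVANT_TITLES = [
    "Alloy steel prices rise as mills cut output",
    "Steel futures climb on strong demand from automakers",
    "Stainless steel producers raise alloy surcharge",
]
OTHER_TITLES = [
    "Steel band wins national music award",
    "Man of Steel sequel tops weekend box office",
    "Team shows nerves of steel in cup final",
]

IDF = fit_idf(RELEVANT_TITLES + OTHER_TITLES)
RELEVANT_CENTROID = centroid([tfidf(t, IDF) for t in RELEVANT_TITLES])
OTHER_CENTROID = centroid([tfidf(t, IDF) for t in OTHER_TITLES])


def is_relevant(tweet):
    """Two-class decision: is the tweet closer to the financial centroid?"""
    vec = tfidf(tweet, IDF)
    return cosine(vec, RELEVANT_CENTROID) > cosine(vec, OTHER_CENTROID)


def relevance_share(tweets):
    """Hypothetical user-level score: share of tweets classified relevant."""
    return sum(is_relevant(t) for t in tweets) / len(tweets)


if __name__ == "__main__":
    user_tweets = [
        "loved the steel band at the music festival",
        "steel prices climb again as mills cut output",
    ]
    for tweet in user_tweets:
        print(is_relevant(tweet), tweet)
    print("share of relevant tweets:", relevance_share(user_tweets))
```

The word "steel" appears in every training title, so it carries the lowest weight; the decision rests on the surrounding vocabulary, which is the ambiguity the abstract describes for tweets without cashtags.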
Pages: 1245 - 1260
Page count: 16