Research of news text classification method based on hierarchical semantics and prior correction

Authors
Sun, Ping [1]
Song, LinLin [2]
Yuan, Ling [2]
Yu, Haiping [1]
Wei, Yinzhen [1]
Affiliations
[1] Wuhan Vocational College of Software and Engineering, Wuhan, Hubei, China
[2] School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
Funding
National Natural Science Foundation of China
Keywords
Classification (of information) - Deep learning - Learning algorithms - Learning systems - Natural language processing systems - Text processing
DOI
10.3233/JIFS-238433
Abstract
News text processing is an important branch of natural language processing. Compared with ordinary text, news text has significant economic and scientific value. It is characterized by a hierarchical structure, diverse label categories, and a limited number of high-quality annotated samples. Many machine learning and deep learning methods exist for analyzing various forms of news text. However, current methods are limited by label imbalance, hierarchical semantics, and easily confused labels. This paper therefore proposes a news text classification framework based on hierarchical semantics and prior correction (HSPC). First, data augmentation is used to increase the diversity of the training set, and adversarial learning is employed to improve the robustness of the model. Then, a hierarchical feature extraction approach extracts semantic features from different levels of the news text. Next, a feature fusion method is designed so that the model can focus on the hierarchical semantics relevant to label classification. Finally, predictions for highly confusable labels are corrected to optimize the model's label predictions and improve their confidence. Experiments are performed on four widely used public datasets. The results indicate that HSPC achieves higher classification accuracy than other models: on the FCT, AGNews, THUCNews, and Ohsumed datasets, HSPC improves accuracy by 1.03%, 1.38%, 2.55%, and 1.15%, respectively, over state-of-the-art methods. These results support the rationality and effectiveness of the designed mechanisms. © 2024 - The authors. Published by IOS Press.
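The abstract does not say how the final correction step is computed, so the following is only a minimal sketch of one common post-hoc form of prior correction, not HSPC's actual procedure. It assumes that a prediction counts as "confusing" when its top probability falls below a threshold, and that correction means dividing each class probability by that class's training-set frequency and renormalizing, which counteracts label imbalance. The function name `correct_with_prior`, the `threshold` parameter, and the example numbers are all hypothetical.

```python
def correct_with_prior(probs, prior, threshold=0.9):
    """Re-weight a low-confidence prediction by the inverse class prior.

    probs:     predicted class probabilities for one sample (sum to 1)
    prior:     class frequencies in the training set (sum to 1, all > 0)
    threshold: assumed confidence cut-off; predictions at or above it
               are returned unchanged
    """
    # Confident predictions are left alone.
    if max(probs) >= threshold:
        return list(probs)

    # Divide out the class prior so frequent classes lose their head start.
    adjusted = [p / q for p, q in zip(probs, prior)]

    # Renormalize so the result is again a probability distribution.
    total = sum(adjusted)
    return [a / total for a in adjusted]


# Hypothetical two-class case: class 0 makes up 80% of the training data.
prior = [0.8, 0.2]
uncertain = correct_with_prior([0.55, 0.45], prior)
confident = correct_with_prior([0.95, 0.05], prior)
```

In this toy case the uncertain prediction flips from class 0 to class 1 once the imbalance is divided out, while the confident prediction is returned as-is. Whether the paper applies the correction selectively in this way is an assumption here.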
Pages: 8185-8203