Research of news text classification method based on hierarchical semantics and prior correction

Authors
Sun, Ping [1]
Song, LinLin [2]
Yuan, Ling [2]
Yu, Haiping [1]
Wei, Yinzhen [1]
Affiliations
[1] Wuhan Vocational College of Software and Engineering, Hubei, Wuhan, China
[2] School of Computer Science and Technology, Huazhong University of Science and Technology, Hubei, Wuhan, China
Funding
National Natural Science Foundation of China
Keywords
Classification (of information) - Deep learning - Learning algorithms - Learning systems - Natural language processing systems - Text processing
DOI
10.3233/JIFS-238433
Abstract
News text analysis is an important branch of natural language processing. Compared with ordinary text, news text has significant economic and scientific value. It is characterized by a hierarchical structure, diverse label categories, and a limited supply of high-quality annotated samples. Many machine learning and deep learning methods exist for analyzing various forms of news text, but label imbalance, hierarchical semantics, and easily confused labels limit their performance. This paper therefore proposes a news text classification framework based on hierarchical semantics and prior correction (HSPC). First, data augmentation increases the diversity of the training set, and adversarial learning improves the robustness of the model. Then, a hierarchical feature extraction approach extracts semantic features from different levels of the news text. Next, a feature fusion method lets the model focus on the hierarchical semantics relevant to label classification. Finally, highly confusing label predictions are corrected to optimize the model's label predictions and improve their confidence. Experiments on four widely used public datasets indicate that HSPC achieves higher classification accuracy than other models. On the FCT, AGNews, THUCNews, and Ohsumed datasets, HSPC improves accuracy by 1.03%, 1.38%, 2.55%, and 1.15%, respectively, over state-of-the-art methods, which supports the rationality and effectiveness of the designed mechanisms. © 2024 - The authors. Published by IOS Press.
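The abstract does not specify how the fusion and correction steps are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that fusion is a softmax-weighted sum of per-level feature vectors and that a prediction counts as "highly confusing" when its top two probabilities fall within a margin, in which case the probabilities are reweighted by a class prior. The names `fuse_levels`, `correct_with_prior`, `level_scores`, `prior`, and `margin` are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(level_features, level_scores):
    """Fuse per-level feature vectors (e.g. word, sentence, document level)
    into one vector using softmax weights.

    Assumption: the relevance scores would be learned by a model; here they
    are simply passed in.
    """
    weights = softmax(level_scores)
    dim = len(level_features[0])
    return [
        sum(w * feat[i] for w, feat in zip(weights, level_features))
        for i in range(dim)
    ]


def correct_with_prior(probs, prior, margin=0.1):
    """Reweight a confusing prediction by a class prior.

    Assumption: a prediction is "confusing" when the gap between its two
    largest probabilities is below `margin`. Confident predictions are
    returned unchanged.
    """
    top = sorted(probs, reverse=True)
    if top[0] - top[1] >= margin:
        return list(probs)
    weighted = [p * q for p, q in zip(probs, prior)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Two feature levels with equal relevance scores are averaged.
fused = fuse_levels([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])

# Classes 0 and 1 are nearly tied, so the (hypothetical) prior decides.
prior = [0.2, 0.6, 0.2]
corrected = correct_with_prior([0.48, 0.46, 0.06], prior)
```

In this sketch the prior only influences near-ties, so confident predictions keep their original probabilities. A multiplicative prior is one simple choice among several, and the paper may define its correction differently.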
Pages: 8185-8203