A Chinese Text Classification Model Based on Radicals and Character Distinctions

Cited by: 1
Authors
Yan-Xin, Huang [1]
Bo, Li [1]
Affiliations
[1] Chongqing Univ Technol, Coll Comp Sci & Engn, Chongqing 400054, Peoples R China
Keywords
Semantics; Bit error rate; Text categorization; Feature extraction; Deep learning; Transformers; Data mining; China; Radicals; traditional Chinese; Chinese text classification
DOI
10.1109/ACCESS.2023.3257339
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Chinese characters are generally correlated with their semantic meanings, and the structure of radicals in particular can clearly indicate how characters are related to each other. During the simplification of Chinese characters, several distinct traditional characters were in some cases merged into a single simplified character (a many-to-one mapping), so that one simplified character can correspond to many traditional characters. Compared with simplified characters, traditional characters contain richer structural information, which is also more meaningful for semantic understanding. Traditional approaches to text modelling often overlook the structural content of Chinese characters and the role of human cognitive behaviour in text comprehension. Hence, we propose a Chinese text classification model derived from the construction methods and evolution of Chinese characters. The model consists of two branches, one simplified and one traditional, each with an attention module based on radical classification. Specifically, we first develop a sequential modelling structure to obtain the sequence information of Chinese texts. Next, an associated word module that uses the radical (literally "part head") as a medium is designed to filter out keywords with high semantic differentiation among the auxiliary units. An attention module is then applied to balance the importance of each keyword in a particular context. The proposed method is evaluated on three datasets to demonstrate its validity and plausibility.
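The many-to-one mapping and the keyword-weighting step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the character pairs are a handful of well-known merges chosen for illustration, the radical table and the relevance scores are hypothetical inputs, and a plain softmax stands in for whatever attention formulation the paper actually uses.

```python
import math
from collections import defaultdict

# Illustrative traditional -> simplified pairs (well-known many-to-one merges).
TRAD_TO_SIMP = {
    "發": "发",  # to emit, to send out
    "髮": "发",  # hair
    "乾": "干",  # dry
    "幹": "干",  # to do; trunk
    "後": "后",  # behind, after
    "后": "后",  # empress
}

# Small hand-written radical lookup for the traditional forms above
# (an assumption for this sketch; a real system would use a full dictionary).
RADICAL = {"發": "癶", "髮": "髟", "乾": "乙", "幹": "干", "後": "彳", "后": "口"}


def simplified_to_traditional(pairs):
    """Invert the mapping: each simplified character lists its traditional sources."""
    inverse = defaultdict(list)
    for trad, simp in pairs.items():
        inverse[simp].append(trad)
    return dict(inverse)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(keywords, scores):
    """Turn hypothetical relevance scores into normalised keyword weights."""
    return dict(zip(keywords, softmax(scores)))


inverse = simplified_to_traditional(TRAD_TO_SIMP)
for simp, trads in inverse.items():
    # The traditional forms carry different radicals, hence different semantic cues.
    print(simp, [(t, RADICAL[t]) for t in trads])

# Hypothetical scores for the two traditional readings of 发 in a hair-related context.
print(attend(["髮", "發"], [2.0, 0.5]))
```

The inverted table makes the information loss visible: a single simplified character fans out to traditional forms with different radicals, which is the structural signal the traditional branch is meant to exploit.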
Pages: 45520-45526
Number of pages: 7