Learning Regular Expressions for Interpretable Medical Text Classification Using a Pool-based Simulated Annealing Approach

Cited by: 0
Authors
Tu, Chaofan [1]
Cui, Menglin [1]
Affiliations
[1] Univ Nottingham, Sch Comp Sci, Ningbo, Peoples R China
Keywords
simulated annealing; regular expression; medical text classification
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a rule-based engine composed of high-quality and interpretable regular expressions for medical text classification. The regular expressions are auto-generated by a constructive heuristic method and optimized using a Pool-based Simulated Annealing (PSA) approach. Although existing Deep Neural Network (DNN) methods achieve strong performance in most Natural Language Processing (NLP) applications, their solutions are regarded as uninterpretable "black boxes" by humans. Rule-based methods are therefore often introduced when interpretable solutions are needed, especially in the medical field. However, constructing regular expressions by hand can be extremely labor-intensive for large data sets. This research aims to reduce the manual effort while maintaining high-quality solutions. The Pool-based Simulated Annealing method is proposed to automatically optimize the performance of machine-generated regular expressions without human intervention. The proposed method is tested on real-life data provided by one of China's largest online medical platforms. Experimental results show that the proposed PSA method improves the performance of the initial machine-generated regular expressions further than other meta-heuristics such as Genetic Programming. We also believe that the proposed method can serve as a vital complementary tool for existing machine learning approaches in text classification applications when high levels of interpretability of the solutions are required.
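The abstract names the technique but does not give its neighbourhood moves, pool policy, or cooling schedule, so the following is only a generic sketch of how a pool of candidate regular expressions might be refined by simulated annealing, not the authors' PSA. Everything in it is an assumption made for illustration: the names f1_score, mutate, pool_sa, SAMPLES and VOCABULARY are hypothetical, each regular expression is simplified to an alternation of keywords, fitness is F1 on a toy labelled set, and acceptance uses the standard Metropolis rule with geometric cooling.

```python
import math
import random
import re

# Toy labelled data (hypothetical): True means the text belongs to the medical class.
SAMPLES = [
    ("patient has fever", True),
    ("dry cough for days", True),
    ("fever and cough", True),
    ("refund my invoice", False),
    ("invoice is late", False),
]
VOCABULARY = ["fever", "cough", "invoice", "refund"]


def f1_score(keywords, samples):
    """F1 of the rule 'text matches any keyword' on (text, label) samples."""
    if not keywords:
        return 0.0
    pattern = re.compile("|".join(re.escape(k) for k in sorted(keywords)))
    tp = fp = fn = 0
    for text, label in samples:
        hit = pattern.search(text) is not None
        if hit and label:
            tp += 1
        elif hit and not label:
            fp += 1
        elif label:
            fn += 1
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def mutate(keywords, vocabulary, rng):
    """Neighbour move (assumed): toggle one vocabulary word in or out."""
    return keywords ^ {rng.choice(vocabulary)}


def pool_sa(samples, vocabulary, pool_size=4, steps=300,
            t0=0.5, cooling=0.99, seed=0):
    """Anneal a pool of keyword sets; return the best set seen and its F1."""
    rng = random.Random(seed)
    pool = [frozenset({rng.choice(vocabulary)}) for _ in range(pool_size)]
    best = max(pool, key=lambda k: f1_score(k, samples))
    temperature = t0
    for _ in range(steps):
        i = rng.randrange(pool_size)          # pick one pool member to perturb
        current = pool[i]
        candidate = mutate(current, vocabulary, rng)
        delta = f1_score(candidate, samples) - f1_score(current, samples)
        # Metropolis rule: always accept non-worsening moves, sometimes worse ones.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            pool[i] = candidate
            if f1_score(candidate, samples) > f1_score(best, samples):
                best = candidate
        temperature *= cooling                # geometric cooling (assumed)
    return best, f1_score(best, samples)


if __name__ == "__main__":
    best, score = pool_sa(SAMPLES, VOCABULARY)
    print("|".join(sorted(best)), score)
```

Because the result is a plain keyword alternation, the selected rule can be read and audited directly, which is the interpretability property the abstract emphasises.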
Pages: 7