Subsequence and distant supervision based active learning for relation extraction of Chinese medical texts

Cited by: 1
Authors
Ye, Qi [1 ]
Cai, Tingting [1 ]
Ji, Xiang [1 ]
Ruan, Tong [1 ]
Zheng, Hong [1 ]
Institutions
[1] East China Univ Sci & Technol, Sch Informat Sci & Technol, Shanghai 200237, Peoples R China
Keywords
Active learning; Sequence tagging; Relation extraction; Distant supervision; Medical texts;
DOI
10.1186/s12911-023-02127-1
CLC number: R-058
Abstract
In recent years, relation extraction from unstructured texts has become an important task in medical research. However, relation extraction requires a large labeled corpus, and manually annotating sequences is time-consuming and expensive. Efficient and economical annotation methods are therefore needed to ensure the performance of relation extraction. This paper proposes an active learning method based on subsequences and distant supervision. Instead of the full sentences used in traditional active learning, the method selects information-rich subsequences as the sampling units for annotation. In addition, it stores the labeled subsequence texts and their corresponding labels in a continuously updated dictionary, and pre-labels the unlabeled set through text matching, following the idea of distant supervision. Finally, the method combines a Chinese-RoBERTa-CRF model for relation extraction from Chinese medical texts. Experiments on the CMeIE dataset show that the method achieves the best performance among existing methods, with a best F1 score of 55.96% across the different sampling strategies.
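The dictionary-based pre-labeling step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the dictionary maps tokenized subsequences to their BIO tag sequences, and pre-labels an unlabeled sentence by greedy longest-match text matching (function and variable names are hypothetical).

```python
def pre_label(tokens, label_dict, outside_tag="O"):
    """Pre-label a token sequence by matching known labeled subsequences.

    label_dict maps tuples of tokens to their tag sequences, e.g.
    ("二甲", "双胍") -> ["B-Drug", "I-Drug"]. Unmatched tokens keep
    the outside tag.
    """
    n = len(tokens)
    tags = [outside_tag] * n
    i = 0
    while i < n:
        matched = False
        # Greedily try the longest subsequence starting at position i.
        for j in range(n, i, -1):
            sub = tuple(tokens[i:j])
            if sub in label_dict:
                tags[i:j] = label_dict[sub]
                i = j
                matched = True
                break
        if not matched:
            i += 1
    return tags


# Toy dictionary built from previously annotated subsequences.
label_dict = {
    ("糖尿病",): ["B-Disease"],
    ("二甲", "双胍"): ["B-Drug", "I-Drug"],
}
```

In the full method, these pre-labels would serve as a silver annotation for the unlabeled pool, reducing the amount of text the oracle must label by hand.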
Pages: 12
Related papers
50 records
  • [41] Bootstrapped Multi-level Distant Supervision for Relation Extraction
    He, Ying
    Li, Zhixu
    Liu, Guanfeng
    Cao, Fangfei
    Chen, Zhigang
    Wang, Ke
    Ma, Jie
    WEB INFORMATION SYSTEMS ENGINEERING, WISE 2018, PT I, 2018, 11233 : 408 - 423
  • [42] Distant Supervision for Relation Extraction with Hierarchical Attention and Entity Descriptions
    She, Heng
    Wu, Bin
    Wang, Bai
    Chi, Renjun
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018,
  • [43] Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction
    Sainz, Oscar
    Lopez de Lacalle, Oier
    Aldabe, Itziar
    Maritxalar, Montse
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2020), 2020, : 2213 - 2222
  • [44] GAN Driven Semi-distant Supervision for Relation Extraction
    Li, Pengshuai
    Zhang, Xinsong
    Jia, Weijia
    Zhao, Hai
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 3026 - 3035
  • [45] Improving Open Information Extraction with Distant Supervision Learning
    Jiabao Han
    Hongzhi Wang
    Neural Processing Letters, 2021, 53 : 3287 - 3306
  • [46] Distant supervision for treatment relation extraction by leveraging MeSH subheadings
    Tran, Tung
    Kavuluru, Ramakanth
    ARTIFICIAL INTELLIGENCE IN MEDICINE, 2019, 98 : 18 - 26
  • [47] Distant Supervision for Relation Extraction Using Ontology Class Hierarchy-Based Features
    Assis, Pedro H. R.
    Casanova, Marco A.
    SEMANTIC WEB: ESWC 2014 SATELLITE EVENTS, 2014, 8798 : 467 - 471
  • [48] Multi-language Person Social Relation Extraction Model Based on Distant Supervision
    Huang, Yangchen
    Jia, Yan
    Huang, Jiuming
    He, Zhonghe
    2018 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2018, : 368 - 374
  • [49] Sentence-level Distant Supervision Relation Extraction based on Dynamic Soft Labels
    Hou, Dejun
    Zhang, Zefeng
    Zhao, Mankun
    Zhang, Wenbin
    Zhao, Yue
    Yu, Jian
    PROCEEDINGS OF THE 2024 27TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN, CSCWD 2024, 2024, : 3194 - 3199
  • [50] Hybrid attention-based transformer block model for distant supervision relation extraction
    Xiao, Yan
    Jin, Yaochu
    Cheng, Ran
    Hao, Kuangrong
    NEUROCOMPUTING, 2022, 470 : 29 - 39