Subsequence and distant supervision based active learning for relation extraction of Chinese medical texts

Cited by: 1
Authors
Ye, Qi [1]
Cai, Tingting [1]
Ji, Xiang [1]
Ruan, Tong [1]
Zheng, Hong [1]
Affiliations
[1] East China University of Science and Technology, School of Information Science and Technology, Shanghai 200237, People's Republic of China
Keywords
Active learning; Sequence tagging; Relation extraction; Distant supervision; Medical texts
DOI
10.1186/s12911-023-02127-1
Chinese Library Classification number
R-058
Abstract
In recent years, relation extraction from unstructured texts has become an important task in medical research. However, relation extraction requires a large labeled corpus, and manually annotating sequences is time-consuming and expensive. Efficient and economical methods for annotating sequences are therefore needed to ensure the performance of relation extraction. This paper proposes an active learning method based on subsequences and distant supervision. Instead of the full sentences used in traditional active learning, the method selects information-rich subsequences as the sampling unit for annotation. Additionally, the method saves the labeled subsequence texts and their corresponding labels in a dictionary that is continuously updated and maintained, and pre-labels the unlabeled set through text matching, following the idea of distant supervision. Finally, the method is combined with a Chinese-RoBERTa-CRF model for relation extraction in Chinese medical texts. Experimental results on the CMeIE dataset show that the method achieves the best performance compared with existing methods. The best F1 value obtained among the different sampling strategies is 55.96%.
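The two ideas in the abstract, subsequence sampling and dictionary-based pre-labeling, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The per-token confidence scores, the fixed window width, the lowest-summed-confidence selection rule, the exact-match lookup, the tag names (`B-DIS`, `I-DIS`, `O`), and the function names `select_subsequence` and `pre_label` are all assumptions made for illustration; the abstract does not specify the sampling strategies or the matching rule.

```python
from typing import Dict, List, Sequence, Tuple

Labels = List[str]
# Hypothetical dictionary layout: annotated token subsequence -> its tags.
LabelDict = Dict[Tuple[str, ...], Labels]


def select_subsequence(confidences: Sequence[float], width: int) -> Tuple[int, int]:
    """Return (start, end) of the `width`-token window with the lowest summed
    model confidence. This is one possible "information-rich" criterion,
    assumed here; ties resolve to the earliest window."""
    if not 0 < width <= len(confidences):
        raise ValueError("width must be between 1 and the sequence length")
    starts = range(len(confidences) - width + 1)
    best = min(starts, key=lambda s: sum(confidences[s:s + width]))
    return best, best + width


def pre_label(tokens: Sequence[str], dictionary: LabelDict, default: str = "O") -> Labels:
    """Pre-label tokens by exact matching against stored subsequences.
    Longer entries are tried first and earlier matches are never overwritten.
    Assumes each entry stores exactly one tag per token."""
    labels = [default] * len(tokens)
    taken = [False] * len(tokens)
    for key in sorted(dictionary, key=len, reverse=True):
        n = len(key)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == key and not any(taken[i:i + n]):
                labels[i:i + n] = dictionary[key]
                taken[i:i + n] = [True] * n
    return labels


# One active-learning round on made-up data.
tokens = list("患者有糖尿病史")
confidences = [0.9, 0.9, 0.8, 0.3, 0.2, 0.4, 0.7]  # invented scores
start, end = select_subsequence(confidences, width=3)

# An annotator labels only tokens[start:end]; the result is stored for reuse.
dictionary: LabelDict = {tuple(tokens[start:end]): ["B-DIS", "I-DIS", "I-DIS"]}

# Unlabeled sentences are then pre-labeled by text matching.
print(pre_label(list("糖尿病可致肾病"), dictionary))
```

Exact matching keeps the sketch short, but it also shows why distant supervision is noisy: a stored subsequence receives the same tags wherever it reappears, regardless of context.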
Pages: 12