A lexical and syntactic analysis system for Chinese electronic medical record

Authors
Jiang, Zhipeng [1]
Dai, Xue [1]
Guan, Yi [1]
Zhao, Fangfang [1]
Affiliations
[1] Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Keywords
CEMR; Chinese word segmentation; Full parsing; Part-of-speech tagging; Shallow parsing
DOI
10.14257/ijunesst.2016.9.9.29
Abstract
Lexical and syntactic analysis, including word segmentation, part-of-speech (POS) tagging, shallow parsing and full parsing, is essential for medical language processing (MLP). However, full parsing, and even shallow parsing and POS tagging, have not been studied for Chinese electronic medical records (CEMRs) because no annotated CEMR corpus has been available. In this paper, we build a corpus of 5,024 CEMR sentences annotated with word segmentation, POS tags and phrase tags; 2,553 of these sentences are also annotated with full parse trees. Inter-annotator agreement is 97.56% for Chinese word segmentation, 93.34% for POS tagging, 96.5% for shallow parsing and 91.22% for full parsing. We develop a lexical and syntactic analysis system for CEMRs and evaluate it on this corpus. For word segmentation and POS tagging, we propose a joint model with a transformation-based error-driven model as a correction post-processing step to alleviate error accumulation; the F1-scores for word segmentation and POS tagging are 94.39% and 93.2%, respectively. For shallow parsing, we develop a model under a group learning framework that we propose; it enriches word features with word embeddings learned from a large set of unlabeled CEMRs and achieves an F1-score of 96.3%. Finally, we present a state-of-the-art full parser that combines the Berkeley parser and the Stanford parser and outperforms the best single parser by 3.68%. The evaluation results show that statistical machine learning models benefit substantially from the annotated CEMR corpus. This work lays the foundation for applying natural language processing (NLP) technologies to CEMRs. © 2016 SERSC.
Pages: 305-318