A lexical and syntactic analysis system for Chinese electronic medical record

Authors
Jiang, Zhipeng [1]
Dai, Xue [1]
Guan, Yi [1]
Zhao, Fangfang [1]
Affiliations
[1] Department of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Keywords
CEMR; Chinese word segmentation; Full parsing; Part-of-speech tagging; Shallow parsing
DOI
10.14257/ijunesst.2016.9.9.29
Abstract
Lexical and syntactic analysis, including word segmentation, part-of-speech (POS) tagging, shallow parsing and full parsing, is essential for medical language processing (MLP). However, research on full parsing, and even on shallow parsing and POS tagging, for Chinese electronic medical records (CEMR) has not been carried out because no annotated CEMR corpus exists. In this paper, we build a corpus of 5,024 sentences from CEMR annotated with word segmentation, POS tags and phrase tags; of these, 2,553 are annotated as full parse trees. Inter-annotator agreement is 97.56% for Chinese word segmentation, 93.34% for POS tagging, 96.5% for shallow parsing and 91.22% for full parsing. A lexical and syntactic analysis system for CEMR is developed and evaluated on this corpus. Among its components, we propose a joint model for word segmentation and POS tagging, with a transformation-based error-driven model as a correction post-processing step to alleviate error accumulation; the F1-scores for word segmentation and POS tagging are 94.39% and 93.2%, respectively. We also develop a shallow parsing model under a group learning framework that we propose; it enriches word features with word embeddings learned from large unlabeled CEMRs and achieves an F1-score of 96.3%. Finally, we present a state-of-the-art full parser that combines the Berkeley parser and the Stanford parser and outperforms the best single parser by 3.68%. The evaluation results show that statistical machine learning models benefit substantially from the annotated CEMR. This work lays the foundation for applying natural language processing (NLP) technologies to CEMR. © 2016 SERSC.
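The abstract reports word segmentation quality as an F1-score but does not describe the scoring procedure. The sketch below shows the common span-matching definition, in which a predicted word counts as correct only if both its start and end character offsets match a gold word. This is an assumption about how such scores are typically computed, not the authors' evaluation script; the function names `to_spans` and `segmentation_prf` and the example sentence are hypothetical.

```python
from typing import List, Set, Tuple


def to_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Convert a segmented sentence into a set of (start, end) character spans."""
    spans = set()
    start = 0
    for word in words:
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def segmentation_prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one sentence under span matching.

    Assumes gold and pred segment the same underlying character string.
    """
    gold_spans = to_spans(gold)
    pred_spans = to_spans(pred)
    correct = len(gold_spans & pred_spans)  # words with identical boundaries
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the prediction merges the last two gold words.
gold_words = ["患者", "无", "发热"]
pred_words = ["患者", "无发热"]
p, r, f = segmentation_prf(gold_words, pred_words)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In practice the counts of correct, predicted and gold words would be summed over the whole test set before computing precision, recall and F1, rather than averaging per-sentence scores.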
Pages: 305-318