A lexical and syntactic analysis system for Chinese electronic medical record

Authors
Jiang, Zhipeng [1]
Dai, Xue [1]
Guan, Yi [1]
Zhao, Fangfang [1]
Affiliations
[1] Department of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
Keywords
CEMR - Chinese word segmentation - Full parsing - Part of speech tagging - Shallow parsing
DOI
10.14257/ijunesst.2016.9.9.29
Abstract
Lexical and syntactic analysis, including word segmentation, part-of-speech (POS) tagging, shallow parsing and full parsing, is essential for medical language processing (MLP). However, research on full parsing, and even on shallow parsing and POS tagging, for Chinese electronic medical records (CEMRs) has not been carried out because annotated CEMR corpora are lacking. In this paper, we build a corpus of 5,024 CEMR sentences annotated with word segmentation, POS tags and phrase tags; 2,553 of them are also annotated with full parse trees. Inter-annotator agreement is 97.56% for Chinese word segmentation, 93.34% for POS tagging, 96.5% for shallow parsing and 91.22% for full parsing. A lexical and syntactic analysis system for CEMR is developed and evaluated on this corpus. Among its components, we propose a joint model for word segmentation and POS tagging, with a transformation-based error-driven model as a correction post-processing step to alleviate error accumulation; the F1-scores for word segmentation and POS tagging are 94.39% and 93.2%, respectively. We also develop a shallow parsing model under a group learning framework that we propose; it enriches word features with word embeddings learned from large unlabeled CEMRs and achieves an F1-score of 96.3%. Finally, we present a state-of-the-art full parser that combines the Berkeley parser and the Stanford parser and outperforms the best single parser by 3.68%. The evaluation results show that statistical machine learning models benefit substantially from the annotated CEMR corpus. This work lays the foundation for natural language processing (NLP) technologies applied to CEMR. © 2016 SERSC.
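The abstract reports F1-scores for word segmentation without describing the scoring procedure. As an illustration only, the sketch below shows the span-based F1 commonly used for Chinese word segmentation: a predicted word counts as correct only if both of its character boundaries match the gold standard. The function names and the example sentence are hypothetical and do not come from the paper.

```python
def to_spans(words):
    """Convert a word sequence into a set of (start, end) character offsets."""
    spans, start = set(), 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def segmentation_f1(gold_words, pred_words):
    """Span-based F1: a predicted word is correct if both boundaries match gold."""
    gold, pred = to_spans(gold_words), to_spans(pred_words)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example (not from the paper's corpus).
gold = ["患者", "无", "发热"]   # assumed gold segmentation
pred = ["患者", "无发", "热"]   # assumed system output with one boundary error
score = segmentation_f1(gold, pred)
```

A single misplaced boundary makes two predicted words wrong at once, which is one reason segmentation errors propagate into downstream POS tagging and parsing.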
Pages: 305 - 318