Annotation Scheme and Corpus Construction for Cardiovascular Diseases Risk Factors From Chinese Electronic Medical Records

Authors
Su J. [1]
He B. [1]
Wu H. [2]
Yang J.-F. [3]
Guan Y. [1]
Jiang J.-C. [1]
Wang H.-Z. [1]
Yu Q.-B. [2]
Affiliations
[1] Language Technology Research Center, School of Computer Science and Technology, Harbin Institute of Technology, Harbin
[2] The 2nd Affiliated Hospital of Harbin Medical University, Harbin
[3] School of Software, Harbin University of Science and Technology, Harbin
Funding
National Natural Science Foundation of China
Keywords
Cardiovascular diseases (CVDs); Chinese electronic medical records (CEMRs); Corpus annotation; Natural language processing; Risk factors
DOI
10.16383/j.aas.2018.c170206
Abstract
This paper discusses the annotation of risk factors for cardiovascular diseases (CVDs), and related information, in Chinese electronic medical records (CEMRs), and proposes an annotation scheme for CVD risk factors suited to the content characteristics of CEMRs. It also constructs the first annotated corpus of CVD risk factors in the field of Chinese health information processing. Copyright © 2019 Acta Automatica Sinica. All rights reserved.
Pages: 420-426
Number of pages: 6