Automatic extraction of angiogenesis bioprocess from text

Cited by: 11
Authors
Wang, Xinglong [1 ,2 ]
McKendrick, Iain [3 ]
Barrett, Ian [3 ]
Dix, Ian [3 ]
French, Tim [3 ]
Tsujii, Jun'ichi [4 ]
Ananiadou, Sophia [1 ,2 ]
Affiliations
[1] Univ Manchester, Natl Ctr Text Min, Manchester, Lancs, England
[2] Univ Manchester, Sch Comp Sci, Manchester, Lancs, England
[3] AstraZeneca, Alderley Pk, England
[4] Microsoft Res Asia, Beijing, Peoples R China
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC);
Keywords
PROTEIN; MODELS;
DOI
10.1093/bioinformatics/btr460
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Understanding key biological processes (bioprocesses) and their relationships with constituent biological entities and pharmaceutical agents is crucial for drug design and discovery. One way to harvest such information is to search the literature. However, bioprocesses are difficult to capture because they can appear in text under a wide variety of expressions. Moreover, a bioprocess is often composed of a series of bioevents, where a bioevent denotes changes to one cell or a group of cells involved in the bioprocess. Such bioevents are often used to refer to bioprocesses in text, and current techniques, which rely solely on specialized lexicons, struggle to find them.
Results: This article presents a range of methods for finding bioprocess terms and events. To facilitate the study, we built a gold standard corpus in which terms and events related to angiogenesis, a key biological process in which new blood vessels grow, were annotated. Statistics of the annotated corpus revealed that over 36% of the text expressions that referred to angiogenesis appeared as events. The proposed methods employed, respectively, domain-specific vocabularies, a manually annotated corpus and unstructured domain-specific documents. Evaluation results showed that, while a supervised machine-learning model yielded the best precision, recall and F1 scores, the other methods achieved reasonable performance and cost less to develop.
Pages: 2730-2737
Page count: 8