Automatic extraction of angiogenesis bioprocess from text

Cited by: 11
Authors
Wang, Xinglong [1, 2]
McKendrick, Iain [3]
Barrett, Ian [3]
Dix, Ian [3]
French, Tim [3]
Tsujii, Jun'ichi [4]
Ananiadou, Sophia [1, 2]
Affiliations
[1] Univ Manchester, Natl Ctr Text Min, Manchester, Lancs, England
[2] Univ Manchester, Sch Comp Sci, Manchester, Lancs, England
[3] AstraZeneca, Alderley Pk, England
[4] Microsoft Res Asia, Beijing, Peoples R China
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK
Keywords
PROTEIN; MODELS;
DOI
10.1093/bioinformatics/btr460
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Understanding key biological processes (bioprocesses) and their relationships with constituent biological entities and pharmaceutical agents is crucial for drug design and discovery. One way to harvest such information is to search the literature. However, bioprocesses are difficult to capture because they can appear in text under a variety of expressions. Moreover, a bioprocess is often composed of a series of bioevents, where a bioevent denotes changes to one or a group of cells involved in the bioprocess. Such bioevents are often used to refer to bioprocesses in text, and current techniques, which rely solely on specialized lexicons, struggle to find them.

Results: This article presents a range of methods for finding bioprocess terms and events. To facilitate the study, we built a gold standard corpus in which terms and events related to angiogenesis, a key biological process of the growth of new blood vessels, were annotated. Statistics of the annotated corpus revealed that over 36% of the text expressions that referred to angiogenesis appeared as events. The proposed methods employed, respectively, domain-specific vocabularies, a manually annotated corpus and unstructured domain-specific documents. Evaluation results showed that, while a supervised machine-learning model yielded the best precision, recall and F1 scores, the other methods achieved reasonable performance at a lower development cost.
Pages: 2730-2737
Number of pages: 8