Comparative analysis of five protein-protein interaction corpora

Cited by: 113
Authors
Pyysalo, Sampo [1]
Airola, Antti
Heimonen, Juho
Bjorne, Jari
Ginter, Filip
Salakoski, Tapio
Affiliations
[1] Univ Turku, TUCS, FIN-20520 Turku, Finland
Keywords
PubMed Abstract; Entity Annotation; Entity Pair; Corpus Annotation; Annotate Entity;
DOI
10.1186/1471-2105-9-S3-S6
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: Growing interest in the application of natural language processing methods to biomedical text has led to an increasing number of corpora and methods targeting protein-protein interaction (PPI) extraction. However, there is no general consensus regarding PPI annotation and consequently resources are largely incompatible and methods are difficult to evaluate.
Results: We present the first comparative evaluation of the diverse PPI corpora, performing quantitative evaluation using two separate information extraction methods as well as detailed statistical and qualitative analyses of their properties. For the evaluation, we unify the corpus PPI annotations to a shared level of information, consisting of undirected, untyped binary interactions of non-static types with no identification of the words specifying the interaction, no negations, and no interaction certainty. We find that the F-score performance of a state-of-the-art PPI extraction method varies on average 19 percentage units and in some cases over 30 percentage units between the different evaluated corpora. The differences stemming from the choice of corpus can thus be substantially larger than differences between the performance of PPI extraction methods, which suggests definite limits on the ability to compare methods evaluated on different resources. We analyse a number of potential sources for these differences and identify factors explaining approximately half of the variance. We further suggest ways in which the difficulty of the PPI extraction tasks codified by different corpora can be determined to advance comparability. Our analysis also identifies points of agreement and disagreement in PPI corpus annotation that are rarely explicitly stated by the authors of the corpora.
Conclusions: Our comparative analysis uncovers key similarities and differences between the diverse PPI corpora, thus taking an important step towards standardization. In the course of this study we have created a major practical contribution in converting the corpora into a shared format. The conversion software is freely available at http://mars.cs.utu.fi/PPICorpora.
Pages: 11