Imbalanced Text Classification on Host Pathogen Protein-Protein Interaction Documents

Cited by: 3
Authors
Xu, Guixian [1 ,2 ]
Niu, Zhendong [2 ]
Gao, Xu [4 ]
Liu, Hongfang [3 ]
Affiliations
[1] Minzu Univ, Coll Informat Engn, Beijing, Peoples R China
[2] Beijing Inst Technol, Coll Comp Sci, Beijing, Peoples R China
[3] Georgetown Univ, Med Ctr, Dept Bio3, Washington, DC 20007 USA
[4] North China Grid Co Ltd, Beijing, Peoples R China
Source
2010 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING (ICCAE 2010), VOL 1 | 2010
Funding
National Science Foundation (USA);
Keywords
imbalanced text classification; machine learning; protein-protein interaction;
DOI
10.1109/ICCAE.2010.5451921
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Protein-protein interactions (PPIs) are important in understanding the fundamental processes governing cell biology. However, a large number of scientific findings about PPIs are buried in the growing volume of biomedical literature. Document classification systems have been shown to have the potential to accelerate the curation process by retrieving PPI-related documents. However, it is usually the case that only a small number of positive documents can be obtained manually or from PPI knowledge bases with literature-based evidence, whereas a large number of negative documents are available. In this paper, we investigate the effects of feature selection, feature weighting, and the kernel function of Support Vector Machines (SVMs) on imbalanced two-class classification, based on 1360 host-pathogen protein-protein interaction documents. The results show that a suitable feature weighting approach is an important factor in improving classification performance. Adjusting the cost-sensitive parameter of an SVM with a radial basis function (RBF) kernel can decrease the minority-class misclassification ratio and increase classification accuracy on imbalanced document classification. An automated classification system to identify MEDLINE abstracts referring to host-pathogen protein-protein interactions can be developed based on the experiment.
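As an illustration of the cost-sensitive idea mentioned in the abstract, the following minimal Python sketch computes per-class misclassification weights that are inversely proportional to class frequency. This is a common heuristic, not necessarily the setting used by the authors. The function name `balanced_class_weights` and the 10/90 class split are hypothetical; the abstract does not report the actual class ratio.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class: n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so errors on them cost more
    when the weights scale a classifier's penalty term.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy corpus: 10 positive (PPI-related) and 90 negative documents.
labels = [1] * 10 + [0] * 90
weights = balanced_class_weights(labels)
print(weights)
```

Weights of this kind could be passed to an SVM implementation that supports per-class penalties, so that misclassifying a minority-class document is penalised more heavily than misclassifying a majority-class one.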
Pages: 418 - 422
Page count: 5