Pathway activity inference for multiclass disease classification through a mathematical programming optimisation framework

Cited by: 8
Authors
Yang, Lingjian [1]
Ainali, Chrysanthi [2]
Tsoka, Sophia [2]
Papageorgiou, Lazaros G. [1]
Affiliations
[1] UCL, Ctr Proc Syst Engn, Dept Chem Engn, London WC1E 7JE, England
[2] Kings Coll London, Sch Nat & Math Sci, Dept Informat, London WC2R 2LS, England
Source
BMC Bioinformatics | 2014 / Vol. 15
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Disease classification; Microarray; Pathway activity; Mathematical programming; Optimisation; BREAST-CANCER PATIENTS; BIOMARKER IDENTIFICATION; EXPRESSION PROFILES; FEATURE-SELECTION; GENE; MICROARRAY; ALGORITHM; METASTASIS; PREDICTION; SIGNATURE;
DOI
10.1186/s12859-014-0390-2
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Applying machine learning methods to microarray gene expression profiles for disease classification is a popular way to derive biomarkers, i.e. sets of genes that can predict disease state or outcome. Traditional approaches, in which the expression of each gene is treated independently, suffer from low prediction accuracy and are difficult to interpret biologically. Current research efforts focus on integrating information on protein interactions, through biochemical pathway datasets, with expression profiles to propose pathway-based classifiers that can enhance disease diagnosis and prognosis. Because most pathway activity inference methods in the literature are either unsupervised or applied only to two-class datasets, there is scope for new methodologies that address these limitations.
Results: A supervised multiclass pathway activity inference method using optimisation techniques is reported. For each pathway expression dataset, the expression patterns of its constituent genes are summarised into one composite feature, termed pathway activity, and a novel mathematical programming model is proposed to infer this feature as a weighted linear summation of the expression of its constituent genes. Gene weights are determined by the optimisation model so that the resulting pathway activity has optimal discriminative power with regard to disease phenotypes. Classification is then performed on the resulting low-dimensional pathway activity profile.
Conclusions: The model was evaluated on a variety of published gene expression profiles covering different types of disease. We show that it not only improves classification accuracy but also performs well on multiclass disease datasets, a limitation of other approaches from the literature. A desirable feature of the model is that the maximum number of genes participating in the determination of pathway activity can be pre-specified by the user. Overall, this work highlights the potential of building pathway-based multi-phenotype classifiers for accurate disease diagnosis and prognosis.
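The weighted linear summation described in the abstract can be illustrated with a small Python sketch. It does not reproduce the paper's mathematical programming model: the weights here come from a simple stand-in heuristic (a per-gene ratio of between-class to within-class variation), and the function names, the toy data, and the `max_genes` parameter are all hypothetical names chosen for illustration. The sketch only shows how one pathway's genes collapse into a single activity value per sample while at most a user-chosen number of genes receive non-zero weight.

```python
import numpy as np


def heuristic_weights(expr, labels, max_genes):
    """Stand-in for the optimisation step (NOT the paper's model).

    Scores each gene by between-class over within-class sum of squares,
    keeps the top `max_genes` genes, and normalises their scores to sum to 1.
    """
    overall_mean = expr.mean(axis=0)
    between = np.zeros(expr.shape[1])
    within = np.zeros(expr.shape[1])
    for c in np.unique(labels):
        block = expr[labels == c]          # samples of class c
        class_mean = block.mean(axis=0)
        between += block.shape[0] * (class_mean - overall_mean) ** 2
        within += ((block - class_mean) ** 2).sum(axis=0)
    score = between / (within + 1e-12)     # small constant avoids division by zero

    weights = np.zeros(expr.shape[1])
    keep = np.argsort(score)[-max_genes:]  # indices of the highest-scoring genes
    weights[keep] = score[keep] / score[keep].sum()
    return weights


def pathway_activity(expr, weights):
    """Pathway activity: weighted linear summation of gene expression per sample."""
    return expr @ weights


# Toy pathway: 30 samples (3 classes of 10), 5 genes; genes 0 and 1 carry signal.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 10)
expr = rng.normal(size=(30, 5))
expr[:, 0] += 3 * labels
expr[:, 1] += 2 * labels

w = heuristic_weights(expr, labels, max_genes=2)
activity = pathway_activity(expr, w)       # one composite feature per sample
```

In a full pipeline, one such activity vector would be computed per pathway, and a classifier would then be trained on the resulting low-dimensional activity matrix rather than on the individual genes.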
Pages: 14