Strategies to enable large-scale proteomics for reproducible research

Cited by: 81
Authors
Poulos, Rebecca C. [1]
Hains, Peter G. [1]
Shah, Rohan [1]
Lucas, Natasha [1]
Xavier, Dylan [1]
Manda, Srikanth S. [1]
Anees, Asim [1]
Koh, Jennifer M. S. [1]
Mahboob, Sadia [1]
Wittman, Max [1]
Williams, Steven G. [1]
Sykes, Erin K. [1]
Hecker, Michael [1]
Dausmann, Michael [1]
Wouters, Merridee A. [1]
Ashman, Keith [2]
Yang, Jean [3]
Wild, Peter J. [4,5]
deFazio, Anna [6,7,8]
Balleine, Rosemary L. [1]
Tully, Brett [1]
Aebersold, Ruedi [9,10]
Speed, Terence P. [11,12]
Liu, Yansheng [13,14]
Reddel, Roger R. [1]
Robinson, Phillip J. [1]
Zhong, Qing [1]
Affiliations
[1] Univ Sydney, Fac Med & Hlth, Childrens Med Res Inst, ProCan, Westmead, NSW, Australia
[2] SCIEX Ltd, 2 Gilda Court, Mulgrave, Vic, Australia
[3] Univ Sydney, Sch Math & Stat, Sydney, NSW, Australia
[4] Univ Hosp Frankfurt, Dr Senckenberg Inst Pathol, Frankfurt, Germany
[5] Univ Hosp Zurich, Dept Pathol & Mol Pathol, Zurich, Switzerland
[6] Westmead Inst Med Res, Ctr Canc Res, Westmead, NSW, Australia
[7] Univ Sydney, Fac Med & Hlth, Westmead, NSW, Australia
[8] Westmead Hosp, Dept Gynaecol Oncol, Westmead, NSW, Australia
[9] Swiss Fed Inst Technol, Inst Mol Syst Biol, Dept Biol, Zurich, Switzerland
[10] Univ Zurich, Fac Sci, Zurich, Switzerland
[11] Walter & Eliza Hall Inst Med Res, Bioinformat Div, Parkville, Vic, Australia
[12] Univ Melbourne, Dept Math & Stat, Melbourne, Vic, Australia
[13] Yale Univ, Sch Med, Dept Pharmacol, New Haven, CT 06510 USA
[14] Yale Univ, Yale Canc Biol Inst, West Haven, CT USA
Funding
National Health and Medical Research Council of Australia; Medical Research Council (UK);
Keywords
PROTEOGENOMIC CHARACTERIZATION; UNWANTED VARIATION; MASS-SPECTROMETRY; PROTEINS; PEPTIDE; TANDEM;
DOI
10.1038/s41467-020-17641-3
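Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general];
Discipline classification code
07; 0710; 09;
Abstract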
Reproducible research is the bedrock of experimental science. To enable the deployment of large-scale proteomics, we assess the reproducibility of mass spectrometry (MS) over time and across instruments and develop computational methods for improving quantitative accuracy. We perform 1560 data-independent acquisition (DIA)-MS runs of eight samples containing known proportions of ovarian and prostate cancer tissue and yeast, or control HEK293T cells. Replicates are run on six mass spectrometers operating continuously with varying maintenance schedules over four months, interspersed with approximately 5000 other runs. We utilise negative controls and replicates to remove unwanted variation and enhance biological signal, outperforming existing methods. We also design a method for reducing missing values. Integrating these computational modules into a pipeline (ProNorM), we mitigate variation among instruments over time and accurately predict tissue proportions. We demonstrate how to improve the quantitative analysis of large-scale DIA-MS data, providing a pathway toward clinical proteomics.
Pages: 13
Journal: Nature Communications, 11