Scientific machine learning benchmarks

Cited by: 61
Authors
Thiyagalingam, Jeyan [1]
Shankar, Mallikarjun [2]
Fox, Geoffrey [3]
Hey, Tony [1]
Affiliations
[1] Sci & Technol Facil Council, Rutherford Appleton Lab, Harwell Campus, Didcot, Oxon, England
[2] Oak Ridge Natl Lab, Oak Ridge, TN USA
[3] Univ Virginia, Comp Sci & Biocomplex Inst, Charlottesville, VA USA
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
DOI
10.1038/s42254-022-00441-7
Chinese Library Classification (CLC) number
O59 [Applied physics];
Abstract
Finding the most appropriate machine learning algorithm for the analysis of any given scientific dataset is currently challenging, but new machine learning benchmarks for science are being developed to help. Deep learning has transformed the use of machine learning technologies for the analysis of large experimental datasets. In science, such datasets are typically generated by large-scale experimental facilities, and machine learning focuses on the identification of patterns, trends and anomalies to extract meaningful scientific insights from the data. In upcoming experimental facilities, such as the Extreme Photonics Application Centre (EPAC) in the UK or the international Square Kilometre Array (SKA), the rate of data generation and the scale of data volumes will increasingly require the use of more automated data analysis. However, at present, identifying the most appropriate machine learning algorithm for the analysis of any given scientific dataset is a challenge due to the potential applicability of many different machine learning frameworks, computer architectures and machine learning models. Historically, for modelling and simulation on high-performance computing systems, these issues have been addressed through benchmarking computer applications, algorithms and architectures. Extending such a benchmarking approach and identifying metrics for the application of machine learning methods to open, curated scientific datasets is a new challenge for both scientists and computer scientists. Here, we introduce the concept of machine learning benchmarks for science and review existing approaches. As an example, we describe the SciMLBench suite of scientific machine learning benchmarks.
Pages: 413-420
Page count: 8