PMLB v1.0: an open-source dataset collection for benchmarking machine learning methods

Cited by: 14
Authors
Romano, Joseph D. [1, 2]
Le, Trang T. [1]
La Cava, William [1]
Gregg, John T. [1]
Goldberg, Daniel J. [3]
Chakraborty, Praneel [4, 5]
Ray, Natasha L. [6]
Himmelstein, Daniel [7, 8]
Fu, Weixuan [1]
Moore, Jason H. [1]
Affiliations
[1] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[2] Univ Penn, Ctr Excellence Environm Toxicol, Philadelphia, PA 19104 USA
[3] Washington Univ, Dept Comp Sci & Engn, St Louis, MO 63130 USA
[4] Univ Penn, Sch Arts & Sci, Philadelphia, PA 19104 USA
[5] Univ Penn, Wharton Sch, Philadelphia, PA 19104 USA
[6] Princeton Day Sch, Princeton, NJ 08540 USA
[7] Related Sci, Denver, CO 80220 USA
[8] Univ Penn, Dept Syst Pharmacol & Translat Therapeut, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (United States)
DOI
10.1093/bioinformatics/btab727
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Novel machine learning and statistical modeling studies rely on standardized comparisons to existing methods using well-studied benchmark datasets. Few tools exist that provide rapid access to many of these datasets through a standardized, user-friendly interface that integrates well with popular data science workflows. Results: This release of PMLB (Penn Machine Learning Benchmarks) provides the largest collection of diverse, public benchmark datasets for evaluating new machine learning and data science methods aggregated in one location. v1.0 introduces a number of critical improvements developed following discussions with the open-source community.
Pages: 878-880
Page count: 3