PMLB v1.0: an open-source dataset collection for benchmarking machine learning methods

Cited by: 14
Authors
Romano, Joseph D. [1, 2]
Le, Trang T. [1]
La Cava, William [1]
Gregg, John T. [1]
Goldberg, Daniel J. [3]
Chakraborty, Praneel [4, 5]
Ray, Natasha L. [6]
Himmelstein, Daniel [7, 8]
Fu, Weixuan [1]
Moore, Jason H. [1]
Affiliations
[1] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[2] Univ Penn, Ctr Excellence Environm Toxicol, Philadelphia, PA 19104 USA
[3] Washington Univ, Dept Comp Sci & Engn, St Louis, MO 63130 USA
[4] Univ Penn, Sch Arts & Sci, Philadelphia, PA 19104 USA
[5] Univ Penn, Wharton Sch, Philadelphia, PA 19104 USA
[6] Princeton Day Sch, Princeton, NJ 08540 USA
[7] Related Sci, Denver, CO 80220 USA
[8] Univ Penn, Dept Syst Pharmacol & Translat Therapeut, Philadelphia, PA 19104 USA
Funding
U.S. National Institutes of Health
DOI
10.1093/bioinformatics/btab727
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Novel machine learning and statistical modeling studies rely on standardized comparisons to existing methods using well-studied benchmark datasets. Few tools exist that provide rapid access to many of these datasets through a standardized, user-friendly interface that integrates well with popular data science workflows.
Results: This release of PMLB (Penn Machine Learning Benchmarks) provides the largest collection of diverse, public benchmark datasets for evaluating new machine learning and data science methods aggregated in one location. v1.0 introduces a number of critical improvements developed following discussions with the open-source community.
Pages: 878-880
Page count: 3