Benchmarking and scalability of machine-learning methods for photometric redshift estimation

Cited by: 23
Authors
Henghes, Ben [1]
Pettitt, Connor [2]
Thiyagalingam, Jeyan [2]
Hey, Tony [2]
Lahav, Ofer [1]
Affiliations
[1] UCL, Dept Phys & Astron, Gower St, London WC1E 6BT, England
[2] Rutherford Appleton Lab, Sci Comp Dept, Sci & Technol Facil Council STFC, Harwell Campus, Didcot OX11 0QX, Oxon, England
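Funding
US National Science Foundation; UK Science and Technology Facilities Council; Andrew W. Mellon Foundation (US); European Research Council
Keywords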
methods: data analysis; galaxies: distances and redshifts; cosmology: observations; DIGITAL SKY SURVEY;
DOI
10.1093/mnras/stab1513
Chinese Library Classification number
P1 [Astronomy]
Subject classification code
0704
Abstract
Obtaining accurate photometric redshift (photo-z) estimations is an important aspect of cosmology, remaining a prerequisite of many analyses. In creating novel methods to produce photo-z estimations, there has been a shift towards using machine-learning techniques. However, there has not been as much of a focus on how well different machine-learning methods scale or perform with the ever-increasing amounts of data being produced. Here, we introduce a benchmark designed to analyse the performance and scalability of different supervised machine-learning methods for photo-z estimation. Making use of the Sloan Digital Sky Survey (SDSS - DR12) data set, we analysed a variety of the most used machine-learning algorithms. By scaling the number of galaxies used to train and test the algorithms up to one million, we obtained several metrics demonstrating the algorithms' performance and scalability for this task. Furthermore, by introducing a new optimization method, time-considered optimization, we were able to demonstrate how a small concession of error can allow for a great improvement in efficiency. From the algorithms tested, we found that the Random Forest performed best with a mean squared error, MSE = 0.0042; however, as other algorithms such as Boosted Decision Trees and k-Nearest Neighbours performed very similarly, we used our benchmarks to demonstrate how different algorithms could be superior in different scenarios. We believe that benchmarks like this will become essential with upcoming surveys, such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST), which will capture billions of galaxies requiring photometric redshifts.
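The kind of benchmark the abstract describes (scaling the training set, recording error and run time, then trading a small amount of error for speed) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic rather than SDSS, the k-Nearest Neighbours regressor is a toy pure-Python version, and `pick_time_considered` is a hypothetical selection rule, not the paper's time-considered optimization.

```python
import random
import time


def make_synthetic(n, seed=0):
    """Return n (features, target) pairs: a toy stand-in for (magnitudes, redshift)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(0.0, 1.0) for _ in range(3)]
        z = 0.5 * x[0] + 0.3 * x[1] * x[2] + rng.gauss(0.0, 0.01)
        data.append((x, z))
    return data


def knn_predict(train, x, k):
    """Predict by averaging the targets of the k nearest training points."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return sum(z for _, z in nearest) / k


def mean_squared_error(truth, pred):
    """Mean of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth)


def benchmark(train_sizes, n_test=50, k=5):
    """Record MSE and wall-clock prediction time for each training-set size."""
    test = make_synthetic(n_test, seed=1)
    results = []
    for n in train_sizes:
        train = make_synthetic(n, seed=2)
        start = time.perf_counter()
        pred = [knn_predict(train, x, k) for x, _ in test]
        elapsed = time.perf_counter() - start
        mse = mean_squared_error([z for _, z in test], pred)
        results.append({"n_train": n, "mse": mse, "seconds": elapsed})
    return results


def pick_time_considered(results, tolerance):
    """Hypothetical rule: fastest run whose MSE is within `tolerance` of the best."""
    best = min(r["mse"] for r in results)
    allowed = [r for r in results if r["mse"] <= best * (1.0 + tolerance)]
    return min(allowed, key=lambda r: r["seconds"])


if __name__ == "__main__":
    runs = benchmark([100, 400, 1600])
    for r in runs:
        print(r)
    print("chosen:", pick_time_considered(runs, tolerance=0.10))
```

Timings depend on the machine, so the printed numbers will vary from run to run; the point of the sketch is only the shape of the measurement loop and the error-versus-time trade-off.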
Pages: 4847-4856
Page count: 10