Benchmarking and scalability of machine-learning methods for photometric redshift estimation

Cited by: 23
Authors
Henghes, Ben [1]
Pettitt, Connor [2]
Thiyagalingam, Jeyan [2]
Hey, Tony [2]
Lahav, Ofer [1]
Affiliations
[1] UCL, Dept Phys & Astron, Gower St, London WC1E 6BT, England
[2] Rutherford Appleton Lab, Sci Comp Dept, Sci & Technol Facil Council STFC, Harwell Campus, Didcot OX11 0QX, Oxon, England
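Funding
US National Science Foundation; UK Science and Technology Facilities Council; Andrew W. Mellon Foundation (USA); European Research Council
Keywords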
methods: data analysis; galaxies: distances and redshifts; cosmology: observations; DIGITAL SKY SURVEY;
DOI
10.1093/mnras/stab1513
Chinese Library Classification number
P1 [Astronomy]
Discipline classification code
0704
Abstract
Obtaining accurate photometric redshift (photo-z) estimations is an important aspect of cosmology, remaining a prerequisite of many analyses. In creating novel methods to produce photo-z estimations, there has been a shift towards using machine-learning techniques. However, there has not been as much of a focus on how well different machine-learning methods scale or perform with the ever-increasing amounts of data being produced. Here, we introduce a benchmark designed to analyse the performance and scalability of different supervised machine-learning methods for photo-z estimation. Making use of the Sloan Digital Sky Survey (SDSS - DR12) data set, we analysed a variety of the most used machine-learning algorithms. By scaling the number of galaxies used to train and test the algorithms up to one million, we obtained several metrics demonstrating the algorithms' performance and scalability for this task. Furthermore, by introducing a new optimization method, time-considered optimization, we were able to demonstrate how a small concession of error can allow for a great improvement in efficiency. From the algorithms tested, we found that the Random Forest performed best with a mean squared error, MSE = 0.0042; however, as other algorithms such as Boosted Decision Trees and k-Nearest Neighbours performed very similarly, we used our benchmarks to demonstrate how different algorithms could be superior in different scenarios. We believe that benchmarks like this will become essential with upcoming surveys, such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST), which will capture billions of galaxies requiring photometric redshifts.
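The benchmarking idea summarised in the abstract (scale the training set, record error and run time, then trade a little error for speed) can be sketched as follows. This is a minimal illustration only, not the authors' benchmark: the catalogue is synthetic rather than SDSS data, the brute-force k-nearest-neighbours regressor stands in for the algorithms tested, and `pick_within_tolerance` is a hypothetical reading of "a small concession of error", not the paper's definition of time-considered optimization. All function names are assumptions introduced here.

```python
import random
import time


def mean_squared_error(y_true, y_pred):
    """Mean squared error between true and predicted redshifts."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def make_synthetic_catalogue(n, seed=0):
    """Hypothetical stand-in for a photometric catalogue (NOT SDSS data).

    Each row holds five fake magnitudes; the target is a noisy function
    of the first and last magnitude.
    """
    rng = random.Random(seed)
    features, targets = [], []
    for _ in range(n):
        mags = [rng.uniform(15.0, 25.0) for _ in range(5)]
        z = 0.05 * (mags[0] - mags[4]) + 0.5 + rng.gauss(0.0, 0.02)
        features.append(mags)
        targets.append(z)
    return features, targets


def knn_predict(train_x, train_y, query, k):
    """Brute-force k-nearest-neighbours regression for one query point."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, query)), y)
        for row, y in zip(train_x, train_y)
    )
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


def run_benchmark(train_sizes, k_values, n_test=50):
    """Record MSE and wall-clock time for each (training size, k) pair."""
    test_x, test_y = make_synthetic_catalogue(n_test, seed=1)
    results = []
    for n_train in train_sizes:
        train_x, train_y = make_synthetic_catalogue(n_train, seed=2)
        for k in k_values:
            start = time.perf_counter()
            preds = [knn_predict(train_x, train_y, q, k) for q in test_x]
            elapsed = time.perf_counter() - start
            results.append({
                "n_train": n_train,
                "k": k,
                "mse": mean_squared_error(test_y, preds),
                "seconds": elapsed,
            })
    return results


def pick_within_tolerance(results, tolerance):
    """Assumed selection rule: the fastest configuration whose MSE is
    within `tolerance` of the best MSE observed."""
    best_mse = min(r["mse"] for r in results)
    candidates = [r for r in results if r["mse"] <= best_mse + tolerance]
    return min(candidates, key=lambda r: r["seconds"])
```

Calling `run_benchmark([100, 1000, 10000], [1, 5, 20])` and passing the result to `pick_within_tolerance` with a small tolerance would show how run time grows with training size and which setting gives up the least accuracy for the largest saving in time; the numbers it produces describe the synthetic data only.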
Pages: 4847 - 4856
Number of pages: 10