Benchmarking and scalability of machine-learning methods for photometric redshift estimation

Cited by: 23
Authors
Henghes, Ben [1]
Pettitt, Connor [2]
Thiyagalingam, Jeyan [2]
Hey, Tony [2]
Lahav, Ofer [1]
Affiliations
[1] UCL, Dept Phys & Astron, Gower St, London WC1E 6BT, England
[2] Rutherford Appleton Lab, Sci Comp Dept, Sci & Technol Facil Council STFC, Harwell Campus, Didcot OX11 0QX, Oxon, England
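Funding
National Science Foundation (USA); Science and Technology Facilities Council (UK); Andrew W. Mellon Foundation (USA); European Research Council
Keywords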
methods: data analysis; galaxies: distances and redshifts; cosmology: observations; DIGITAL SKY SURVEY;
DOI
10.1093/mnras/stab1513
Chinese Library Classification
P1 [Astronomy]
Subject classification code
0704
Abstract
Obtaining accurate photometric redshift (photo-z) estimations is an important aspect of cosmology, remaining a prerequisite of many analyses. In creating novel methods to produce photo-z estimations, there has been a shift towards using machine-learning techniques. However, there has not been as much of a focus on how well different machine-learning methods scale or perform with the ever-increasing amounts of data being produced. Here, we introduce a benchmark designed to analyse the performance and scalability of different supervised machine-learning methods for photo-z estimation. Making use of the Sloan Digital Sky Survey (SDSS - DR12) data set, we analysed a variety of the most used machine-learning algorithms. By scaling the number of galaxies used to train and test the algorithms up to one million, we obtained several metrics demonstrating the algorithms' performance and scalability for this task. Furthermore, by introducing a new optimization method, time-considered optimization, we were able to demonstrate how a small concession of error can allow for a great improvement in efficiency. From the algorithms tested, we found that the Random Forest performed best with a mean squared error, MSE = 0.0042; however, as other algorithms such as Boosted Decision Trees and k-Nearest Neighbours performed very similarly, we used our benchmarks to demonstrate how different algorithms could be superior in different scenarios. We believe that benchmarks like this will become essential with upcoming surveys, such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time (LSST), which will capture billions of galaxies requiring photometric redshifts.
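The benchmarking idea in the abstract (scale the training set, record error and run time, then accept a small loss in accuracy for a large gain in speed) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic rather than SDSS photometry, the k-Nearest Neighbours regressor is a brute-force pure-Python version, and `time_considered_choice` is only one plausible reading of "time-considered optimization" (pick the fastest run whose MSE lies within a relative tolerance of the best MSE). The names `make_synthetic`, `benchmark`, and `time_considered_choice` are hypothetical.

```python
import random
import time


def mean_squared_error(y_true, y_pred):
    """MSE: the mean of the squared residuals between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def knn_predict(train_x, train_y, query, k):
    """Brute-force k-nearest-neighbours regression (illustrative, not optimized)."""
    # Pair each squared Euclidean distance with its target, then sort by distance.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    # Predict the mean target of the k closest training points.
    return sum(y for _, y in dists[:k]) / k


def make_synthetic(n, rng):
    """Hypothetical stand-in for photometry: five features and a toy 'redshift'."""
    xs = [[rng.uniform(0.0, 1.0) for _ in range(5)] for _ in range(n)]
    ys = [0.5 * sum(x) / 5 + rng.gauss(0.0, 0.01) for x in xs]
    return xs, ys


def benchmark(train_sizes, n_test=50, k=5, seed=0):
    """Record MSE and wall-clock prediction time as the training set grows."""
    rng = random.Random(seed)
    test_x, test_y = make_synthetic(n_test, rng)
    results = []
    for n in train_sizes:
        train_x, train_y = make_synthetic(n, rng)
        start = time.perf_counter()
        preds = [knn_predict(train_x, train_y, q, k) for q in test_x]
        elapsed = time.perf_counter() - start
        results.append(
            {"n_train": n, "mse": mean_squared_error(test_y, preds), "seconds": elapsed}
        )
    return results


def time_considered_choice(results, tolerance):
    """Assumed rule: fastest run whose MSE is within `tolerance` (relative) of the best."""
    best = min(r["mse"] for r in results)
    acceptable = [r for r in results if r["mse"] <= best * (1.0 + tolerance)]
    return min(acceptable, key=lambda r: r["seconds"])


if __name__ == "__main__":
    runs = benchmark([100, 200, 400])
    for run in runs:
        print(run)
    print("chosen:", time_considered_choice(runs, tolerance=0.10))
```

Running the script prints one record per training-set size and then the run selected under a 10 per cent error concession. The tolerance value is arbitrary here; in a real benchmark it would be set by how much accuracy the downstream analysis can afford to lose.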
Pages: 4847-4856
Number of pages: 10