CORRELATION ANALYSIS ALGORITHM FOR MASSIVE ULTRA-HIGH-DIMENSIONAL BREAST ULTRASOUND RADIOMICS FEATURE DATA IN A DISTRIBUTED ENVIRONMENT

Authors
Tang, Yuehong [1, 2]
Chen, Yan [3]
Liu, Wen [4, 5]
Gu, Zheng [4 ]
Yao, Hui [5 ]
Affiliations
[1] Xinjiang Med Univ, Tumor Hosp, Urumqi, Xinjiang, Peoples R China
[2] Xinjiang Med Univ, Sch Publ Hlth, Urumqi, Xinjiang, Peoples R China
[3] Jiaxing Univ, Med Sch, Jiaxing, Zhejiang, Peoples R China
[4] Xinjiang Inst Engn, Artificial Intelligence & Smart Mine Engn Technol, Urumqi, Peoples R China
[5] Xinjiang Changsen Data Technol Co Ltd, Urumqi 830011, Peoples R China
Keywords
Radiomics; massive high-dimensional data; correlation analysis; distributed computing
DOI
10.31577/cai_2024_3_756
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Radiomics is a technology that extracts a large number of quantitative features from medical images in a high-throughput manner, and it has become a focus of research. Combined with Big Data analysis algorithms, it can support disease diagnosis, therapy planning, and prognosis evaluation. Radiomics can extract hundreds or even tens of thousands of quantifiable features from medical images, and the resulting feature data can no longer fit into the memory of one machine. We therefore propose a distributed correlation analysis algorithm (DFCA), built on the MapReduce distributed computing framework, for breast ultrasound radiomics feature datasets. While DFCA computes the Pearson correlation coefficients of the radiomics features, each compute node produces a large volume of intermediate data, and the data transmission cost grows quadratically as the amount of feature data and the number of dimensions increase. To reduce this cost, we propose a distributed correlation estimation algorithm (DFCEA) for radiomics features, based on DFCA. DFCEA estimates the Pearson correlation coefficient with an iterative method, which further reduces the I/O cost. Experiments show that our algorithms are more effective than the algorithms in the literature.
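The abstract does not give the internals of DFCA, so the following is only a minimal sketch of one common way to compute Pearson correlations in a MapReduce style, not the authors' algorithm. It assumes the rows (samples) are split across partitions, that each mapper emits additive sufficient statistics (count, per-feature sums, pairwise cross-product sums), and that a reducer adds them up. The names `map_partition`, `merge_stats`, `pearson_matrix`, and the toy data are hypothetical.

```python
from functools import reduce
from itertools import combinations_with_replacement
from math import sqrt


def map_partition(rows):
    """Map step: summarize one partition as (n, feature sums, cross-product sums)."""
    d = len(rows[0])
    pairs = list(combinations_with_replacement(range(d), 2))
    sums = [0.0] * d
    cross = {p: 0.0 for p in pairs}  # d * (d + 1) / 2 entries per mapper
    for row in rows:
        for j, value in enumerate(row):
            sums[j] += value
        for j, k in pairs:
            cross[(j, k)] += row[j] * row[k]
    return len(rows), sums, cross


def merge_stats(a, b):
    """Reduce step: the partial statistics are additive across partitions."""
    n_a, sums_a, cross_a = a
    n_b, sums_b, cross_b = b
    sums = [x + y for x, y in zip(sums_a, sums_b)]
    cross = {p: cross_a[p] + cross_b[p] for p in cross_a}
    return n_a + n_b, sums, cross


def pearson_matrix(stats):
    """Finalize: turn the merged statistics into a d x d correlation matrix.

    Assumes every feature has non-zero variance; a constant feature would
    cause a division by zero here.
    """
    n, sums, cross = stats
    d = len(sums)

    def scaled_cov(j, k):
        key = (j, k) if j <= k else (k, j)
        return n * cross[key] - sums[j] * sums[k]

    return [
        [scaled_cov(j, k) / sqrt(scaled_cov(j, j) * scaled_cov(k, k)) for k in range(d)]
        for j in range(d)
    ]


# Toy data (made up for illustration): two partitions, three features.
partitions = [
    [[1.0, 2.0, 9.0], [2.0, 4.0, 7.0]],
    [[3.0, 6.0, 8.0], [4.0, 8.0, 1.0]],
]
stats = reduce(merge_stats, (map_partition(p) for p in partitions))
r = pearson_matrix(stats)
print(r[0][1])  # feature 1 is exactly twice feature 0, so this is 1.0
```

In this sketch each mapper ships d(d + 1)/2 cross-product sums for d features, which is one plausible reading of the quadratic growth in transmission cost that the abstract mentions.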
Pages: 756-776
Page count: 21