Multi-GPU implementation of a VMAT treatment plan optimization algorithm

Cited by: 10
Authors
Tian, Zhen [1]
Peng, Fei [2]
Folkerts, Michael [1]
Tan, Jun [1]
Jia, Xun [1]
Jiang, Steve B. [1]
Affiliations
[1] Univ Texas SW Med Ctr Dallas, Dept Radiat Oncol, Dallas, TX 75390 USA
[2] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
Keywords
multi-GPU; VMAT optimization; column-generation approach; MODULATED ARC THERAPY; RADIOTHERAPY DOSE CALCULATION; TEMPORAL NONLOCAL MEANS; CONE-BEAM CT; RADIATION-THERAPY; IMRT; DELIVERY; TOMOTHERAPY; RECONSTRUCTION; QUALITY
DOI
10.1118/1.4919742
Chinese Library Classification
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Subject classification codes
1002; 100207; 1009
Abstract
Purpose: Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, the relatively small memory of a GPU cannot accommodate a large dose-deposition coefficient (DDC) matrix, which arises in cases with, e.g., a large target size, multiple targets, multiple arcs, and/or small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors' group, on a multi-GPU platform to solve the memory limitation problem. Although the algorithm itself has been published, the details of its GPU implementation have not been reported. Hence, another purpose is to present the detailed techniques employed for GPU implementation. The authors also use this problem as an example to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. Methods: The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors' method, the sparse DDC matrix is first stored on a CPU in coordinate list (COO) format. On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row (CSR) format. Computation of the beamlet price, the first step in PP, is accomplished using multiple GPUs. A fast inter-GPU data transfer scheme is implemented using peer-to-peer access. The remaining steps of the PP and MP problems are implemented on a CPU or a single GPU due to their modest problem scale and computational loads. The Barzilai and Borwein algorithm with a subspace step scheme is adopted to solve the MP problem.
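As an illustration of the data layout described in the Methods, the following Python sketch partitions a COO matrix by beam angle into CSR blocks and computes beamlet prices block by block. It is not the authors' GPU code: it runs on the CPU in plain Python, and each block merely stands in for the submatrix held by one GPU. It also assumes (the abstract does not say) that matrix rows index beamlets, columns index voxels, and the beamlet price is the product of that matrix with a per-voxel gradient vector. All function and variable names are hypothetical.

```python
def coo_to_csr(n_rows, rows, cols, vals):
    """Convert COO triplets into CSR arrays (indptr, indices, data)."""
    indptr = [0] * (n_rows + 1)
    for r in rows:                      # count nonzeros per row
        indptr[r + 1] += 1
    for i in range(n_rows):             # prefix sum gives row start offsets
        indptr[i + 1] += indptr[i]
    indices = [0] * len(vals)
    data = [0.0] * len(vals)
    next_slot = indptr[:-1]             # slicing copies, so indptr is untouched
    for r, c, v in zip(rows, cols, vals):
        k = next_slot[r]
        indices[k], data[k] = c, v
        next_slot[r] += 1
    return indptr, indices, data


def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector x."""
    return [
        sum(data[k] * x[indices[k]] for k in range(indptr[r], indptr[r + 1]))
        for r in range(len(indptr) - 1)
    ]


def split_by_angle(rows, cols, vals, beamlet_angle, n_angles, n_parts):
    """Split the rows (beamlets) into n_parts blocks of contiguous beam angles.

    Assumption: beamlet_angle[b] is the beam-angle index of beamlet b.
    Returns the CSR blocks and, per block, the global beamlet ids it owns.
    """
    part_of = [angle * n_parts // n_angles for angle in beamlet_angle]
    owners = [[] for _ in range(n_parts)]
    local_row = []
    for b, p in enumerate(part_of):
        local_row.append(len(owners[p]))
        owners[p].append(b)
    buckets = [([], [], []) for _ in range(n_parts)]
    for r, c, v in zip(rows, cols, vals):
        b_rows, b_cols, b_vals = buckets[part_of[r]]
        b_rows.append(local_row[r])
        b_cols.append(c)
        b_vals.append(v)
    parts = [coo_to_csr(len(owners[p]), *buckets[p]) for p in range(n_parts)]
    return parts, owners


def beamlet_prices(parts, owners, grad, n_beamlets):
    """Compute prices block by block; each block mimics one device's share."""
    prices = [0.0] * n_beamlets
    for csr, block_owners in zip(parts, owners):
        for b, price in zip(block_owners, csr_matvec(*csr, grad)):
            prices[b] = price
    return prices


# Toy data: 4 beamlets over 2 beam angles, 3 voxels, split into 2 blocks.
rows = [0, 0, 1, 2, 3, 3]
cols = [0, 2, 1, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
beamlet_angle = [0, 0, 1, 1]
grad = [1.0, 2.0, 3.0]

parts, owners = split_by_angle(rows, cols, vals, beamlet_angle, n_angles=2, n_parts=2)
prices = beamlet_prices(parts, owners, grad, n_beamlets=4)
print(prices)
```

Because each block owns a disjoint set of beamlets, the per-block products are independent, which is what makes this step a natural candidate for distribution across devices.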
A head and neck (H&N) cancer case is then used to validate the authors' method. The authors also compare their multi-GPU implementation with three different single-GPU implementation strategies, i.e., truncating the DDC matrix (S1), repeatedly transferring the DDC matrix between CPU and GPU (S2), and porting computations involving the DDC matrix to the CPU (S3), in terms of both plan quality and computational efficiency. Two more H&N patient cases and three prostate cases are used to demonstrate the advantages of the authors' method. Results: The authors' multi-GPU implementation can finish the optimization process within ~1 min for the H&N patient case. S1 leads to inferior plan quality, although its total time was 10 s shorter than that of the multi-GPU implementation due to the reduced matrix size. S2 and S3 yield the same plan quality as the multi-GPU implementation but take ~4 and ~6 min, respectively. High computational efficiency was consistently achieved for the other five patient cases tested, with VMAT plans of clinically acceptable quality obtained within 23-46 s. In contrast, to obtain clinically comparable or acceptable plans for all six VMAT cases tested in this paper, the optimization time needed in a commercial treatment planning system (TPS) on CPU was found to be on the order of several minutes. Conclusions: The results demonstrate that the multi-GPU implementation of the authors' column-generation-based VMAT optimization can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality. The authors' study may serve as an example to shed some light on other large-scale medical physics problems that require multi-GPU techniques. © 2015 American Association of Physicists in Medicine.
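The Barzilai and Borwein step rule named in the Methods can be sketched on a toy problem. The Python example below applies the plain Barzilai-Borwein gradient iteration to a small strictly convex quadratic. The subspace step scheme and the actual master-problem objective are not described in the abstract and are not reproduced here; the matrix `A`, the vector `b`, and the function names are illustrative assumptions.

```python
def bb_minimize(grad, x0, step0=1e-2, tol=1e-10, max_iter=500):
    """Gradient descent with the Barzilai-Borwein (BB1) step length.

    grad: function returning the gradient (list of floats) at a point.
    The step is alpha = (s.s) / (s.y), where s is the change in x and
    y is the change in gradient between consecutive iterates.
    """
    x = list(x0)
    g = grad(x)
    step = step0                         # initial step, an arbitrary small value
    for _ in range(max_iter):
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x_new = [xi - step * gi for xi, gi in zip(x, g)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 0:                       # keep the previous step if curvature is not positive
            step = sum(si * si for si in s) / sy
        x, g = x_new, g_new
    return x


# Toy objective (assumption): f(x) = 0.5 * x^T A x - b^T x with A positive definite.
A = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]


def quad_grad(x):
    """Gradient of the toy quadratic: A x - b."""
    return [sum(A[i][j] * x[j] for j in range(2)) - b[i] for i in range(2)]


x_min = bb_minimize(quad_grad, [0.0, 0.0])
print(x_min)  # close to the solution of A x = b, i.e. (0.2, 0.4)
```

The BB step needs only the two most recent iterates and gradients, with no line search, which keeps the per-iteration cost low.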
Pages: 2841-2852
Page count: 12