Heterogeneous many-core optimization for Monte Carlo path-tracing on new generation Sunway HPC system

Authors
Wang, Xinjie [1]
Ma, Guanghao [2,3]
Song, Jiaying [4]
Geng, Mingyao [4]
Hu, Wenhui [4]
Duan, Xi [1]
Wang, Zhigang [1]
Xu, Jiali [2]
Jin, Xiaogang [5]
Li, Fang [6]
Chen, Dexun [6,7]
Yu, Maoxue [2]
Affiliations
[1] Ocean Univ China, Coll Comp Sci & Technol, Qingdao 266100, Peoples R China
[2] Laoshan Lab, Qingdao 266237, Peoples R China
[3] Qingdao Marine Sci & Technol Ctr, Lab Reg Oceanog & Numer Modeling, Qingdao 266237, Peoples R China
[4] Gosci Technol Grp, Qingdao 266237, Peoples R China
[5] Zhejiang Univ, State Key Lab CAD & CG, Hangzhou 310058, Zhejiang, Peoples R China
[6] Natl Supercomp Ctr, Wuxi 214072, Peoples R China
[7] Tsinghua Univ, Beijing 100080, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
TaihuLight; Sunway supercomputer; HPC system; Monte Carlo path-tracing; Parallel rendering framework; Nonlocal means; Architecture
DOI
10.1007/s42514-024-00196-w
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We present swRender, a new parallel rendering pipeline for the Monte Carlo path-tracing algorithm, built on the new Sunway many-core architecture (SW26010P). Previous parallel rendering schemes are unsuitable for this task because of vast differences in hardware architecture and bottlenecks in I/O communication efficiency. To address this, we create a new two-level parallel tile rendering framework to fully utilize the Sunway computing resources, a practical tile-grouping load-balancing method to maintain the framework's stability, and a novel many-core acceleration optimization to improve rendering performance at the pixel level. Our method achieves (1) an average speedup of 16x across multiple benchmarks compared with the baseline path-tracing model on the Sunway architecture, and (2) an average speedup of 2x compared with state-of-the-art CPU, co-processor, and GPU-based parallel rendering approaches. Moreover, we scale swRender to run on 15 million cores and obtain a parallel efficiency of 92%.
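The abstract names tile-based rendering and tile-grouping load balancing but does not describe how either is implemented. The following is only a minimal illustrative sketch of the general idea, not the swRender method: it splits an image into tiles and assigns them to groups with a greedy longest-processing-time heuristic. The function names `make_tiles` and `group_tiles`, the tile size, and the synthetic per-tile cost model are all assumptions introduced here for illustration.

```python
import heapq


def make_tiles(width, height, tile):
    """Split a width x height image into (x, y, w, h) tiles (hypothetical helper)."""
    return [(x, y, min(tile, width - x), min(tile, height - y))
            for y in range(0, height, tile)
            for x in range(0, width, tile)]


def group_tiles(costs, n_groups):
    """Greedy LPT: hand the most expensive tiles first to the least-loaded group.

    This is a generic scheduling heuristic used for illustration, not the
    tile-grouping method of the paper.
    """
    heap = [(0.0, g) for g in range(n_groups)]  # (current load, group id)
    heapq.heapify(heap)
    groups = [[] for _ in range(n_groups)]
    for idx in sorted(range(len(costs)), key=lambda i: -costs[i]):
        load, g = heapq.heappop(heap)
        groups[g].append(idx)
        heapq.heappush(heap, (load + costs[idx], g))
    return groups


# Assumed example: a 64 x 48 image cut into 16 x 16 tiles.
tiles = make_tiles(64, 48, 16)
# Synthetic cost estimate per tile (pixel count times an arbitrary complexity factor).
costs = [float(w * h * (1 + (i % 3))) for i, (x, y, w, h) in enumerate(tiles)]
groups = group_tiles(costs, 4)
loads = [sum(costs[i] for i in g) for g in groups]
```

In a real renderer the per-tile cost would have to come from measurement or a scene-dependent estimate; the synthetic costs above only serve to exercise the grouping step.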
Pages: 566-587
Page count: 22
Related papers
6 in total
  • [1] Heterogeneous many-core optimization for Monte Carlo path-tracing on new generation Sunway HPC system
    Xinjie Wang
    Guanghao Ma
    Jiaying Song
    Mingyao Geng
    Wenhui Hu
    Xi Duan
    Zhigang Wang
    Jiali Xu
    Xiaogang Jin
    Fang Li
    Dexun Chen
    Maoxue Yu
    CCF Transactions on High Performance Computing, 2024, 6 (6) : 566 - 587
  • [2] Implementing molecular dynamics simulation on the Sunway TaihuLight system with heterogeneous many-core processors
    Dong, Wenqian
    Li, Kenli
    Kang, Letian
    Quan, Zhe
    Li, Keqin
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (16):
  • [3] Implementation and optimization of a data protecting model on the Sunway TaihuLight supercomputer with heterogeneous many-core processors
    Chen, Yuedan
    Li, Kenli
    Fei, Xiongwei
    Quan, Zhe
    Li, Keqin
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (21):
  • [4] A Dataflow Computing System for New Generation of Domestic Heterogeneous Many-Core Processors
    Xiao Q.
    Zhao M.
    Li M.
    Shen L.
    Chen J.
    Zhou W.
    Wang F.
    An H.
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2023, 60 (10) : 2405 - 2417
  • [5] Coupled Incomplete Cholesky and Jacobi Preconditioned Conjugate Gradient on the New Generation of Sunway Many-Core Architecture
    Ye, Yuejin
    Guo, Heng
    Wang, Bingzhuo
    Wang, Pengxiao
    Chen, Dexun
    Li, Fang
    IEEE TRANSACTIONS ON COMPUTERS, 2023, 72 (11) : 3326 - 3339
  • [6] Parallel Implementation and Optimization of Regional Ocean Modeling System (ROMS) Based on Sunway SW26010 Many-Core Processor
    Liu, Tao
    Zhuang, Yuan
    Tian, Min
    Pan, Jingshan
    Zeng, Yunhui
    Guo, Ying
    Yang, Meihong
    IEEE ACCESS, 2019, 7 : 146170 - 146182