An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs

Citations: 0
Authors
Xu, Shixiong [1 ]
Gregg, David [2 ]
Affiliations
[1] Univ Dublin, Trinity Coll Dublin, Sch Comp Sci & Stat, Software Tools Grp, Dublin, Ireland
[2] Lero Irish Software Engn Res Ctr, Ireland
DOI
10.1109/PACT.2015.56
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Pages: 488-489
Number of pages: 2
Related papers
50 in total
  • [41] CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs
    Su, Jiya
    Zhang, Feng
    Liu, Weifeng
    He, Bingsheng
    Wu, Ruofan
    Du, Xiaoyong
    Wang, Rujia
    PROCEEDINGS OF THE 49TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2020, 2020
  • [42] Enhancing Thread-Level Parallelism in Asymmetric Multicores using Transparent Instruction Offloading
    Souza, Jeckson Dellagostin
    Manivannan, Madhavan
    Pericas, Miguel
    Schneider Beck, Antonio Carlos
    PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020
  • [43] Exploiting thread-level parallelism in lockstep execution by partially duplicating a single pipeline
    Oh, Jaegeun
    Hwang, Seok Joong
    Nguyen, Huong Giang
    Kim, Areum
    Kim, Seon Wook
    Kim, Chulwoo
    Kim, Jong-Kook
    ETRI JOURNAL, 2008, 30 (04) : 576 - 586
  • [44] Exploiting Thread-Level Parallelism in Functional Self-Testing of CMT Processors
    Apostolakis, Andreas
    Psarakis, Mihalis
    Gizopoulos, Dimitris
    Paschalis, Antonis
    Parulkar, Ishwar
    ETS 2009: EUROPEAN TEST SYMPOSIUM, PROCEEDINGS, 2009, : 33 - +
  • [45] Research on optimization techniques for thread-level parallelism implementation targeting the GCC compiler
    Chen, Mengyao
    Yan, Pengyan
    Han, Lin
    Li, Haoran
    Wang, Cuixia
    Proceedings of SPIE - The International Society for Optical Engineering, 13442
  • [46] Predicting loop termination to boost speculative thread-level parallelism in embedded applications
    Islam, Mafijul Md.
    19TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING, PROCEEDINGS, 2007, : 54 - 61
  • [47] OpenPro: A Dynamic Profiling Tool Set for Exploring Thread-Level Speculation Parallelism
    Wang, Yaobin
    An, Hong
    Liang, Bo
    Wang, Li
    Guo, Rui
    ICCEE 2008: PROCEEDINGS OF THE 2008 INTERNATIONAL CONFERENCE ON COMPUTER AND ELECTRICAL ENGINEERING, 2008, : 256 - +
  • [48] A Stall-Aware Warp Scheduling for Dynamically Optimizing Thread-level Parallelism in GPGPUs
    Yu, Yulong
    Xiao, Weijun
    He, Xubin
    Guo, He
    Wang, Yuxin
    Chen, Xin
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15), 2015, : 15 - 24
  • [49] Exploring Thread-level Parallelism Based on Cost-Driven Model for Irregular Programs
    Li, Yuancheng
    Liu, Bin
    2017 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, COMMUNICATIONS AND COMPUTING (ICSPCC), 2017
  • [50] Optimizing Dynamic Programming on Graphics Processing Units via Adaptive Thread-Level Parallelism
    Wu, Chao-Chin
    Ke, Jenn-Yang
    Lin, Heshan
    Feng, Wu-chun
    2011 IEEE 17TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2011, : 96 - 103