An Efficient Vectorization Approach to Nested Thread-level Parallelism for CUDA GPUs

Cited by: 0
Authors
Xu, Shixiong [1 ]
Gregg, David [2 ]
Affiliations
[1] Univ Dublin, Trinity Coll Dublin, Sch Comp Sci & Stat, Software Tools Grp, Dublin, Ireland
[2] Lero Irish Software Engn Res Ctr, Ireland
DOI
10.1109/PACT.2015.56
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Pages: 488-489
Page count: 2
Related papers
50 in total
  • [31] The STAMPede approach to thread-level speculation
    Steffan, JG
    Colohan, C
    Zhai, A
    Mowry, TC
    ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2005, 23 (03): : 253 - 300
  • [32] Exploiting Thread-level Parallelism Based on Banlancing Load for Speculative Multithreading
    Li Yuancheng
    ADVANCES IN MECHATRONICS AND CONTROL ENGINEERING III, 2014, 678 : 8 - 11
  • [33] Exploiting thread-level parallelism in the iterative solution of sparse linear systems
    Aliaga, Jose I.
    Bollhoefer, Matthias
    Martin, Alberto F.
    Quintana-Orti, Enrique S.
    PARALLEL COMPUTING, 2011, 37 (03) : 183 - 202
  • [34] Exploiting Thread-Level Parallelism on HEVC by Employing a Reference Dependency Graph
    Kim, Minwoo
    Kim, Deokho
    Kim, Kyungah
    Ro, Won Woo
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2016, 26 (04) : 736 - 749
  • [35] Neither More Nor Less: Optimizing Thread-level Parallelism for GPGPUs
    Kayiran, Onur
    Jog, Adwait
    Kandemir, Mahmut T.
    Das, Chita R.
    2013 22ND INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT), 2013, : 157 - 166
  • [36] Energy-efficient thread-level speculation
    Renau, J
    Strauss, K
    Ceze, L
    Liu, W
    Sarangi, SR
    Tuck, J
    Torrellas, J
    IEEE MICRO, 2006, 26 (01) : 80 - 91
  • [37] Architecture optimization for multimedia application exploiting data and thread-level parallelism
    Limousin, C
    Sebot, J
    Vartanian, A
    Drach, N
    JOURNAL OF SYSTEMS ARCHITECTURE, 2005, 51 (01) : 15 - 27
  • [38] Power-performance implications of thread-level parallelism on chip multiprocessors
    Li, J
    Martínez, JF
    ISPASS 2005: IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, 2005, : 124 - 134
  • [39] Efficient Thread Labeling for Monitoring Programs with Nested Parallelism
    Ha, Ok-Kyoon
    Kim, Sun-Sook
    Jun, Yong-Kee
    COMMUNICATION AND NETWORKING, PT II, 2010, 120 : 227 - +
  • [40] An Analytical Model for a GPU Architecture with Memory-level and Thread-level Parallelism Awareness
    Hong, Sunpyo
    Kim, Hyesoon
    ISCA 2009: 36TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, 2009, : 152 - 163