Compiler and hardware support for reducing the synchronization of speculative threads

Cited by: 13
Authors
Zhai, Antonia [1]
Steffan, J. Gregory [2]
Colohan, Christopher B. [3]
Mowry, Todd C. [4]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[2] Univ Toronto, Toronto, ON, Canada
[3] Google, Ann Arbor, MI 48104 USA
[4] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
design; experimentation; performance; thread-level speculation; chip-multiprocessing; automatic parallelization; instruction scheduling
DOI
10.1145/1369396.1369399
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Thread-level speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. In this article, we focus on one important limitation of program performance under TLS: stalls that result from synchronizing and forwarding scalar values between speculative threads, values that would otherwise cause frequent data dependences and, hence, failed speculation. Using SPECint benchmarks that have been automatically transformed by our compiler to exploit TLS, we present, evaluate in detail, and compare both compiler and hardware techniques for improving the communication of scalar values. We find that through our dataflow algorithms for three increasingly aggressive instruction scheduling techniques, the compiler can drastically reduce the critical forwarding path introduced by the synchronization and forwarding of scalar values. We also show that hardware techniques for reducing synchronization can be complementary to compiler scheduling, but that the additional performance benefits are minimal and are generally not worth the cost.
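To make the idea of a critical forwarding path concrete, the following is a minimal timing sketch, not the article's dataflow algorithm or its simulator. It assumes a toy model in which every speculative thread (epoch) has the same length, waits for a forwarded scalar at one fixed offset, and signals the new value at a later fixed offset. The function name `tls_finish_time` and all of its parameters are hypothetical and chosen only for illustration.

```python
def tls_finish_time(n_epochs, epoch_len, wait_at, signal_at,
                    forward_latency=10, spawn_delay=0):
    """Finish time (in cycles) of the last epoch in a toy TLS model.

    Assumptions (illustrative only, not taken from the article):
    - each epoch runs epoch_len cycles of work;
    - an epoch needs the forwarded scalar at offset wait_at and
      produces the value for its successor at offset signal_at;
    - forwarding the value to the next epoch costs forward_latency.
    """
    assert 0 <= wait_at <= signal_at <= epoch_len
    prev_signal = None  # time at which the previous epoch signalled
    finish = 0
    for i in range(n_epochs):
        start = i * spawn_delay
        # Earliest time this epoch reaches its wait point.
        ready = start + wait_at
        if prev_signal is not None:
            # Stall until the predecessor's value has arrived.
            ready = max(ready, prev_signal + forward_latency)
        # The signal is issued (signal_at - wait_at) cycles after the wait.
        prev_signal = ready + (signal_at - wait_at)
        # The rest of the epoch runs after the wait completes.
        finish = ready + (epoch_len - wait_at)
    return finish


# Signal near the end of the epoch: long critical forwarding path.
late = tls_finish_time(n_epochs=4, epoch_len=100, wait_at=10, signal_at=90)
# Signal scheduled right after the wait: short critical forwarding path.
early = tls_finish_time(n_epochs=4, epoch_len=100, wait_at=10, signal_at=15)
print(late, early)
```

In this toy model the distance between the wait and the signal, plus the forwarding latency, is paid once per epoch, which is why moving the producing instructions earlier shortens total execution time. Sequential execution of the same four epochs would take 400 cycles.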
Pages: 1 / 33
Page count: 33