Compiler and hardware support for reducing the synchronization of speculative threads

Cited by: 13
Authors
Zhai, Antonia [1 ]
Steffan, J. Gregory [2 ]
Colohan, Christopher B. [3 ]
Mowry, Todd C. [4 ]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[2] Univ Toronto, Toronto, ON, Canada
[3] Google, Ann Arbor, MI 48104 USA
[4] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Keywords
design; experimentation; performance; thread-level speculation; chip-multiprocessing; automatic parallelization; instruction scheduling;
DOI
10.1145/1369396.1369399
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Thread-level speculation (TLS) allows us to automatically parallelize general-purpose programs by supporting parallel execution of threads that might not actually be independent. In this article, we focus on one important limitation of program performance under TLS: stalls that result from synchronizing and forwarding scalar values between speculative threads, values that would otherwise cause frequent data dependences and, hence, failed speculation. Using SPECint benchmarks that have been automatically transformed by our compiler to exploit TLS, we present, evaluate in detail, and compare both compiler and hardware techniques for improving the communication of scalar values. We find that through our dataflow algorithms for three increasingly aggressive instruction scheduling techniques, the compiler can drastically reduce the critical forwarding path introduced by the synchronization and forwarding of scalar values. We also show that hardware techniques for reducing synchronization can be complementary to compiler scheduling, but that the additional performance benefits are minimal and are generally not worth the cost.
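The critical forwarding path mentioned in the abstract can be illustrated with a toy timing model. The sketch below is not the authors' compiler or simulator. It assumes each speculative thread (epoch) runs for a fixed time, waits once for a scalar forwarded by its predecessor, and signals the value to its successor once. The function name `total_time` and all of its parameters are hypothetical, chosen only for this illustration.

```python
def total_time(num_epochs, epoch_len, wait_at, signal_at,
               spawn_delay=1, fwd_latency=0):
    """Finish time of a chain of speculative epochs (toy model, all names assumed).

    Each epoch i starts at i * spawn_delay, reaches its `wait` after `wait_at`
    time units of work, and executes its `signal` after `signal_at` units.
    The wait cannot complete before the previous epoch has signalled.
    """
    if not 0 <= wait_at <= signal_at <= epoch_len:
        raise ValueError("need 0 <= wait_at <= signal_at <= epoch_len")

    prev_signal = None   # the first epoch has no predecessor to wait for
    finish = 0
    for i in range(num_epochs):
        start = i * spawn_delay
        ready = start + wait_at                  # when the wait is reached
        if prev_signal is not None:
            # Stall until the forwarded value arrives.
            ready = max(ready, prev_signal + fwd_latency)
        # The wait-to-signal distance is the critical forwarding path.
        signal = ready + (signal_at - wait_at)
        finish = max(finish, signal + (epoch_len - signal_at))
        prev_signal = signal
    return finish


# Signal left near the end of each epoch: the epochs effectively serialize.
print(total_time(4, 10, wait_at=1, signal_at=9))   # 34
# Signal scheduled right after the wait: the epochs overlap almost fully.
print(total_time(4, 10, wait_at=1, signal_at=2))   # 13
```

In this model, moving the signal earlier (the effect the abstract attributes to instruction scheduling) shortens the wait-to-signal distance, which is the only term that accumulates from one epoch to the next.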
Pages: 1-33
Page count: 33
Related papers
50 items in total
  • [31] COMPILER ALGORITHMS FOR SYNCHRONIZATION
    MIDKIFF, SP
    PADUA, DA
    IEEE TRANSACTIONS ON COMPUTERS, 1987, 36 (12) : 1485 - 1495
  • [32] System-on-a-chip processor synchronization support in hardware
    Saglam, BE
    Mooney, VJ
    DESIGN, AUTOMATION AND TEST IN EUROPE, CONFERENCE AND EXHIBITION 2001, PROCEEDINGS, 2001, : 633 - 639
  • [33] Cooperative prefetching: Compiler and hardware support for effective instruction prefetching in modern processors
    Luk, CK
    Mowry, TC
    31ST ANNUAL ACM/IEEE INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, PROCEEDINGS, 1998, : 182 - 193
  • [34] Using hardware-transactional-memory support to implement speculative task execution
    Salamanca, Juan
    Baldassin, Alexandro
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 192
  • [35] A general compiler framework for speculative multithreaded processors
    Bhowmik, A
    Franklin, M
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2004, 15 (08) : 713 - 724
  • [36] CoSpec: Compiler Directed Speculative Intermittent Computation
    Choi, Jongouk
    Liu, Qingrui
    Jung, Changhee
    MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 399 - 412
  • [37] General compiler framework for speculative optimizations using data speculative code motion
    Dai, XR
    Zhai, A
    Hsu, WC
    Yew, PC
    CGO 2005: INTERNATIONAL SYMPOSIUM ON CODE GENERATION AND OPTIMIZATION, 2005, : 280 - 290
  • [38] Tolerating dependences between large speculative threads via sub-threads
    Colohan, Christopher B.
    Ailamaki, Anastassia
    Steffan, J. Gregory
    Mowry, Todd C.
    33RD INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE, PROCEEDINGS, 2006, : 216 - 226
  • [39] Fly - A modifiable hardware compiler
    Ho, CH
    Leong, PHW
    Tsoi, KH
    Ludewig, R
    Zipf, P
    Ortiz, AG
    Glesner, M
    FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS: RECONFIGURABLE COMPUTING IS GOING MAINSTREAM, 2002, 2438 : 381 - 390
  • [40] A C to hardware/software compiler
    Bazargan, K
    Kastner, R
    Ogrenci, S
    Sarrafzadeh, M
    2000 IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS, 2000, : 331 - 332