The vector floating-point unit in a synergistic processor element of a CELL processor

Cited by: 29
Authors
Mueller, SM
Jacobi, C
Oh, HJ
Tran, KD
Cottier, SR
Michael, BW
Nishikawa, H
Totsuka, Y
Namatame, T
Yano, N
Machida, T
Dhong, SH
Source
17th IEEE Symposium on Computer Arithmetic, Proceedings | 2005
DOI
10.1109/ARITH.2005.45
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
The floating-point unit (FPU) in the Synergistic Processor Element of the first-generation multi-core CELL processor is described. The FPU supports 4-way SIMD single-precision and integer operations and 2-way SIMD double-precision operations. The design required high frequency, low latency, and power and area efficiency, with primary application to multimedia streaming workloads such as 3D graphics. The FPU has three different latencies, optimizing the performance-critical single-precision FMA operations, which execute with a 6-cycle latency at an 11 FO4 cycle time. The latency includes the global forwarding of the result. These challenging performance, power, and area goals were achieved through the co-design of architecture and implementation, with optimizations at all levels of the design. This paper focuses on the logical and algorithmic aspects of the FPU that we developed to achieve these goals.
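For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of what a 4-way SIMD fused multiply-add (FMA) computes: each of four lanes independently evaluates a*b + c with a single rounding at the end. It is an illustration only and does not reflect the hardware described in the paper. The function names `fma` and `simd4_fma` are hypothetical, and the sketch assumes Python floats (double precision) rather than the single-precision format the abstract refers to.

```python
from fractions import Fraction

def fma(a: float, b: float, c: float) -> float:
    """Fused multiply-add: form a*b + c exactly, then round once.

    Assumption: models the operation in double precision, since
    Python floats are doubles; the paper's unit is single precision.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # one rounding step at the very end

def simd4_fma(a, b, c):
    """Apply fma independently to each of four lanes (hypothetical helper)."""
    if not (len(a) == len(b) == len(c) == 4):
        raise ValueError("each operand must have exactly four lanes")
    return [fma(x, y, z) for x, y, z in zip(a, b, c)]

# Four independent multiply-adds in one "vector" operation.
lanes = simd4_fma([1.0, 2.0, 3.0, 4.0],
                  [0.5, 0.5, 0.5, 0.5],
                  [1.0, 1.0, 1.0, 1.0])

# Fused vs. unfused: the separate multiply rounds 1 - 2**-60 up to 1.0,
# so the unfused result loses the small term that the fused form keeps.
x, y = 1.0 + 2.0**-30, 1.0 - 2.0**-30
unfused = x * y - 1.0
fused = fma(x, y, -1.0)
```

The second half shows why a fused operation can differ from a separate multiply followed by an add: the intermediate product is not rounded before the addition.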
Pages: 59-67
Page count: 9