Runtime Efficiency-Accuracy Tradeoff Using Configurable Floating Point Multiplier

Cited by: 11

Authors:

Peroni, Daniel [1]

Imani, Mohsen [1]

Rosing, Tajana Simuni [1]

Affiliations:

[1] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA

Source:

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS | 2020, Vol. 39, No. 2

Keywords:

Approximate computing; energy efficiency; floating point unit (FPU); GPU

DOI:

10.1109/TCAD.2018.2885317

Chinese Library Classification:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Many applications, such as machine learning and sensor data analysis, are statistical in nature and can tolerate some inaccuracy in their computation. Approximate computing is a viable way to save energy and increase performance by controllably trading accuracy for energy efficiency. In this paper, we propose a tiered approximate floating point multiplier, called CFPU, which significantly reduces energy consumption and improves multiplication performance at a slight cost in accuracy. Floating point multiplication is approximated by replacing the costly mantissa multiplication step with lower-energy alternatives. Data is processed in one of three modes, depending on the accuracy requirements: a basic approximate mode, an intermediate approximate mode, or the exact hardware. We evaluate the efficiency of the proposed CFPU on a wide range of applications, including twelve general OpenCL applications and three machine learning applications. Our results show that using the first CFPU approximation mode yields a 3.5x energy-delay product (EDP) improvement over a GPU using traditional floating point units (FPUs), while keeping the average relative error below 10%. Adding the second mode increases the EDP improvement to 4.1x over an unmodified FPU, again for less than 10% error. In addition, our results show that the proposed CFPU achieves a 2.8x EDP improvement for multiply operations compared to state-of-the-art approximate multipliers.
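The tiered idea in the abstract can be illustrated with a small software model. The sketch below is not the CFPU design: the abstract does not say which lower-energy alternatives replace the mantissa multiplication, so the two approximate modes here are assumptions chosen for illustration. The assumed basic mode drops the second operand's mantissa and only adds exponents; the assumed intermediate mode treats the second operand's mantissa as 1.5 and uses one shift-and-add. The names `tiered_mul` and `tune_bits` are hypothetical.

```python
import struct

BIAS = 127       # float32 exponent bias
MAN_BITS = 23    # float32 stored mantissa bits
MAN_MASK = (1 << MAN_BITS) - 1


def float_to_fields(x):
    """Round x to float32 and split it into sign, biased exponent, mantissa."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> MAN_BITS) & 0xFF, bits & MAN_MASK


def fields_to_float(sign, exp, man):
    """Reassemble float32 fields into a Python float."""
    bits = (sign << 31) | (exp << MAN_BITS) | man
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def tiered_mul(a, b, tune_bits=3):
    """Return (product, mode). Both approximate modes are illustrative assumptions."""
    sa, ea, ma = float_to_fields(a)
    sb, eb, mb = float_to_fields(b)
    # Zero, subnormal, infinity and NaN operands go straight to the exact path.
    if ea in (0, 255) or eb in (0, 255):
        return a * b, "exact"

    top = mb >> (MAN_BITS - tune_bits)   # leading bits of b's mantissa
    all_ones = (1 << tune_bits) - 1
    half = 1 << (tune_bits - 1)
    exp = ea + eb - BIAS

    if top == 0:
        # b is just above a power of two: treat its significand as 1.0.
        man, mode = ma, "basic"
    elif top == all_ones:
        # b is just below the next power of two: treat its significand as 2.0.
        exp, man, mode = exp + 1, ma, "basic"
    elif top in (half, half - 1):
        # b's significand is near 1.5: compute a * 1.5 with one shift-and-add.
        sig = (1 << MAN_BITS) | ma
        sig += sig >> 1
        if sig >> (MAN_BITS + 1):        # renormalize if the sum reached 2.0
            sig >>= 1
            exp += 1
        man, mode = sig & MAN_MASK, "intermediate"
    else:
        return a * b, "exact"            # no cheap approximation applies

    if not 0 < exp < 255:                # overflow or underflow: use the exact path
        return a * b, "exact"
    return fields_to_float(sa ^ sb, exp, man), mode
```

For example, `tiered_mul(5.0, 1.99)` takes the assumed basic mode and returns 10.0 where the exact product is 9.95, while `tiered_mul(3.0, 1.3)` falls through to the exact path. In this model, a larger `tune_bits` sends fewer operands to the approximate modes and tightens their error, which mirrors the accuracy-dependent mode selection the abstract describes.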


Pages: 346-358

Page count: 13
