Refining instruction set architecture for high-performance multimedia processing in constrained environments

Cited by: 6
Authors
Lee, RB [1]
Fiskiran, AM [1]
Shi, ZJ [1]
Yang, M [1]
Affiliations
[1] Princeton Univ, Dept Elect Engn, PALMS, Princeton, NJ 08544 USA
Keywords
DOI
10.1109/ASAP.2002.1030724
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Multimedia processing in software has been significantly accelerated by the addition of subword-parallel instructions to the instruction set architectures (ISAs) of modern microprocessors. While some of these multimedia instructions are simple and effective, others are very complex, requiring large, special-purpose functional units that are not practical for constrained environments such as handheld multimedia information appliances. For such environments, low power and low cost are as important as the high performance required for real-time multimedia processing and the general-purpose programmability required to support an ever-growing range of applications. In this paper, we introduce PLX, a concise ISA that selects the most useful features from the first two generations of multimedia instructions added to microprocessors, and explores new ISA features for high-performance yet low-cost multimedia processing with small-footprint processors. PLX is unique in that it is designed from scratch as a fully subword-parallel architecture with novel features like datapath scalability from 32-bit to 128-bit words, and a new definition of predication for reducing conditional branches. We illustrate the use of PLX's architectural features with four frequently used multimedia kernels: discrete cosine transform, pixel padding, clip test, and median filter. Our performance results show that a 64-bit PLX implementation achieves significant speedups compared to a basic 64-bit RISC processor and to IA-32 processors with MMX and SSE multimedia extensions. PLX's datapath scalability feature often provides an additional 2x speedup in a cost-effective way.
Pages: 253 - 264
Page count: 12
Related papers
50 in total
  • [1] PLX: An Instruction Set Architecture and Testbed for Multimedia Information Processing
    Ruby B. Lee
    A. Murat Fiskiran
    Journal of VLSI signal processing systems for signal, image and video technology, 2005, 40 : 85 - 108
  • [2] PLX: An instruction set architecture and testbed for multimedia information processing
    Lee, RB
    Fiskiran, AM
    JOURNAL OF VLSI SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2005, 40 (01): 85 - 108
  • [3] High-performance extendable instruction set computing
    Lee, H
    Beckett, P
    Appelbe, B
    PROCEEDINGS OF THE 6TH AUSTRALASIAN COMPUTER SYSTEMS ARCHITECTURE CONFERENCE, ACSAC 2001, 2001, 23 (04): 89 - 94
  • [4] General architecture and instruction set enhancements for multimedia applications
    Assaf, Mansour
    Rajesh, Aparna
    WMSCI 2007: 11TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL II, PROCEEDINGS, 2007: 223 - 228
  • [5] PLX: A fully subword-parallel instruction set architecture for fast scalable multimedia processing
    Lee, RB
    Fiskiran, AM
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002: A117 - A120
  • [6] Performance scalability of multimedia instruction set extensions
    Cheresiz, D
    Juurlink, B
    Vassiliadis, S
    Wijshoff, H
    EURO-PAR 2002 PARALLEL PROCESSING, PROCEEDINGS, 2002, 2400 : 849 - 859
  • [7] Instruction set architecture enhancements for video processing
    van de Waerdt, JW
    Vassiliadis, S
    16TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURE AND PROCESSORS, PROCEEDINGS, 2005: 146 - 153
  • [8] Multimedia-application-driven instruction set architecture simulation
    Barbieri, I
    Bariani, M
    Cabitto, A
    Raggio, M
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOL I AND II, PROCEEDINGS, 2002: A169 - A172
  • [10] High-performance VLSI architecture for video processing
    Navarro, H
    Montiel-Nelson, JA
    Sosa, J
    García, JC
    Sarmiento, R
    Nooshabadi, S
    VLSI CIRCUITS AND SYSTEMS, 2003, 5117 : 175 - 186