Decoupled Processors Architecture for Accelerating Data Intensive Applications using Scratch-Pad Memory Hierarchy

Authors
Milidonis, Athanasios [1 ]
Alachiotis, Nikolaos [1 ]
Porpodas, Vasileios [1 ]
Michail, Harris [1 ]
Panagiotakopoulos, Georgios [1 ]
Kakarountas, Athanasios P. [1 ]
Goutis, Costas E. [1 ]
Affiliations
[1] Univ Patras, VLSI Design Lab, Dept Elect & Comp Engn, Patras, Greece
Keywords
Decoupled; Scratch pad;
DOI
10.1007/s11265-009-0393-9
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We present an architecture of decoupled processors with a memory hierarchy consisting only of scratch-pad memories and a main memory. This architecture exploits the more efficient prefetching of decoupled processors, which make use of the parallelism between address computation and application data processing that mainly exists in streaming applications. This benefit, combined with the ability of scratch-pad memories to store data with no conflict misses and low energy per access, contributes significantly to increasing the system's performance. The application code is split into two parallel programs. The first runs on the Access processor and computes the addresses of the data in the memory hierarchy. The second processes the application data and runs on the Execute processor, a processor with a limited address space: just the register file addresses. Each transfer of any block in the memory hierarchy up to the Execute processor's register file is controlled by the Access processor and the DMA units. This strongly differentiates this architecture from traditional uniprocessors and existing decoupled processors with cache memory hierarchies. The architecture is compared in performance with uniprocessor architectures with (a) scratch-pad and (b) cache memory hierarchies and with (c) existing decoupled architectures, and it shows higher normalized performance. The reason for this gain is the efficiency of data transfer that the scratch-pad memory hierarchy provides, combined with the ability of the decoupled processors to eliminate memory latency by using memory management techniques for transferring data instead of fixed prefetching methods. Experimental results show that performance increases by up to almost 2 times compared to uniprocessor architectures with scratch-pad memories and by up to 3.7 times compared to those with caches. The proposed architecture achieves this performance without penalties in energy-delay product.
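The split between address computation and data processing described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the strided access pattern, and the use of a software queue in place of DMA transfers into the register file are all assumptions made for illustration, and the two programs run one after the other here rather than in parallel on separate processors.

```python
from collections import deque


def access_program(main_memory, start, stride, count, queue):
    """Hypothetical Access side: computes addresses and forwards the data."""
    for i in range(count):
        address = start + i * stride          # address computation lives only here
        queue.append(main_memory[address])    # stands in for a DMA transfer


def execute_program(queue):
    """Hypothetical Execute side: consumes operands and never sees an address."""
    total = 0
    while queue:
        total += queue.popleft()
    return total


def run_decoupled(main_memory, start, stride, count):
    """Runs both programs in sequence; real hardware would overlap them."""
    queue = deque()
    access_program(main_memory, start, stride, count, queue)
    return execute_program(queue)


if __name__ == "__main__":
    memory = list(range(100))
    print(run_decoupled(memory, start=0, stride=4, count=5))
```

The point of the sketch is the division of labor: only `access_program` handles addresses, while `execute_program` works purely on the operands delivered to it, mirroring the abstract's description of an Execute processor whose address space is limited to its register file.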
Pages: 281-296
Number of pages: 16