High-throughput and Flexible Host Networking for Accelerated Computing

Cited by: 0
Authors
Skiadopoulos, Athinagoras [1, 3]
Xie, Zhiqiang [1]
Zhao, Mark [1]
Cai, Qizhe [2]
Agarwal, Saksham [2]
Adelmann, Jacob [3]
Ahern, David [3]
Contavalli, Carlo [3]
Goldflam, Michael [3]
Mayatskikh, Vitaly [3]
Raja, Raghu [3, 4]
Walton, Daniel [3]
Agarwal, Rachit [2]
Mukherjee, Shrijeet [3]
Kozyrakis, Christos [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
[2] Cornell Univ, Ithaca, NY 14853 USA
[3] Enfabrica, Mountain View, CA USA
[4] Amazon Web Serv, Seattle, WA USA
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Modern network hardware is able to meet the stringent bandwidth demands of applications like GPU-accelerated AI. However, existing host network stacks offer a hard tradeoff between performance (in terms of sustained throughput when compared to network hardware capacity) and flexibility (in terms of the ability to select, customize, and extend different network protocols). This paper explores a clean-slate approach to simultaneously offer high performance and flexibility. We present a co-design of the NIC hardware and the software stack to achieve this. The key idea in our design is the physical separation of the data path (payload transfer between network and application buffers) and the control path (header processing and transport-layer decisions). The NIC enables a high-performance zero-copy data path, independent of the placement of the application (CPU, GPU, FPGA, or other accelerators). The software stack provides a flexible control path by enabling the integration of any network protocol, executing in any environment (in the kernel, in user space, or in an accelerator). We implement and evaluate ZeroNIC, a prototype that combines an FPGA-based NIC with a software stack that integrates the Linux TCP protocol. We demonstrate that ZeroNIC achieves RDMA-like throughput while maintaining the benefits of robust protocols like TCP under various network perturbations. For instance, ZeroNIC enables a single TCP flow to saturate a 100Gbps link while utilizing only 17% of a single CPU core. ZeroNIC improves NCCL and Redis throughput by 2.66x and 3.71x, respectively, over Linux TCP on a Mellanox ConnectX-6 NIC, without requiring application modifications.
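The separation of data path and control path described in the abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the ZeroNIC design: the class `ToyNic`, the fixed `HEADER_LEN`, and the packet layout are all hypothetical, and a Python `bytearray` stands in for an application buffer that, in the paper, could reside in CPU, GPU, or other accelerator memory.

```python
from dataclasses import dataclass, field

HEADER_LEN = 8  # hypothetical fixed header size for this toy packet format


@dataclass
class ToyNic:
    """Toy model: payloads go to the application buffer, headers to the control path."""
    app_buffer: bytearray                      # stand-in for the application's buffer
    control_queue: list = field(default_factory=list)
    write_offset: int = 0

    def receive(self, packet: bytes) -> None:
        view = memoryview(packet)              # slicing a memoryview does not copy
        header, payload = view[:HEADER_LEN], view[HEADER_LEN:]
        end = self.write_offset + len(payload)
        if end > len(self.app_buffer):
            raise BufferError("application buffer is full")
        # Data path: place the payload directly into the application buffer.
        self.app_buffer[self.write_offset:end] = payload
        # Control path: pass on only the header and where the payload landed.
        self.control_queue.append((bytes(header), self.write_offset, len(payload)))
        self.write_offset = end


nic = ToyNic(app_buffer=bytearray(16))
nic.receive(b"HDR00001" + b"abcd")
nic.receive(b"HDR00002" + b"efgh")
```

In this model the protocol logic (here, whatever consumes `control_queue`) never touches payload bytes, which is the property that lets the control path be swapped or relocated independently of the data path.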
Pages: 405-423
Page count: 19