50 items in total
- [1] Accelerating DNN Inference with GraphBLAS and the GPU. 2019 IEEE High Performance Extreme Computing Conference (HPEC), 2019.
- [2] Efficient Adaptive Batching of DNN Inference Services for Improved Latency. 38th International Conference on Information Networking (ICOIN 2024), 2024: 197-200.
- [4] Balanced Sparsity for Efficient DNN Inference on GPU. Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, 2019: 5676-5683.
- [6] Performance Characterization of Containerized DNN Training and Inference on Edge Accelerators. 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC 2023), 2023: 127-131.
- [7] TFApprox: Towards a Fast Emulation of DNN Approximate Hardware Accelerators on GPU. Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE 2020), 2020: 294-297.
- [8] HarmonyBatch: Batching multi-SLO DNN Inference with Heterogeneous Serverless Functions. 2024 IEEE/ACM 32nd International Symposium on Quality of Service (IWQoS), 2024.
- [9] Occamy: Memory-efficient GPU Compiler for DNN Inference. 2023 60th ACM/IEEE Design Automation Conference (DAC), 2023.
- [10] Mixed Precision Quantization for ReRAM-based DNN Inference Accelerators. 2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC), 2021: 372-377.