- [21] Performance Efficient Layer-aware DNN Inference Task Scheduling in GPU Cluster. In: 2022 IEEE Global Communications Conference (GLOBECOM 2022), 2022, pp. 2242-2247.
- [22] Towards Fast GPU-based Sparse DNN Inference: A Hybrid Compute Model. In: 2022 IEEE High Performance Extreme Computing Virtual Conference (HPEC), 2022.
- [23] Energy Profiling of DNN Accelerators. In: 2023 26th Euromicro Conference on Digital System Design (DSD 2023), 2023, pp. 53-60.
- [25] Automated Runtime-Aware Scheduling for Multi-Tenant DNN Inference on GPU. In: 2021 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2021.
- [26] Jily: Cost-Aware AutoScaling of Heterogeneous GPU for DNN Inference in Public Cloud. In: 2019 IEEE 38th International Performance Computing and Communications Conference (IPCCC), 2019.
- [27] Model sharing for GPU-accelerated DNN inference in big data processing systems. Qinghua Daxue Xuebao/Journal of Tsinghua University, 2022, 62(09): 1435-1441.
- [28] Exploring In-Memory Accelerators and FPGAs for Latency-Sensitive DNN Inference on Edge Servers. In: 2024 IEEE Cloud Summit, 2024, pp. 1-6.