50 entries in total
- [31] Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 2019.
- [32] Efficient Tuning and Inference for Large Language Models on Textual Graphs. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI 2024), 2024: 5734-5742.
- [33] Energy Efficient Data Collection in Large-Scale Internet of Things via Computation Offloading. IEEE Internet of Things Journal, 2019, 6(3): 4176-4187.
- [34] An Efficient Computation Offloading Architecture for the Internet of Things (IoT) Devices. 2017 14th IEEE Annual Consumer Communications & Networking Conference (CCNC), 2017: 728-731.
- [35] Distributed Inference and Fine-tuning of Large Language Models Over The Internet. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023.
- [37] Large Language Models (LLMs) Inference Offloading and Resource Allocation in Cloud-Edge Networks: An Active Inference Approach. 2023 IEEE 98th Vehicular Technology Conference (VTC2023-Fall), 2023.
- [39] DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale. International Conference on Machine Learning (ICML), Vol. 162, 2022.