- [1] Adaptive Contrastive Knowledge Distillation for BERT Compression. Findings of the Association for Computational Linguistics: ACL 2023, 2023: 8941-8953
- [3] Multi-Granularity Structural Knowledge Distillation for Language Model Compression. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022: 1001-1011
- [5] Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method. Findings of the Association for Computational Linguistics: ACL 2023, 2023: 9678-9696
- [7] AD-KD: Attribution-Driven Knowledge Distillation for Language Model Compression. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023: 8449-8465
- [8] HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021), 2021: 3126-3136
- [9] From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression. Thirty-Sixth AAAI Conference on Artificial Intelligence / Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence / Twelfth Symposium on Educational Advances in Artificial Intelligence, 2022: 11547-11555
- [10] Fine-Tuning via Mask Language Model Enhanced Representations Based Contrastive Learning and Application. Computer Engineering and Applications, 60(17): 129-138