Learning to Unlearn: Instance-Wise Unlearning for Pre-trained Classifiers

Cited by: 0
Authors
Cha, Sungmin [1 ]
Cho, Sungjun [2 ]
Hwang, Dasol [2 ]
Lee, Honglak [2 ]
Moon, Taesup [3 ]
Lee, Moontae [2 ,4 ]
Affiliations
[1] New York Univ, New York, NY USA
[2] LG AI Res, Seoul, South Korea
[3] Seoul Natl Univ, INMC, ASRI, Seoul, South Korea
[4] Univ Illinois, Chicago, IL USA
Keywords
DOI: not available
CLC number: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Since the recent advent of regulations for data protection (e.g., the General Data Protection Regulation), there has been increasing demand for deleting information learned from sensitive data in pre-trained models without retraining from scratch. The inherent vulnerability of neural networks to adversarial attacks and unfairness also calls for a robust method to remove or correct information in an instance-wise fashion, while retaining predictive performance across the remaining data. To this end, we consider instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model by either misclassifying each instance away from its original prediction or relabeling the instance to a different label. We also propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation level, and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information. Both methods require only the pre-trained model and the data instances to forget, allowing painless application to real-life settings where the entire training set is unavailable. Through extensive experimentation on various image classification benchmarks, we show that our approach effectively preserves knowledge of remaining data while unlearning given instances in both single-task and continual unlearning scenarios.
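To make the abstract's recipe concrete, the following is a minimal PyTorch sketch of one way such an instance-wise unlearning step could look; it is an illustrative assumption, not the authors' released implementation. Forget instances are pushed away from their original labels, PGD-style adversarial examples built from those same instances act as representation-level anchors against forgetting, and a simple gradient-magnitude score stands in for the weight-importance metric. All names (make_adversarial, weight_importance, unlearn), the toy model, and the hyperparameters are hypothetical.

# Minimal sketch of instance-wise unlearning as described in the abstract.
# Assumptions: PGD adversarial examples as representation anchors and a
# gradient-magnitude proxy for weight importance; values are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_adversarial(model, x, y, eps=8 / 255, alpha=2 / 255, steps=5):
    """PGD-style attack on the forget instances; the resulting (x_adv, adversarial
    label) pairs serve as surrogate 'remaining' data that anchor the representation."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.clamp(x + torch.clamp(x_adv - x, -eps, eps), 0.0, 1.0)
    with torch.no_grad():
        adv_labels = model(x_adv).argmax(dim=1)
    return x_adv.detach(), adv_labels

def weight_importance(model, x, y):
    """Gradient-magnitude proxy for how strongly each parameter contributes to
    predicting the forget instances (one of several possible importance metrics)."""
    model.zero_grad()
    F.cross_entropy(model(x), y).backward()
    return {n: p.grad.detach().abs() for n, p in model.named_parameters() if p.grad is not None}

def unlearn(model, forget_x, forget_y, steps=50, lr=1e-3, lam=1.0):
    x_adv, y_adv = make_adversarial(model, forget_x, forget_y)
    importance = weight_importance(model, forget_x, forget_y)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # Push forget instances away from their original labels
        # (gradient ascent on their cross-entropy) ...
        forget_loss = -F.cross_entropy(model(forget_x), forget_y)
        # ... while pinning predictions on the adversarial surrogates,
        # which limits representation-level forgetting of the remaining data.
        retain_loss = F.cross_entropy(model(x_adv), y_adv)
        (forget_loss + lam * retain_loss).backward()
        # Concentrate updates on parameters deemed responsible for the
        # forget instances by rescaling gradients with their importance.
        for n, p in model.named_parameters():
            if p.grad is not None and n in importance:
                p.grad.mul_(importance[n] / (importance[n].max() + 1e-12))
        opt.step()
    return model

if __name__ == "__main__":
    # Toy stand-in for a pre-trained classifier and a handful of instances to forget.
    torch.manual_seed(0)
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))
    forget_x = torch.rand(8, 3, 32, 32)
    forget_y = torch.randint(0, 10, (8,))
    unlearn(model, forget_x, forget_y)

Note that only the forget instances (and the pre-trained weights) appear above, matching the abstract's claim that the full training set is not required; the adversarial surrogates play the role that retained data would otherwise play.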
Pages: 11186 - 11194 (9 pages)
Related papers (50 records)
  • [31] Sequential Brain CT Image Captioning Based on the Pre-Trained Classifiers and a Language Model
    Kong, Jin-Woo
    Oh, Byoung-Doo
    Kim, Chulho
    Kim, Yu-Seop
    APPLIED SCIENCES-BASEL, 2024, 14 (03):
  • [32] Pre-trained Visual Dynamics Representations for Efficient Policy Learning
Luo, Hao
    Zhou, Bohan
    Lu, Zongqing
    COMPUTER VISION - ECCV 2024, PT LXXXI, 2025, 15139 : 249 - 267
  • [33] RanPAC: Random Projections and Pre-trained Models for Continual Learning
    McDonnell, Mark D.
    Gong, Dong
Parvaneh, Amin
    Abbasnejad, Ehsan
    van den Hengel, Anton
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [34] CODEEDITOR: Learning to Edit Source Code with Pre-trained Models
    Li, Jia
    Li, Ge
    Li, Zhuo
    Jin, Zhi
    Hu, Xing
    Zhang, Kechi
    Fu, Zhiyi
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2023, 32 (06)
  • [35] On Pre-trained Image Features and Synthetic Images for Deep Learning
    Hinterstoisser, Stefan
    Lepetit, Vincent
    Wohlhart, Paul
    Konolige, Kurt
    COMPUTER VISION - ECCV 2018 WORKSHOPS, PT I, 2019, 11129 : 682 - 697
  • [36] Zero-shot Learning for Subdiscrimination in Pre-trained Models
    Dominguez-Mateos, Francisco
    O'Brien, Vincent
    Garland, James
    Furlong, Ryan
    Palacios-Alonso, Daniel
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2025, 31 (01) : 93 - 110
  • [37] AWEncoder: Adversarial Watermarking Pre-Trained Encoders in Contrastive Learning
    Zhang, Tianxing
    Wu, Hanzhou
    Lu, Xiaofeng
    Han, Gengle
    Sun, Guangling
    APPLIED SCIENCES-BASEL, 2023, 13 (06):
  • [38] Collaborative Learning across Heterogeneous Systems with Pre-Trained Models
    Hoang, Trong Nghia
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 20, 2024, : 22668 - 22668
  • [39] Learning and Evaluating a Differentially Private Pre-trained Language Model
    Hoory, Shlomo
    Feder, Amir
    Tendler, Avichai
    Cohen, Alon
    Erell, Sofia
    Laish, Itay
    Nakhost, Hootan
    Stemmer, Uri
    Benjamini, Ayelet
    Hassidim, Avinatan
    Matias, Yossi
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1178 - 1189
  • [40] Meta Distant Transfer Learning for Pre-trained Language Models
    Wang, Chengyu
    Pan, Haojie
    Qiu, Minghui
    Yang, Fei
    Huang, Jun
    Zhang, Yin
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9742 - 9752