Learning to Unlearn: Instance-Wise Unlearning for Pre-trained Classifiers

Cited: 0
Authors
Cha, Sungmin [1 ]
Cho, Sungjun [2 ]
Hwang, Dasol [2 ]
Lee, Honglak [2 ]
Moon, Taesup [3 ]
Lee, Moontae [2 ,4 ]
Affiliations
[1] New York Univ, New York, NY USA
[2] LG AI Res, Seoul, South Korea
[3] Seoul Natl Univ, INMC, ASRI, Seoul, South Korea
[4] Univ Illinois, Chicago, IL USA
Keywords
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Since the recent advent of regulations for data protection (e.g., the General Data Protection Regulation), there has been increasing demand for deleting information learned from sensitive data in pre-trained models without retraining from scratch. The inherent vulnerability of neural networks to adversarial attacks and unfairness also calls for a robust method to remove or correct information in an instance-wise fashion, while retaining predictive performance on the remaining data. To this end, we consider instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model, by either misclassifying each instance away from its original prediction or relabeling the instance with a different label. We also propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation level, and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information. Both methods only require the pre-trained model and the data instances to forget, allowing painless application to real-life settings where the entire training set is unavailable. Through extensive experiments on various image classification benchmarks, we show that our approach effectively preserves knowledge of the remaining data while unlearning given instances in both single-task and continual unlearning scenarios.
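As a rough illustration of the objective the abstract describes, the PyTorch sketch below combines three terms: a forget term that pushes each forget instance away from its original prediction (here via gradient ascent on cross-entropy, one of the two options mentioned; the other is relabeling), an adversarial regularizer that uses FGSM examples built from the forget instances as stand-ins for the unavailable remaining data, and a quadratic penalty on important weights. Everything here (function names, the lam_* weights, the uniform importance placeholder) is an assumption for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of an instance-wise unlearning step:
# (1) push forget instances away from their original predictions,
# (2) regularize with adversarial proxies for the unavailable remaining data,
# (3) penalize drift in important weights.
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


def fgsm_adversarial_examples(frozen_model, x, y, eps=0.03):
    """FGSM adversarial examples built from the forget instances.

    The frozen pre-trained model assigns them non-original labels, so the
    pairs (x_adv, y_adv) serve as stand-ins for remaining data near the
    decision boundary (an assumed proxy for the paper's adversarial
    regularizer).
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(frozen_model(x_adv), y)
    (grad,) = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).detach()
    with torch.no_grad():
        y_adv = frozen_model(x_adv).argmax(dim=1)
    return x_adv, y_adv


def unlearning_step(model, frozen_model, x_forget, y_orig,
                    importance, theta_star, lam_adv=1.0, lam_imp=1.0):
    # 1) Unlearn: gradient ascent on cross-entropy w.r.t. original labels
    #    (the "misclassify away from the original prediction" option).
    loss = -F.cross_entropy(model(x_forget), y_orig)
    # 2) Retain: keep adversarial proxies classified as the frozen model did.
    x_adv, y_adv = fgsm_adversarial_examples(frozen_model, x_forget, y_orig)
    loss = loss + lam_adv * F.cross_entropy(model(x_adv), y_adv)
    # 3) Protect: quadratic penalty tying important parameters to their
    #    pre-trained values (Fisher-style weight importance).
    for name, p in model.named_parameters():
        loss = loss + lam_imp * (importance[name]
                                 * (p - theta_star[name]).pow(2)).sum()
    return loss


# Toy usage on random data (assumed setup, for illustration only).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
frozen = copy.deepcopy(model).eval()
theta_star = {n: p.detach().clone() for n, p in model.named_parameters()}
importance = {n: torch.ones_like(p)  # placeholder for a real importance metric
              for n, p in model.named_parameters()}
x_forget = torch.randn(8, 3, 32, 32)
with torch.no_grad():
    y_orig = frozen(x_forget).argmax(dim=1)

opt = torch.optim.SGD(model.parameters(), lr=1e-2)
for _ in range(10):
    opt.zero_grad()
    unlearning_step(model, frozen, x_forget, y_orig,
                    importance, theta_star).backward()
    opt.step()
```

Note that only the pre-trained model and the forget instances appear in the loop, matching the abstract's claim that the full training set is not required.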
Pages: 11186-11194
Page count: 9
Related Papers
50 results in total
  • [1] Neural Network Surgery: Injecting Data Patterns into Pre-trained Models with Minimal Instance-wise Side Effects
    Zhang, Zhiyuan
    Ren, Xuancheng
    Su, Qi
    Sun, Xu
    He, Bin
    2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 5453 - 5466
  • [2] Learning to Transform for Generalizable Instance-wise Invariance
    Singhal, Utkarsh
    Esteves, Carlos
    Makadia, Ameesh
    Yu, Stella X.
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 6188 - 6198
  • [3] Feature Unlearning for Pre-trained GANs and VAEs
    Moon, Saemi
    Cho, Seunghyuk
    Kim, Dongwoo
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 19, 2024, : 21420 - 21428
  • [4] Learning Instance-wise Sparsity for Accelerating Deep Models
    Liu, Chuanjian
    Wang, Yunhe
    Han, Kai
    Xu, Chunjing
    Xu, Chang
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3001 - 3007
  • [5] Implicit Stereotypes in Pre-Trained Classifiers
    Dehouche, Nassim
    IEEE ACCESS, 2021, 9 : 167936 - 167947
  • [6] Instance-wise multi-view representation learning
    Li, Dan
    Wang, Haibao
    Wang, Yufeng
    Wang, Shengpei
    INFORMATION FUSION, 2023, 91 : 612 - 622
  • [7] Machine Unlearning of Pre-trained Large Language Models
    Yao, Jin
    Chien, Eli
    Du, Minxin
    Niu, Xinyao
    Wang, Tianhao
    Cheng, Zezhou
    Yue, Xiang
    PROCEEDINGS OF THE 62ND ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1: LONG PAPERS, 2024, : 8403 - 8419
  • [8] Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning
    Chan, Alex J.
    van der Schaar, Mihaela
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [9] Online Domain Adaptation of a Pre-Trained Cascade of Classifiers
    Jain, Vidit
    Learned-Miller, Erik
    2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011, : 577 - 584
  • [10] Class-wise and instance-wise contrastive learning for zero-shot learning based on VAEGAN
    Zheng, Baolong
    Li, Zhanshan
    Li, Jingyao
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 272