Shielded Representations: Protecting Sensitive Attributes Through Iterative Gradient-Based Projection

Authors
Iskander, Shadi [1]
Radinsky, Kira [1]
Belinkov, Yonatan [1]
Affiliations
[1] Technion - Israel Institute of Technology, Haifa, Israel
Funding
Israel Science Foundation
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural language processing models tend to learn and encode social biases present in the data. One popular approach for addressing such biases is to eliminate encoded information from the model's representations. However, current methods are restricted to removing only linearly encoded information. In this work, we propose Iterative Gradient-Based Projection (IGBP), a novel method for removing non-linearly encoded concepts from neural representations. Our method consists of iteratively training neural classifiers to predict a particular attribute we seek to eliminate, followed by a projection of the representation onto a hypersurface, such that the classifiers become oblivious to the target attribute. We evaluate the effectiveness of our method on the task of removing gender and race information as sensitive attributes. Our results demonstrate that IGBP is effective in mitigating bias through intrinsic and extrinsic evaluations, with minimal impact on downstream task accuracy.
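The loop described in the abstract (train a classifier for the protected attribute, project the representations onto a hypersurface where that classifier is uninformative, repeat) can be illustrated with a minimal NumPy sketch. This is one plausible reading of the abstract, not the authors' implementation. The abstract does not state the projection rule, so the sketch assumes a first-order step onto the classifier's decision boundary f(x) = 0, namely x' = x - f(x) ∇f(x) / ||∇f(x)||². The one-hidden-layer classifier, the hyperparameters, and all function names (`init_mlp`, `logit`, `logit_grad`, `train`, `project_to_boundary`, `igbp_sketch`) are hypothetical.

```python
import numpy as np

def init_mlp(dim, hidden, rng):
    # Hypothetical one-hidden-layer classifier; the architecture is an assumption.
    return {"W1": rng.normal(0, 0.5, (dim, hidden)), "b1": np.zeros(hidden),
            "w2": rng.normal(0, 0.5, hidden), "b2": 0.0}

def logit(p, X):
    """Classifier logit f(x); f(x) = 0 is the decision hypersurface."""
    return np.tanh(X @ p["W1"] + p["b1"]) @ p["w2"] + p["b2"]

def logit_grad(p, X):
    """Gradient of f with respect to each input row, shape (n, dim)."""
    H = np.tanh(X @ p["W1"] + p["b1"])
    return ((1.0 - H**2) * p["w2"]) @ p["W1"].T

def train(p, X, z, lr=0.1, steps=300):
    """Full-batch gradient descent on the logistic loss for binary labels z."""
    n = len(X)
    for _ in range(steps):
        H = np.tanh(X @ p["W1"] + p["b1"])
        s = np.clip(H @ p["w2"] + p["b2"], -30.0, 30.0)
        d = (1.0 / (1.0 + np.exp(-s)) - z) / n      # dLoss/dlogit, shape (n,)
        dH = np.outer(d, p["w2"]) * (1.0 - H**2)    # backprop through tanh
        p["w2"] -= lr * (H.T @ d)
        p["b2"] -= lr * d.sum()
        p["W1"] -= lr * (X.T @ dH)
        p["b1"] -= lr * dH.sum(axis=0)
    return p

def project_to_boundary(X, f, grad_f, eps=1e-12):
    """Assumed first-order projection of each row onto the surface f(x) = 0."""
    g = grad_f(X)
    scale = f(X) / (np.sum(g * g, axis=1) + eps)
    return X - scale[:, None] * g

def igbp_sketch(X, z, rounds=5, hidden=16, seed=0):
    """Alternate between fitting a fresh classifier and projecting the data."""
    rng = np.random.default_rng(seed)
    X = X.copy()
    for _ in range(rounds):
        p = train(init_mlp(X.shape[1], hidden, rng), X, z)
        X = project_to_boundary(X, lambda A: logit(p, A),
                                lambda A: logit_grad(p, A))
    return X
```

For a linear classifier the assumed step lands exactly on the boundary; for a non-linear one it is only a first-order approximation, which is one reason a fresh classifier would be retrained on the projected representations in each round.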
Pages: 5961-5977
Page count: 17