Shielded Representations: Protecting Sensitive Attributes Through Iterative Gradient-Based Projection

Authors
Iskander, Shadi [1]
Radinsky, Kira [1]
Belinkov, Yonatan [1]
Affiliation
[1] Technion Israel Inst Technol, Haifa, Israel
Funding
Israel Science Foundation
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural language processing models tend to learn and encode social biases present in the data. One popular approach for addressing such biases is to eliminate encoded information from the model's representations. However, current methods are restricted to removing only linearly encoded information. In this work, we propose Iterative Gradient-Based Projection (IGBP), a novel method for removing non-linearly encoded concepts from neural representations. Our method consists of iteratively training neural classifiers to predict a particular attribute we seek to eliminate, followed by a projection of the representation onto a hypersurface, such that the classifiers become oblivious to the target attribute. We evaluate the effectiveness of our method on the task of removing gender and race information as sensitive attributes. Intrinsic and extrinsic evaluations show that IGBP is effective in mitigating bias, with minimal impact on downstream task accuracy.
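The train-then-project loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a binary attribute, a one-hidden-layer tanh classifier, and a first-order projection onto the classifier's decision boundary f(x) = 0, i.e. x' = x - f(x) * grad f(x) / ||grad f(x)||^2. The function names (init_mlp, logit_and_grad, train, project, igbp), the hyperparameters, and the fixed iteration count are all hypothetical choices made for this example.

```python
import numpy as np

def init_mlp(dim, hidden, rng):
    # Hypothetical one-hidden-layer classifier: f(x) = w2 . tanh(x W1 + b1) + b2
    return {"W1": rng.normal(0, 0.5, (dim, hidden)), "b1": np.zeros(hidden),
            "w2": rng.normal(0, 0.5, hidden), "b2": 0.0}

def logit_and_grad(p, X):
    """Return the logit f(x) and its input gradient df/dx for each row of X."""
    H = np.tanh(X @ p["W1"] + p["b1"])            # (n, hidden)
    f = H @ p["w2"] + p["b2"]                     # (n,)
    grad = ((1.0 - H**2) * p["w2"]) @ p["W1"].T   # (n, dim)
    return f, grad

def train(p, X, z, lr=0.1, steps=300):
    """Fit the classifier to predict attribute z (0/1) with plain gradient descent."""
    n = len(X)
    for _ in range(steps):
        H = np.tanh(X @ p["W1"] + p["b1"])
        f = np.clip(H @ p["w2"] + p["b2"], -30, 30)
        d = (1.0 / (1.0 + np.exp(-f)) - z) / n    # d(cross-entropy)/d(logit)
        dH = np.outer(d, p["w2"]) * (1.0 - H**2)  # backprop through tanh
        p["w2"] -= lr * (H.T @ d)
        p["b2"] -= lr * d.sum()
        p["W1"] -= lr * (X.T @ dH)
        p["b1"] -= lr * dH.sum(axis=0)
    return p

def project(X, f_and_grad, eps=1e-8):
    """First-order projection of each row onto the surface f(x) = 0 (assumed form)."""
    f, g = f_and_grad(X)
    scale = f / (np.sum(g**2, axis=1) + eps)
    return X - scale[:, None] * g

def igbp(X, z, iterations=5, hidden=16, steps=300, seed=0):
    """Alternate between training a fresh classifier and projecting the data."""
    rng = np.random.default_rng(seed)
    X = X.copy()
    for _ in range(iterations):
        p = train(init_mlp(X.shape[1], hidden, rng), X, z, steps=steps)
        X = project(X, lambda A: logit_and_grad(p, A))
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    z = (X[:, 0] * X[:, 1] > 0).astype(float)     # a non-linearly encoded attribute
    X_clean = igbp(X, z)
    print(X_clean.shape)
```

For a linear classifier the projection step lands exactly on the decision boundary; for a non-linear classifier it is only a first-order approximation, which is one reason to repeat the train-and-project cycle. How many iterations to run and when to stop are left open here, since the abstract does not specify them.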
Pages: 5961-5977
Page count: 17