Shielded Representations: Protecting Sensitive Attributes Through Iterative Gradient-Based Projection

Cited by: 0
Authors
Iskander, Shadi [1]
Radinsky, Kira [1]
Belinkov, Yonatan [1]
Affiliation
[1] Technion - Israel Institute of Technology, Haifa, Israel
Funding
Israel Science Foundation
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Natural language processing models tend to learn and encode social biases present in the data. One popular approach for addressing such biases is to eliminate encoded information from the model's representations. However, current methods are restricted to removing only linearly encoded information. In this work, we propose Iterative Gradient-Based Projection (IGBP), a novel method for removing non-linearly encoded concepts from neural representations. Our method consists of iteratively training neural classifiers to predict a particular attribute we seek to eliminate, followed by a projection of the representation onto a hypersurface, such that the classifiers become oblivious to the target attribute. We evaluate the effectiveness of our method on the task of removing gender and race information as sensitive attributes. Our results demonstrate that IGBP is effective in mitigating bias through intrinsic and extrinsic evaluations, with minimal impact on downstream task accuracy.
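
The train-then-project loop described in the abstract can be illustrated with a small sketch. The abstract does not give the projection formula, so this sketch assumes a first-order step toward the probe's decision boundary, x' = x - f(x) * grad f(x) / ||grad f(x)||^2, where f is the probe's logit. The quadratic probe, the toy ring data, and every function name below are hypothetical choices made for illustration, not the authors' implementation, which trains neural classifiers on model representations.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def logit(params, x):
    # Hypothetical non-linear probe: f(x) = sum_i (a_i*x_i^2 + w_i*x_i) + b.
    a, w, b = params
    return sum(ai * xi * xi + wi * xi for ai, wi, xi in zip(a, w, x)) + b


def grad_x(params, x):
    # Gradient of the probe logit with respect to the representation x.
    a, w, _ = params
    return [2.0 * ai * xi + wi for ai, wi, xi in zip(a, w, x)]


def train_probe(xs, zs, epochs=300, lr=0.1):
    # Fit the probe to the binary attribute z by full-batch gradient
    # descent on the logistic loss.
    dim, n = len(xs[0]), len(xs)
    a, w, b = [0.0] * dim, [0.0] * dim, 0.0
    for _ in range(epochs):
        ga, gw, gb = [0.0] * dim, [0.0] * dim, 0.0
        for x, z in zip(xs, zs):
            err = sigmoid(logit((a, w, b), x)) - z
            for i, xi in enumerate(x):
                ga[i] += err * xi * xi
                gw[i] += err * xi
            gb += err
        a = [ai - lr * g / n for ai, g in zip(a, ga)]
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return a, w, b


def project(params, x, eps=1e-12):
    # Assumed first-order step toward the hypersurface f(x) = 0:
    # x' = x - f(x) * grad f(x) / ||grad f(x)||^2.
    f, g = logit(params, x), grad_x(params, x)
    sq = sum(gi * gi for gi in g)
    if sq < eps:  # gradient vanishes, so leave the point unchanged
        return list(x)
    return [xi - f * gi / sq for xi, gi in zip(x, g)]


def accuracy(params, xs, zs):
    # Fraction of points whose attribute the probe predicts correctly.
    hits = sum((logit(params, x) > 0) == (z == 1) for x, z in zip(xs, zs))
    return hits / len(xs)


def igbp_sketch(xs, zs, n_iters=3):
    # Alternate between training a fresh probe and projecting every point.
    history = []
    for _ in range(n_iters):
        probe = train_probe(xs, zs)
        history.append(accuracy(probe, xs, zs))  # accuracy before projecting
        xs = [project(probe, x) for x in xs]
    return xs, history


def ring_data(n=100, seed=0):
    # Toy data: the attribute is encoded in the radius, not in a direction.
    rng = random.Random(seed)
    xs, zs = [], []
    for i in range(n):
        z = i % 2
        r = (1.0 if z == 0 else 2.0) + rng.uniform(-0.1, 0.1)
        t = rng.uniform(0.0, 2.0 * math.pi)
        xs.append([r * math.cos(t), r * math.sin(t)])
        zs.append(z)
    return xs, zs
```

Calling igbp_sketch(*ring_data()) returns the projected points together with the probe accuracy measured before each projection round. For a probe that is linear in x, the step lands exactly on the decision boundary; for a non-linear probe it is only a local approximation, which is one reason to repeat the train-and-project cycle.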
Pages: 5961-5977
Page count: 17