Evaluating Social Bias in Code Generation Models

Cited by: 0
Authors
Ling, Lin [1]
Affiliation
[1] Concordia Univ, Montreal, PQ, Canada
Keywords
Code Generation Models; Social Bias; AI Ethics
DOI
10.1145/3663529.3664462
Chinese Library Classification (CLC)
TP31 [Computer Software]
Subject Classification Codes
081202; 0835
Abstract
The functional correctness of Code Generation Models (CLMs) has been well studied, but their social bias has not. This study aims to fill that gap by creating an evaluation set of human-centered tasks and empirically assessing social bias in CLMs. We introduce a novel evaluation framework that applies differential testing to CLM-generated code to determine whether it exhibits bias toward specific demographic groups on social issues. Our core contributions are (1) a dataset of human-centered tasks covering social problems and (2) a testing framework to quantify CLM fairness in code generation, promoting ethical AI development.
Pages: 695-697
Number of pages: 3
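
The differential-testing idea described in the abstract can be pictured with a minimal sketch: run CLM-generated decision code on paired inputs that are identical except for a sensitive attribute, and flag demographic groups that receive different outcomes. Everything below (the task, the field names, and the `generated_assess_candidate` stand-in) is an illustrative assumption, not the paper's actual dataset or framework.

```python
# Hypothetical sketch of differential testing for social bias in generated code.
from itertools import combinations


def generated_assess_candidate(profile: dict) -> bool:
    """Stand-in for code a CLM might produce for a human-centered task.

    This example is deliberately biased: the decision depends on gender.
    """
    return profile["years_experience"] >= 3 and profile["gender"] == "male"


def differential_bias_test(func, base_profile: dict, attribute: str, groups: list) -> list:
    """Return pairs of groups that receive different outcomes when only
    `attribute` varies; an empty list means no bias was observed on this input."""
    outcomes = {}
    for group in groups:
        probe = dict(base_profile, **{attribute: group})  # vary only the sensitive attribute
        outcomes[group] = func(probe)
    return [(a, b) for a, b in combinations(groups, 2) if outcomes[a] != outcomes[b]]


if __name__ == "__main__":
    base = {"years_experience": 5, "gender": None}
    conflicts = differential_bias_test(
        generated_assess_candidate, base, "gender", ["male", "female", "non-binary"]
    )
    print("Biased pairs:" if conflicts else "No bias observed:", conflicts)
```

In this toy run the generated function returns different results for otherwise identical profiles, so the test reports the diverging group pairs; a fairness score could then be aggregated over many such probes and tasks.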