Social Value Alignment in Large Language Models

Cited: 0
Authors
Abbo, Giulio Antonio [1]
Marchesi, Serena [2 ]
Wykowska, Agnieszka [2 ]
Belpaeme, Tony [1 ]
Affiliations
[1] Ghent University, imec, IDLab-AIRO, Ghent, Belgium
[2] S4HRI, Istituto Italiano di Tecnologia, Genoa, Italy
Keywords
Values; Large Language Models; LLM; Alignment; MIND
DOI
10.1007/978-3-031-58202-8_6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) have demonstrated remarkable proficiency in text generation and display an apparent understanding of both physical and social aspects of the world. In this study, we examine the capability of LLMs to generate responses that align with human values. We focus on five prominent LLMs - GPT-3, GPT-4, PaLM-2, LLaMA-2 and BLOOM - and compare their generated responses with those provided by human participants. To evaluate the value alignment of LLMs, we presented domestic scenarios to the models and elicited a response with minimal prompting instructions. Human raters judged the responses on appropriateness and value alignment. The results revealed that GPT-3, GPT-4 and PaLM-2 performed on par with human participants, displaying a notable level of value alignment in their generated responses. However, LLaMA-2 and BLOOM fell short in this respect, indicating a possible divergence from human values. Furthermore, our findings indicate that the raters had difficulty distinguishing between responses generated by LLMs and those written by humans, and in certain cases even preferred the machine-generated responses. These findings shed light on the capability of state-of-the-art LLMs to align with human values, and also allow us to speculate on whether these models could be value-aware. This research contributes to the ongoing exploration of LLMs' understanding of ethical considerations and provides insights into their potential for engaging in value-driven interactions.
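
The elicitation protocol described in the abstract - each scenario sent to each model with minimal prompting, completions collected for later human rating - can be illustrated with a minimal Python sketch. This is not the authors' code: the scenario texts are invented placeholders, and query_model is a hypothetical adapter around whichever API serves each model (GPT-3/4, PaLM-2, LLaMA-2, BLOOM).

    from typing import Callable, Dict, List

    # Hypothetical placeholder scenarios; the paper's actual domestic
    # scenarios are not reproduced here.
    SCENARIOS: List[str] = [
        "You notice an elderly person drop their groceries in the kitchen.",
        "A child asks you to keep a secret from their parents.",
    ]

    def make_prompt(scenario: str) -> str:
        # Minimal prompt wrapper, mirroring the paper's "minimal prompting
        # instructions": just the scenario plus a request for a response.
        return f"{scenario}\nWhat do you do?"

    def collect_responses(
        models: Dict[str, Callable[[str], str]],
    ) -> Dict[str, List[str]]:
        """Query every model on every scenario; gather responses for rating."""
        responses: Dict[str, List[str]] = {name: [] for name in models}
        for name, query_model in models.items():
            for scenario in SCENARIOS:
                responses[name].append(query_model(make_prompt(scenario)))
        return responses

In the study itself, the responses gathered this way were then judged by human raters for appropriateness and value alignment; that rating step is manual and is not sketched here.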
Pages: 83-97
Page count: 15