Social Value Alignment in Large Language Models

Cited by: 0
Authors
Abbo, Giulio Antonio [1]
Marchesi, Serena [2]
Wykowska, Agnieszka [2]
Belpaeme, Tony [1]
Affiliations
[1] Ghent University, imec, IDLab-AIRO, Ghent, Belgium
[2] S4HRI, Istituto Italiano di Tecnologia, Genoa, Italy
Keywords
Values; Large Language Models; LLM; Alignment; MIND
DOI
10.1007/978-3-031-58202-8_6
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Large Language Models (LLMs) have demonstrated remarkable proficiency in text generation and display an apparent understanding of both physical and social aspects of the world. In this study, we examine the capability of LLMs to generate responses that align with human values. We focus on five prominent LLMs - GPT-3, GPT-4, PaLM-2, LLaMA-2 and BLOOM - and compare their generated responses with those provided by human participants. To evaluate the value alignment of LLMs, we presented domestic scenarios to the models and elicited responses with minimal prompting instructions. Human raters judged the responses on appropriateness and value alignment. The results revealed that GPT-3, GPT-4 and PaLM-2 performed on par with human participants, displaying a notable level of value alignment in their generated responses. However, LLaMA-2 and BLOOM fell short in this respect, indicating a possible divergence from human values. Furthermore, our findings indicate that the raters had difficulty distinguishing between responses generated by LLMs and those written by humans, and in certain cases they preferred the machine-generated responses. These findings shed light on the capability of state-of-the-art LLMs to align with human values, and also allow us to speculate on whether these models could be value-aware. This research contributes to the ongoing exploration of LLMs' understanding of ethical considerations and provides insights into their potential for engaging in value-driven interactions.
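The evaluation protocol summarized in the abstract (present a domestic scenario, elicit a response with minimal prompting, then have human raters score appropriateness and value alignment) can be pictured with a short Python sketch. This is a minimal illustration only: the scenario text, the GenerateFn interface, and the Likert-style ratings below are assumptions for the sake of the example, not materials or code from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for any LLM backend (GPT, PaLM, LLaMA, BLOOM, ...):
# a callable that maps a prompt string to the model's text response.
GenerateFn = Callable[[str], str]

@dataclass
class Trial:
    scenario: str                                     # domestic scenario shown to the model
    response: str                                     # minimally prompted model response
    ratings: List[int] = field(default_factory=list)  # human ratings, e.g. 1-5 Likert scores

def elicit(scenario: str, generate: GenerateFn) -> Trial:
    # Minimal prompting: the scenario plus one open question, with no system
    # message, no few-shot examples, and no instructions about values.
    prompt = f"{scenario}\nWhat would you do?"
    return Trial(scenario=scenario, response=generate(prompt))

def mean_alignment(trials: List[Trial]) -> float:
    # Average the human value-alignment ratings pooled across all trials.
    scores = [r for t in trials for r in t.ratings]
    return sum(scores) / len(scores) if scores else float("nan")

if __name__ == "__main__":
    # Toy backend so the sketch runs without any API key or model weights.
    echo_model: GenerateFn = lambda _prompt: (
        "I would check with everyone at the table before serving dinner."
    )
    trial = elicit(
        "You are a household assistant. Dinner is ready, but one guest has not arrived yet.",
        echo_model,
    )
    trial.ratings.extend([4, 5, 4])  # scores later collected from human raters
    print(trial.response)
    print(f"mean value-alignment rating: {mean_alignment([trial]):.2f}")
```

In a sketch like this, the same elicit call could be pointed at any of the five models simply by swapping the backend behind GenerateFn, which mirrors the minimal-prompting comparison described in the abstract.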
Pages: 83-97
Number of pages: 15