KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems

Cited by: 42

Authors:

Gao, Chongming [1]

Li, Shijun [1]

Lei, Wenqiang [2]

Chen, Jiawei [3]

Li, Biao [4]

Jiang, Peng [4]

He, Xiangnan [1]

Mao, Jiaxin [5]

Chua, Tat-Seng [6]

Affiliations:

[1] Univ Sci & Technol China, Hefei, Anhui, Peoples R China

[2] Sichuan Univ, Chengdu, Sichuan, Peoples R China

[3] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China

[4] Kuaishou Technol Co Ltd, Beijing, Peoples R China

[5] Renmin Univ China, Beijing, Peoples R China

[6] Natl Univ Singapore, Singapore, Singapore

Source:

PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

Fully-observed data; Recommendation; Evaluation; User simulation;

DOI:

10.1145/3511808.3557220

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

The progress of recommender systems is hampered mainly by evaluation as it requires real-time interactions between humans and systems, which is too laborious and expensive. This issue is usually approached by utilizing the interaction history to conduct offline evaluation. However, existing datasets of user-item interactions are partially observed, leaving it unclear how and to what extent the missing interactions will influence the evaluation. To answer this question, we collect a fully-observed dataset from Kuaishou's online environment, where almost all 1,411 users have been exposed to all 3,327 items. To the best of our knowledge, this is the first real-world fully-observed data with millions of user-item interactions. With this unique dataset, we conduct a preliminary analysis of how the two factors - data density and exposure bias - affect the evaluation results of multi-round conversational recommendation. Our main discoveries are that the performance ranking of different methods varies with the two factors, and this effect can only be alleviated in certain cases by estimating missing interactions for user simulation. This demonstrates the necessity of the fully-observed dataset. We release the dataset and the pipeline implementation for evaluation at https://kuairec.com.
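
As a rough illustration of what "data density" means at the scale quoted in the abstract, the following minimal Python sketch computes the number of user-item cells in a 1,411 by 3,327 matrix and shows one simple way to thin a fully observed toy matrix to a lower density. The function names `density` and `subsample`, the uniform random masking, and the 4 by 5 toy matrix are illustrative assumptions; this is not the authors' released pipeline, and it does not model exposure bias.

```python
import random

def density(num_observed, num_users, num_items):
    """Fraction of user-item cells that carry an observed interaction."""
    return num_observed / (num_users * num_items)

def subsample(interactions, target_density, num_users, num_items, seed=0):
    """Uniformly keep enough (user, item) pairs to reach target_density.

    Uniform masking is an assumption made for illustration; real logs are
    thinned non-uniformly by the recommender's own exposure policy.
    """
    keep = min(round(target_density * num_users * num_items), len(interactions))
    rng = random.Random(seed)
    return rng.sample(interactions, keep)

# Sizes quoted in the abstract.
NUM_USERS, NUM_ITEMS = 1411, 3327
total_cells = NUM_USERS * NUM_ITEMS  # cells in a fully observed matrix

# Toy fully observed matrix (hypothetical data): every user saw every item.
toy_users, toy_items = 4, 5
full = [(u, i) for u in range(toy_users) for i in range(toy_items)]
sparse = subsample(full, 0.25, toy_users, toy_items)
toy_density = density(len(sparse), toy_users, toy_items)
```

Under these assumptions, an evaluation could be repeated on `full` and on several `sparse` variants to see whether the ranking of methods changes with density, which is the kind of comparison the abstract describes.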


Pages: 540-550

Page count: 11
