PEPT: Expert Finding Meets Personalized Pre-Training

Cited: 0
|
Authors
Peng, Qiyao [1 ]
Xu, Hongyan [2 ]
Wang, Yinghui [3 ]
Liu, Hongtao [4 ]
Huo, Cuiying [5 ]
Wang, Wenjun [5 ,6 ]
Affiliations
[1] Tianjin Univ, Sch New Media & Commun, Tianjin, Peoples R China
[2] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[3] Beijing Inst Control & Elect Technol, Key Lab Informat Syst & Technol, Beijing, Peoples R China
[4] Du Xiaoman Technol, Beijing, Peoples R China
[5] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[6] Hainan Trop Ocean Univ, Yazhou Bay Innovat Inst, Hainan, Peoples R China
Keywords
Contrastive Learning;
D O I
10.1145/3690380
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Finding experts is essential in Community Question Answering (CQA) platforms, as it enables the effective routing of questions to potential users who can provide relevant answers. The key is to learn personalized expert representations from their historically answered questions and to accurately match these representations with target questions. Recently, Pre-Trained Language Models (PLMs) have gained significant attention for their impressive capability to comprehend textual data and are widely used across various domains. Some preliminary works have explored the applicability of PLMs to expert finding, such as pre-training expert or question representations. However, these models usually learn pure text representations of experts from their histories, disregarding personalized and fine-grained expert modeling. To alleviate this, we present a personalized pre-training and fine-tuning paradigm that can effectively learn expert interest and expertise simultaneously. Specifically, in our pre-training framework, we integrate the historically answered questions of one expert with one target question and regard them as a candidate-aware, expert-level input unit. We then fuse expert IDs into the pre-training to guide the model toward personalized expert representations, which helps capture the unique characteristics and expertise of each individual expert. Additionally, in our pre-training task, we design (1) a question-level masked language model task to learn the relatedness among historical questions, enabling the modeling of question-level expert interest; and (2) a vote-oriented task to capture question-level expert expertise by predicting the vote score the expert would receive. Through our pre-training framework and tasks, our approach can holistically learn expert representations covering both interests and expertise. Our method has been extensively evaluated on six real-world CQA datasets, and the experimental results consistently demonstrate the superiority of our approach over competitive baseline methods.
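To make the pre-training recipe described in the abstract concrete, below is a minimal PyTorch sketch of the two ingredients it names: a candidate-aware input that concatenates a target question with an expert's historical questions and is conditioned on an expert-ID embedding, plus a joint loss combining a masked-language-model objective with vote-score regression. All names here (PersonalizedPretrainSketch, vote_positions, the tiny Transformer encoder, and the toy inputs) are illustrative assumptions for exposition, not the authors' PEPT implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PersonalizedPretrainSketch(nn.Module):
    """Illustrative (hypothetical) model: encodes [target question ; expert history]
    conditioned on an expert-ID embedding, with two pre-training heads."""

    def __init__(self, vocab_size=30522, num_experts=10000, dim=256,
                 nhead=4, num_layers=2, max_len=512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, dim)
        self.pos_emb = nn.Embedding(max_len, dim)
        self.expert_emb = nn.Embedding(num_experts, dim)  # personalization signal (expert ID)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.mlm_head = nn.Linear(dim, vocab_size)    # task (1): masked language modeling
        self.vote_head = nn.Linear(dim, 1)            # task (2): vote-score prediction

    def forward(self, token_ids, expert_ids):
        # token_ids: (batch, seq_len) for "[target question ; historical questions]"
        # expert_ids: (batch,) integer ID of the expert whose history is encoded
        seq_len = token_ids.size(1)
        pos = torch.arange(seq_len, device=token_ids.device)
        h = self.token_emb(token_ids) + self.pos_emb(pos)
        # Fuse the expert ID into every position so encoding is expert-specific.
        h = h + self.expert_emb(expert_ids).unsqueeze(1)
        return self.encoder(h)  # (batch, seq_len, dim)


def pretraining_loss(model, token_ids, expert_ids, mlm_labels, vote_positions, vote_scores):
    """Joint objective: masked-LM loss over masked positions plus MSE on vote scores.

    mlm_labels: (batch, seq_len), original token ids at masked positions, -100 elsewhere.
    vote_positions: (batch, num_hist), index of a summary token per historical question.
    vote_scores: (batch, num_hist), observed vote scores for those answered questions.
    """
    h = model(token_ids, expert_ids)
    mlm_logits = model.mlm_head(h)                                # (batch, seq_len, vocab)
    mlm_loss = F.cross_entropy(mlm_logits.transpose(1, 2),        # (batch, vocab, seq_len)
                               mlm_labels, ignore_index=-100)
    # Gather the hidden state at each historical question's summary position.
    idx = vote_positions.unsqueeze(-1).expand(-1, -1, h.size(-1))
    q_repr = h.gather(1, idx)                                     # (batch, num_hist, dim)
    vote_loss = F.mse_loss(model.vote_head(q_repr).squeeze(-1), vote_scores)
    return mlm_loss + vote_loss


if __name__ == "__main__":
    model = PersonalizedPretrainSketch()
    token_ids = torch.randint(0, 30522, (2, 64))
    expert_ids = torch.tensor([3, 7])
    mlm_labels = torch.full((2, 64), -100, dtype=torch.long)
    mlm_labels[:, 10:15] = token_ids[:, 10:15]                    # pretend these were masked
    vote_positions = torch.tensor([[0, 20, 40], [0, 20, 40]])     # toy summary-token indices
    vote_scores = torch.rand(2, 3)
    loss = pretraining_loss(model, token_ids, expert_ids, mlm_labels, vote_positions, vote_scores)
    print(float(loss))
```

In the paper itself the masking is described at the question level and the encoder is a pre-trained language model fine-tuned afterwards; this sketch only mirrors the multi-task structure under the stated assumptions.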
Pages: 26
Related Papers
50 records in total
  • [41] Event Camera Data Pre-training
    Yang, Yan
    Pan, Liyuan
    Liu, Liu
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 10665 - 10675
  • [42] Quality Diversity for Visual Pre-Training
    Chavhan, Ruchika
    Gouk, Henry
    Li, Da
    Hospedales, Timothy
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 5361 - 5371
  • [43] Pre-training Methods in Information Retrieval
    Fan, Yixing
    Xie, Xiaohui
    Cai, Yinqiong
    Chen, Jia
    Ma, Xinyu
    Li, Xiangsheng
    Zhang, Ruqing
    Guo, Jiafeng
    FOUNDATIONS AND TRENDS IN INFORMATION RETRIEVAL, 2022, 16 (03): 178 - 317
  • [44] Pre-training in Medical Data: A Survey
    Qiu, Yixuan
    Lin, Feng
    Chen, Weitong
    Xu, Miao
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (02) : 147 - 179
  • [45] Pre-Training Without Natural Images
    Kataoka, Hirokatsu
    Okayasu, Kazushige
    Matsumoto, Asato
    Yamagata, Eisuke
    Yamada, Ryosuke
    Inoue, Nakamasa
    Nakamura, Akio
    Satoh, Yutaka
    International Journal of Computer Vision, 2022, 130 : 990 - 1007
  • [46] Structure-inducing pre-training
    McDermott, Matthew B. A.
    Yap, Brendan
    Szolovits, Peter
    Zitnik, Marinka
    Nature Machine Intelligence, 2023, 5 : 612 - 621
  • [47] Pre-training Universal Language Representation
    Li, Yian
    Zhao, Hai
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (ACL-IJCNLP 2021), VOL 1, 2021, : 5122 - 5133
  • [48] Unsupervised Pre-Training for Voice Activation
    Kolesau, Aliaksei
    Sesok, Dmitrij
    APPLIED SCIENCES-BASEL, 2020, 10 (23): 1 - 13
  • [49] Recyclable Tuning for Continual Pre-training
    Qin, Yujia
    Qian, Cheng
    Han, Xu
    Lin, Yankai
    Wang, Huadong
    Xie, Ruobing
    Li, Zhiyuan
    Sun, Maosong
    Zhou, Jie
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 11403 - 11426
  • [50] Automated Commit Intelligence by Pre-training
    Liu, Shangqing
    Li, Yanzhou
    Xie, Xiaofei
    Ma, Wei
    Meng, Guozhu
    Li, Yang
    ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 2024, 33 (08)