CaseEncoder: A Knowledge-enhanced Pre-trained Model for Legal Case Encoding

Cited by: 0
|
Authors
Ma, Yixiao [1 ]
Wu, Yueyue [2 ,3 ,4 ]
Su, Weihang [2 ,3 ,4 ]
Ai, Qingyao [2 ,3 ,4 ]
Liu, Yiqun [2 ,3 ,4 ]
Affiliations
[1] Huawei Cloud BU, Shenzhen, Guangdong, Peoples R China
[2] Quan Cheng Lab, Nanjing, Peoples R China
[3] Tsinghua Univ, Inst Internet Judiciary, Beijing, Peoples R China
[4] Tsinghua Univ, DCST, Beijing, Peoples R China
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Legal case retrieval is a critical process for modern legal information systems. While recent studies have utilized pre-trained language models (PLMs) based on the general-domain self-supervised pre-training paradigm to build models for legal case retrieval, there are limitations in using general-domain PLMs as backbones. Specifically, these models may not fully capture the underlying legal features in legal case documents. To address this issue, we propose CaseEncoder, a legal document encoder that leverages fine-grained legal knowledge in both the data sampling and pre-training phases. In the data sampling phase, we enhance the quality of the training data by utilizing fine-grained law article information to guide the selection of positive and negative examples. In the pre-training phase, we design legal-specific pre-training tasks that align with the judging criteria of relevant legal cases. Based on these tasks, we introduce an innovative loss function called Biased Circle Loss to enhance the model's ability to recognize fine-grained case relevance. Experimental results on multiple benchmarks demonstrate that CaseEncoder significantly outperforms both existing general pre-training models and legal-specific pre-training models in zero-shot legal case retrieval. The source code of CaseEncoder can be found at https://github.com/myx666/CaseEncoder.
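The abstract names Biased Circle Loss as a variant of Circle Loss (Sun et al., 2020) but does not state its formula. As background, the sketch below implements the ordinary Circle Loss on pairwise similarity scores; the "biased" weighting that CaseEncoder adds for fine-grained relevance grades is paper-specific and not reproduced here, and the function name and defaults are illustrative.

```python
import math

def circle_loss(sim_pos, sim_neg, m=0.25, gamma=64.0):
    """Standard Circle Loss over similarity scores in [0, 1].

    sim_pos: similarities between the anchor case and positive cases
    sim_neg: similarities between the anchor case and negative cases
    m:       relaxation margin; gamma: scale factor
    """
    op, on = 1.0 + m, -m   # optimum targets for positive / negative pairs
    dp, dn = 1.0 - m, m    # decision boundaries
    # Each pair gets an adaptive weight: pairs far from their optimum
    # are penalized more strongly than pairs already near it.
    pos = sum(math.exp(-gamma * max(op - s, 0.0) * (s - dp)) for s in sim_pos)
    neg = sum(math.exp(gamma * max(s - on, 0.0) * (s - dn)) for s in sim_neg)
    return math.log1p(pos * neg)
```

A well-separated anchor (high positive similarity, low negative similarity) yields a loss near zero, while overlapping scores are penalized heavily; CaseEncoder's biased variant reportedly reshapes these weights to reflect graded, rather than binary, case relevance.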
Pages: 7134-7143
Page count: 10
Related papers
50 records in total
  • [31] NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model
    Yang, Hao
    Qin, Ying
    Deng, Yao
    Wang, Minghan
    2020 22ND INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY (ICACT): DIGITAL SECURITY GLOBAL AGENDA FOR SAFE SOCIETY!, 2020, : 185 - 189
  • [32] Vietnamese Sentence Paraphrase Identification using Pre-trained Model and Linguistic Knowledge
    Dien Dinh
    Nguyen Le Thanh
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (08) : 796 - 806
  • [33] Probing Pre-Trained Language Models for Disease Knowledge
    Alghanmi, Israa
    Espinosa-Anke, Luis
    Schockaert, Steven
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 3023 - 3033
  • [34] Dynamic Knowledge Distillation for Pre-trained Language Models
    Li, Lei
    Lin, Yankai
    Ren, Shuhuai
    Li, Peng
    Zhou, Jie
    Sun, Xu
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 379 - 389
  • [35] Commonsense Knowledge Transfer for Pre-trained Language Models
    Zhou, Wangchunshu
    Le Bras, Ronan
    Choi, Yejin
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 5946 - 5960
  • [36] AP-BERT: enhanced pre-trained model through average pooling
    Zhao, Shuai
    Zhang, Tianyu
    Hu, Man
    Chang, Wen
    You, Fucheng
    APPLIED INTELLIGENCE, 2022, 52 (14) : 15929 - 15937
  • [37] Enhanced Pre-Trained Xception Model Transfer Learned for Breast Cancer Detection
    Joshi, Shubhangi A.
    Bongale, Anupkumar M.
    Olsson, P. Olof
    Urolagin, Siddhaling
    Dharrao, Deepak
    Bongale, Arunkumar
    COMPUTATION, 2023, 11 (03)
  • [38] AP-BERT: enhanced pre-trained model through average pooling
    Shuai Zhao
    Tianyu Zhang
    Man Hu
    Wen Chang
    Fucheng You
    Applied Intelligence, 2022, 52 : 15929 - 15937
  • [39] Vision Enhanced Generative Pre-trained Language Model for Multimodal Sentence Summarization
    Jing, Liqiang
    Li, Yiren
    Xu, Junhao
    Yu, Yongcan
    Shen, Pei
    Song, Xuemeng
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (02) : 289 - 298
  • [40] Exploring Transfer Learning for Enhanced Seed Classification: Pre-trained Xception Model
    Gulzar, Yonis
    Unal, Zeynep
    Ayoub, Shahnawaz
    Reegu, Faheem Ahmad
    15TH INTERNATIONAL CONGRESS ON AGRICULTURAL MECHANIZATION AND ENERGY IN AGRICULTURE, ANKAGENG 2023, 2024, 458 : 137 - 147