Prompt tuning discriminative language models for hierarchical text classification

Cited by: 0
Authors
du Toit, Jaco [1 ,2 ]
Dunaiski, Marcel [1 ,2 ]
Affiliations
[1] Stellenbosch Univ, Dept Math Sci, Comp Sci Div, Stellenbosch, South Africa
[2] Stellenbosch Univ, Sch Data Sci & Computat Thinking, Stellenbosch, South Africa
Source
NATURAL LANGUAGE PROCESSING | 2024
Keywords
Large language models; discriminative language models; hierarchical text classification; prompt tuning
DOI
10.1017/nlp.2024.51
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
Hierarchical text classification (HTC) is a natural language processing task that aims to categorise a text document into a set of classes from a hierarchical class structure. Recent approaches to HTC focus on leveraging pre-trained language models (PLMs) and the hierarchical class structure by allowing these components to interact in various ways. Specifically, the Hierarchy-aware Prompt Tuning (HPT) method has proven effective in applying the prompt tuning paradigm to Bidirectional Encoder Representations from Transformers (BERT) models for HTC tasks. Prompt tuning aims to reduce the gap between the pre-training and fine-tuning phases by transforming the downstream task into the pre-training task of the PLM. Discriminative PLMs, which use a replaced token detection (RTD) pre-training task, have also been shown to perform better on flat text classification tasks when using prompt tuning instead of vanilla fine-tuning. In this paper, we propose the Hierarchy-aware Prompt Tuning for Discriminative PLMs (HPTD) approach, which injects the HTC task into the RTD task used to pre-train discriminative PLMs. Furthermore, we make several improvements to the prompt tuning approach of discriminative PLMs that enable HTC tasks to scale to much larger hierarchical class structures. Through comprehensive experiments, we show that our method is robust and outperforms current state-of-the-art approaches on two out of three HTC benchmark datasets.
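The abstract's core idea of casting classification as replaced token detection can be illustrated with a minimal sketch. This is not the paper's HPTD method: the template, verbalizer words, and `rtd_replaced_prob` stand-in are hypothetical, and a real system would obtain replaced-token probabilities from a pre-trained discriminator such as ELECTRA rather than a mock function.

```python
def rtd_prompt_classify(text, label_words, rtd_replaced_prob):
    """Insert each candidate label word into a prompt template and pick the
    class whose word the RTD head deems most likely to be 'original'
    (i.e. lowest probability of having been replaced)."""
    scores = {}
    for label, word in label_words.items():
        prompt = f"{text} The topic is {word}."
        scores[label] = rtd_replaced_prob(prompt, word)
    return min(scores, key=scores.get), scores

def mock_rtd(prompt, word):
    # Toy stand-in for a discriminator's RTD head: the label word looks
    # 'original' (low replaced-probability) when it fits the document,
    # approximated here by simple string overlap.
    doc = prompt.split(" The topic is ")[0].lower()
    return 0.05 if word in doc else 0.95

# Hypothetical verbalizer mapping classes to label words.
verbalizers = {"sports": "sport", "politics": "politics"}
pred, scores = rtd_prompt_classify(
    "A thrilling sport final drew record crowds.", verbalizers, mock_rtd)
```

The sketch covers only flat classification; HPTD additionally encodes the class hierarchy in the prompt, which is beyond this illustration.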
Pages: 18
Related Papers
50 items total (entries [31]–[40] shown)
  • [31] Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints
    Lu, Albert
    Zhang, Hongxin
    Zhang, Yanzhe
    Wang, Xuezhi
    Yang, Diyi
    17TH CONFERENCE OF THE EUROPEAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EACL 2023, 2023, : 1982 - 2008
  • [32] Hierarchical text classification
    Pulijala, AK
    Gauch, S
    ISAS/CITSA 2004: International Conference on Cybernetics and Information Technologies, Systems and Applications and 10th International Conference on Information Systems Analysis and Synthesis, Vol 1, Proceedings: COMMUNICATIONS, INFORMATION TECHNOLOGIES AND COMPUTING, 2004, : 257 - 262
  • [33] Learning Hierarchical Prompt with Structured Linguistic Knowledge for Vision-Language Models
    Wang, Yubin
    Jiang, Xinyang
    Cheng, De
    Li, Dongsheng
    Zhao, Cairong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 5749 - 5757
  • [34] Discriminative features for text document classification
    K. Torkkola
    Formal Pattern Analysis & Applications, 2004, 6 : 301 - 308
  • [35] Bayesian network models for hierarchical text classification from a thesaurus
    de Campos, Luis M.
    Romero, Alfonso E.
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2009, 50 (07) : 932 - 944
  • [36] Prompt-Ladder: Memory-efficient prompt tuning for vision-language models on edge devices
    Cai, Siqi
    Liu, Xuan
    Yuan, Jingling
    Zhou, Qihua
    PATTERN RECOGNITION, 2025, 163
  • [37] Discriminative features for text document classification
    Torkkola, K
    PATTERN ANALYSIS AND APPLICATIONS, 2003, 6 (04) : 301 - 308
  • [38] KPT++: Refined knowledgeable prompt tuning for few-shot text classification
    Ni, Shiwen
    Kao, Hung-Yu
    KNOWLEDGE-BASED SYSTEMS, 2023, 274
  • [39] REKP: Refined External Knowledge into Prompt-Tuning for Few-Shot Text Classification
    Dang, Yuzhuo
    Chen, Weijie
    Zhang, Xin
    Chen, Honghui
    MATHEMATICS, 2023, 11 (23)
  • [40] ENHANCING CLASS UNDERSTANDING VIA PROMPT-TUNING FOR ZERO-SHOT TEXT CLASSIFICATION
    Dan, Yuhao
    Zhou, Jie
    Chen, Qin
    Bai, Qingchun
    He, Liang
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 4303 - 4307