QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search

Cited: 0
Authors
Xie, Jian [1 ]
Liang, Yidan [2 ]
Liu, Jingping [3 ]
Xiao, Yanghua [1 ]
Wu, Baohua [2 ]
Ni, Shenghua [2 ]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Data Sci, Shanghai, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] East China Univ Sci & Technol, Sch Informat Sci & Engn, Shanghai, Peoples R China
Keywords
Continual Pre-training; Query Understanding; Travel Domain Search;
DOI: 10.1145/3580305.3599891
CLC Number: TP [Automation Technology, Computer Technology]
Subject Classification Code: 0812
Abstract
In light of the success of pre-trained language models (PLMs), continual pre-training of generic PLMs has become the standard paradigm for domain adaptation. In this paper, we propose QUERT, a Continual Pre-trained Language Model for QUERy Understanding in Travel Domain Search. QUERT is jointly trained on four pre-training tasks tailored to the characteristics of queries in travel domain search: Geography-aware Mask Prediction, Geohash Code Prediction, User Click Behavior Learning, and Phrase and Token Order Prediction. Performance improvements on downstream tasks and ablation experiments demonstrate the effectiveness of our proposed pre-training tasks. Specifically, the average performance on downstream tasks increases by 2.02% and 30.93% in the supervised and unsupervised settings, respectively. To verify the improvement QUERT brings to online business, we deploy QUERT and run A/B testing on the Fliggy app. The results show that using QUERT as the encoder increases the Unique Click-Through Rate and the Page Click-Through Rate by 0.89% and 1.03%, respectively. Resources are available at https://github.com/hsaest/QUERT.
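Of the four pre-training tasks, Geohash Code Prediction builds on a publicly documented encoding: geohash interleaves the bits of longitude and latitude and maps every five bits to one base32 character, so nearby places share string prefixes. The sketch below shows that underlying encoding in Python as a point of reference; the function name, the default precision, and the West Lake example are illustrative assumptions, not details taken from the paper.

# Minimal sketch of standard geohash encoding (the public algorithm that
# the Geohash Code Prediction task presumably targets). Assumption: the
# paper's exact code format may differ; this is the textbook construction.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode (lat, lon) as a geohash of `precision` base32 characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    even = True          # even-numbered bits split longitude, odd split latitude
    bits, ch, code = 0, 0, []
    while len(code) < precision:
        if even:         # halve the longitude interval
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                ch = (ch << 1) | 1
                lon_lo = mid
            else:
                ch <<= 1
                lon_hi = mid
        else:            # halve the latitude interval
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                ch = (ch << 1) | 1
                lat_lo = mid
            else:
                ch <<= 1
                lat_hi = mid
        even = not even
        bits += 1
        if bits == 5:    # every 5 interleaved bits yield one base32 character
            code.append(BASE32[ch])
            bits, ch = 0, 0
    return "".join(code)

# Nearby coordinates share prefixes, e.g. West Lake, Hangzhou:
print(geohash_encode(30.2741, 120.1551))  # -> "wtmknp"

Because truncating a geohash only coarsens the spatial cell, a model trained to predict such codes (or their prefixes) from query text is implicitly supervised on spatial proximity, which is the property a geography-aware prediction target exploits.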
Pages: 5282-5291
Page count: 10
Related Papers (50 in total)
  • [21] Soft Language Clustering for Multilingual Model Pre-training
    Zeng, Jiali
    Jiang, Yufan
    Yin, Yongjing
    Jing, Yi
    Meng, Fandong
    Lin, Binghuai
    Cao, Yunbo
    Zhou, Jie
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 7021 - 7035
  • [22] VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding
    Xu, Hu
    Ghosh, Gargi
    Huang, Po-Yao
    Arora, Prahal
    Aminzadeh, Masoumeh
    Feichtenhofer, Christoph
    Metze, Florian
    Zettlemoyer, Luke
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 4227 - 4239
  • [23] A Study into Pre-training Strategies for Spoken Language Understanding on Dysarthric Speech
    Wang, Pu
    BabaAli, Bagher
    Van Hamme, Hugo
    INTERSPEECH 2021, 2021, : 36 - 40
  • [24] Simultaneously Training and Compressing Vision-and-Language Pre-Training Model
    Qi, Qiaosong
    Zhang, Aixi
    Liao, Yue
    Sun, Wenyu
    Wang, Yongliang
    Li, Xiaobo
    Liu, Si
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8194 - 8203
  • [25] Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
    Lei, Chenyi
    Luo, Shixian
    Liu, Yong
    He, Wanggui
    Wang, Jiamang
    Wang, Guoxin
    Tang, Haihong
    Miao, Chunyan
    Li, Houqiang
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2567 - 2576
  • [26] Pre-training language model incorporating domain-specific heterogeneous knowledge into a unified representation
    Zhu, Hongyin
    Peng, Hao
    Lyu, Zhiheng
    Hou, Lei
    Li, Juanzi
    Xiao, Jinghui
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 215
  • [27] Multimodal Pre-training Method for Vision-language Understanding and Generation
    Liu T.-Y.
    Wu Z.-X.
    Chen J.-J.
    Jiang Y.-G.
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (05): : 2024 - 2034
  • [28] Unsupervised Domain Adaption Harnessing Vision-Language Pre-Training
    Zhou, Wenlve
    Zhou, Zhiheng
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (09) : 8201 - 8214
  • [29] SPEECH-LANGUAGE PRE-TRAINING FOR END-TO-END SPOKEN LANGUAGE UNDERSTANDING
    Qian, Yao
    Bian, Ximo
    Shi, Yu
    Kanda, Naoyuki
    Shen, Leo
    Xiao, Zhen
    Zeng, Michael
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 7458 - 7462
  • [30] Conditional Embedding Pre-Training Language Model for Image Captioning
    Li, Pengfei
    Zhang, Min
    Lin, Peijie
    Wan, Jian
    Jiang, Ming
    NEURAL PROCESSING LETTERS, 2022, 54 (06) : 4987 - 5003