Cross-domain Transfer Learning for Recognizing Professional Skills from Chinese Job Postings

Authors
Xinhe Y. [1]
Peng Y. [2]
Yimin W. [2]
Affiliations
[1] Library of Guilin University of Electronic Technology, Guilin
[2] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin
Keywords
Cross-Domain Transfer Learning; Domain Adaptation; Professional Skill Words
DOI
10.11925/infotech.2096-3467.2021.0963
Abstract
[Objective] This paper analyzes online job postings to accurately identify the skills employers demand, aiming to narrow the gap between skill supply and demand in the labor market. [Methods] We propose CDTL-PSE, a cross-domain transfer learning model for recognizing professional skill words. CDTL-PSE treats the task as sequence tagging, similar to named entity recognition or term extraction. The SIGHAN corpus is decomposed into three source domains, and a domain adaptation layer inserted between the Bi-LSTM and CRF layers transfers what is learned from each source domain to the target domain. Each sub-model is then trained with a parameter transfer approach, and the final label sequence is obtained by majority vote over the sub-models' predictions. [Results] On a self-built online recruitment data set, the proposed model improved the F1 value by 0.91% over the baseline method and reduced the labeled samples needed by about 50%. [Limitations] The interpretability of CDTL-PSE needs to be further improved. [Conclusions] CDTL-PSE can automatically extract professional skill words and effectively increase the labeled samples in the target domain. © 2022, Chinese Academy of Sciences. All rights reserved.
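The final step in the abstract, combining the sub-models' label sequences by majority vote, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the `B-SKILL`/`I-SKILL`/`O` tag set, and the tie-breaking rule (prefer the earliest sub-model) are assumptions made for illustration, since the abstract specifies none of them.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token labels from several sub-models by majority vote.

    Assumption: ties go to the label proposed by the earliest sub-model;
    the abstract does not say how ties are resolved.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all label sequences must have the same length")

    voted: List[str] = []
    # zip(*predictions) yields, for each token, one label per sub-model.
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # Pick the first label (in sub-model order) that reaches the top count.
        voted.append(next(label for label in labels if counts[label] == best))
    return voted


# Hypothetical outputs of three sub-models (one per source domain)
# for the same four-token sentence.
sub_model_a = ["O", "O", "B-SKILL", "I-SKILL"]
sub_model_b = ["O", "B-SKILL", "B-SKILL", "I-SKILL"]
sub_model_c = ["O", "O", "B-SKILL", "O"]

print(majority_vote([sub_model_a, sub_model_b, sub_model_c]))
```

Voting token by token is the simplest reading of "majority vote"; it can in principle yield an inconsistent tag sequence (an `I-SKILL` with no preceding `B-SKILL`), so a real system might vote over whole spans or repair the sequence afterwards.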
Pages: 274-288
Page count: 14