Cross-domain Transfer Learning for Recognizing Professional Skills from Chinese Job Postings

Cited by: 0

Authors:

Xinhe Y. [1]

Peng Y. [2]

Yimin W. [2]

Affiliations:

[1] Library of Guilin University of Electronic Technology, Guilin

[2] School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin

Source:

Data Analysis and Knowledge Discovery | 2022 / Vol. 6 / Issue 2-3

Keywords:

Cross Domain Transfer Learning; Domain Adaptation; Professional Skill Words

DOI:

10.11925/infotech.2096-3467.2021.0963

Abstract:

[Objective] This paper analyzes online job postings to identify employers' demands accurately, aiming to address the skill gap between supply and demand in the labor market. [Methods] We propose a cross-domain transfer learning model for recognizing professional skill words (CDTL-PSE). CDTL-PSE treats the task as sequence tagging, similar to named entity recognition or term extraction. The SIGHAN corpus is decomposed into three source domains. A domain adaptation layer is inserted between the Bi-LSTM and CRF layers to transfer knowledge from each source domain to the target domain. Each sub-model is then trained with a parameter transfer approach. Finally, the predicted label sequence is obtained by majority vote. [Results] On a self-built online recruitment data set, the proposed model improved the F1 value by 0.91% over the baseline method and reduced the labeled samples by about 50%. [Limitations] The interpretability of CDTL-PSE needs further improvement. [Conclusions] CDTL-PSE can automatically extract professional skill words and effectively increase the labeled samples in the target domain. © 2022, Chinese Academy of Sciences. All rights reserved.
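
The final step of the method, combining the label sequences predicted by the sub-models through a majority vote, could look roughly like the sketch below. It is an illustration only, not the authors' implementation: the function name `majority_vote`, the BIO-style `SKILL` labels, the sample predictions, and the tie-breaking rule (the earliest sub-model wins) are all assumptions, since the abstract does not specify them.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-token labels from several sub-models by majority vote.

    Ties are broken in favor of the earliest sub-model in `predictions`
    (an assumption; the abstract does not say how ties are resolved).
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all label sequences must have the same length")

    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # The first label (in sub-model order) that reaches the top count wins.
        voted.append(next(label for label in labels if counts[label] == best))
    return voted


# Hypothetical BIO labels for a five-token sentence, one list per sub-model.
sub_model_outputs = [
    ["O", "B-SKILL", "I-SKILL", "O", "O"],
    ["O", "B-SKILL", "O", "O", "B-SKILL"],
    ["O", "B-SKILL", "I-SKILL", "O", "O"],
]
print(majority_vote(sub_model_outputs))
```

Voting per token is the simplest reading of "majority vote" over label sequences. A real system might instead vote over whole extracted spans so that the combined output is always a valid BIO sequence.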

Pages: 274-288

Page count: 14
