Survey of Multi-Task Learning

Authors
Zhang Y. [1]
Liu J.-W. [1]
Zuo X. [1]
Affiliations
[1] Department of Automation, China University of Petroleum, Beijing
Source
Science Press, Vol. 43 (2020)
Keywords
Bayesian generative model of multi-task learning; Deep multi-task learning via deep neural network; Discriminant approach of multi-task learning; Information transfer; Multi-task learning; Similarity of tasks
DOI
10.11897/SP.J.1016.2020.01340
Abstract
With the development of artificial intelligence technologies such as image processing and speech recognition, many learning methods, especially those built on deep learning frameworks, have achieved excellent performance and greatly improved both accuracy and speed. Their problems, however, are equally evident: to achieve stable performance, these methods typically require training on large amounts of labeled data; otherwise they under-fit and their performance degrades. As task complexity and data scale grow, the demands on the quantity and quality of manually labeled data therefore rise, driving up the cost and difficulty of annotation. At the same time, training a single task in isolation ignores experience from other tasks, which leads to redundant training, wastes learning resources, and limits attainable performance. To alleviate these problems, multi-task learning, which belongs to the family of transfer learning, has gradually attracted researchers' attention. Unlike single-task learning, which uses only the samples of one task, multi-task learning assumes a certain similarity between the data distributions of different tasks. On this basis, relationships between tasks are established through joint training and optimization. This training mode promotes information exchange between tasks and enables them to learn from one another. Especially when each task has only limited samples, every task can draw inspiration from the others: through information transfer during learning, the data of other tasks are used indirectly. This relieves the dependence on large amounts of labeled data while improving each task's learning performance.
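The joint training and information transfer described above can be sketched numerically. The following is a minimal illustration, not taken from the paper: two linear-regression tasks share a common parameter vector while keeping small regularized task-specific offsets, so gradient updates on either task's data also improve the shared part. All names, sizes, and hyperparameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5                                    # feature dimension (assumed)
w_true = rng.normal(size=d)              # structure shared by both tasks

# Limited samples per task, as in the low-label setting described above;
# each task's true parameters deviate slightly from the shared structure.
X1, X2 = rng.normal(size=(20, d)), rng.normal(size=(20, d))
y1 = X1 @ (w_true + 0.1) + 0.01 * rng.normal(size=20)
y2 = X2 @ (w_true - 0.1) + 0.01 * rng.normal(size=20)

w = np.zeros(d)                          # shared parameters
b1, b2 = np.zeros(d), np.zeros(d)        # task-specific offsets
lam, lr = 0.1, 0.01                      # offset regularization, step size

for _ in range(500):
    r1 = X1 @ (w + b1) - y1              # residuals, task 1
    r2 = X2 @ (w + b2) - y2              # residuals, task 2
    # Joint objective: L1 + L2 + lam * (||b1||^2 + ||b2||^2).
    # Both tasks' gradients flow into the shared w.
    w  -= lr * (X1.T @ r1 + X2.T @ r2) / 20
    b1 -= lr * (X1.T @ r1 / 20 + lam * b1)
    b2 -= lr * (X2.T @ r2 / 20 + lam * b2)

# Because the offsets are regularized, the shared w absorbs the common
# structure and lands close to w_true despite each task's small sample.
err = float(np.linalg.norm(w - w_true))
```

Trained on either task alone, the estimate would be biased toward that task's offset; the joint objective pools both samples and cancels the opposing offsets in the shared part.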
Against this background, this paper first introduces the concept of related tasks, classifies them by function, and describes the characteristics of each type. Then, according to the data processing mode and the way task relationships are modeled, current mainstream algorithms are divided into two categories: structured multi-task learning algorithms and deep multi-task learning algorithms. Structured multi-task learning algorithms adopt linear models, which can directly assume a structure over the data and express task relationships in terms of the original annotated features. By learning object, they can be further subdivided into task-level and feature-level structures, and each structure can be realized by either a discriminant or a generative method. Unlike the modeling process of structured algorithms, deep multi-task learning algorithms describe task relationships through deep information abstracted across multiple feature layers, and achieve information sharing by operating on the parameters of specific network layers. Taking these two classes of algorithms as the main line, the paper analyzes in detail the structural assumptions, implementation approaches, advantages and disadvantages of the different modeling methods, as well as the relationships between them. Finally, the paper summarizes criteria for identifying the similarity and compactness between tasks, analyzes the effectiveness of the multi-task mechanism and its intrinsic causes, and expounds the characteristics of multi-task information transfer from the perspectives of inductive bias and dynamic solution. © 2020, Science Press. All rights reserved.
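The layer-level parameter sharing used by deep multi-task learning can be illustrated with a hard-parameter-sharing sketch: a shared trunk processes every task's input, while each task owns its output head. This is an assumed, minimal NumPy forward pass, not an architecture from the surveyed works; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_hidden = 8, 16                   # illustrative dimensions

# Shared layer: every task's input passes through these parameters,
# so during training all tasks' gradients would update this representation.
W_shared = rng.normal(scale=0.1, size=(d_in, d_hidden))

# Task-specific heads: parameters owned by a single task.
W_head = {
    "task_a": rng.normal(scale=0.1, size=(d_hidden, 3)),  # 3-way task
    "task_b": rng.normal(scale=0.1, size=(d_hidden, 1)),  # scalar task
}

def forward(x, task):
    """Shared trunk followed by the requested task's head."""
    h = np.maximum(0.0, x @ W_shared)    # shared ReLU representation
    return h @ W_head[task]

x = rng.normal(size=(4, d_in))           # one small batch serves both tasks
out_a = forward(x, "task_a")             # shape (4, 3)
out_b = forward(x, "task_b")             # shape (4, 1)
```

The design choice illustrated here is what distinguishes the deep approach from the structured one: sharing happens in an abstracted hidden representation rather than in the original annotated features.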
Pages: 1340-1378 (38 pages)