Chinese Word Segmentation Method on the Basis of Bidirectional Long-Short Term Memory Model

Cited by: 3
Authors
Zhang H.-G. [1 ]
Li H. [1 ]
Affiliations
[1] School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing
Source
Journal of South China University of Technology, 2017, Vol. 45
Keywords
Chinese word segmentation; Deep learning; Long-short term memory; Neural network
DOI
10.3969/j.issn.1000-565X.2017.03.009
Abstract
Chinese word segmentation is one of the fundamental technologies of Chinese natural language processing. At present, most conventional Chinese word segmentation methods rely on feature engineering, which requires intensive manual labor to design features and verify their effectiveness. With the rapid development of deep learning, it has become practical to learn features automatically with neural networks. In this paper, a novel Chinese word segmentation method is proposed on the basis of the bidirectional long short-term memory (BLSTM) model. In this method, Chinese characters are represented as embedding vectors learned from a large-scale corpus, and the vectors are then fed into a BLSTM model for segmentation. Experiments without any feature engineering show that the proposed method achieves high performance on the simplified Chinese datasets (PKU, MSRA and CTB) and the traditional Chinese dataset (HKCityU). © 2017, Editorial Department, Journal of South China University of Technology. All rights reserved.
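The abstract describes a character-level sequence labeling architecture: pre-trained character embeddings feed a bidirectional LSTM whose per-character outputs are mapped to segmentation tags. The sketch below illustrates that pipeline under a common BMES tag scheme; it is not the authors' implementation, and the class name, layer sizes, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn

TAGS = ["B", "M", "E", "S"]  # begin/middle/end of a word, single-character word

class BLSTMSegmenter(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128):
        super().__init__()
        # Character embeddings; per the abstract these would be pre-trained
        # on a large-scale corpus (e.g. with word2vec) rather than random.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.blstm = nn.LSTM(embed_dim, hidden_dim,
                             batch_first=True, bidirectional=True)
        # Forward and backward hidden states are concatenated: 2 * hidden_dim.
        self.out = nn.Linear(2 * hidden_dim, len(TAGS))

    def forward(self, char_ids):              # (batch, seq_len) of char indices
        emb = self.embed(char_ids)            # (batch, seq_len, embed_dim)
        ctx, _ = self.blstm(emb)              # (batch, seq_len, 2*hidden_dim)
        return self.out(ctx)                  # per-character tag scores

# Toy usage: tag one 8-character "sentence"; a word boundary is placed
# after every character tagged E or S.
model = BLSTMSegmenter(vocab_size=5000)
scores = model(torch.randint(0, 5000, (1, 8)))
tags = [TAGS[i] for i in scores.argmax(dim=-1)[0].tolist()]
print(tags)
```

Because the LSTM reads the sentence in both directions, the tag for each character can depend on context to its left and right, which is what motivates the bidirectional variant over a unidirectional LSTM for segmentation.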
Pages: 61-67
Number of pages: 6