Visual Attentional Network and Learning Method for Object Search and Recognition

Cited by: 0
Authors
Lü J. [1]
Luo F. [1]
Yuan Z. [1]
Affiliation
[1] School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an
Keywords
Attentional model; Fixation strategy; Object detection; Reinforcement learning
DOI
10.3901/JME.2019.11.123
Abstract
A recurrent visual network is proposed to search for and recognize an object simultaneously. The network automatically selects a sequence of local observations and accurately localizes and recognizes objects by fusing detailed local appearance with coarse contextual visual information. This is more efficient than methods that apply sliding windows or convolution over the whole image. In addition, a hybrid loss function is proposed to learn the parameters of the multi-task network end-to-end. In particular, a combination of stochastic and object-aware strategies is incorporated into the visual fixation loss, which helps mine richer context and bring the fixation point close to the object as quickly as possible. A real-world dataset is built to verify the method's capacity to search for and recognize objects of interest, including small ones. Experiments show that the method predicts an accurate bounding box for a visual object and achieves a higher search speed. The source code will be released so that the method can be verified and analyzed. © 2019 Journal of Mechanical Engineering.
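The abstract does not specify how a local observation is formed, so the following is only an illustrative sketch of one common way to pair fine local detail with coarse context around a fixation point: several concentric crops of increasing width, each pooled to the same resolution. The function name `extract_glimpse` and the parameters `size` and `num_scales` are hypothetical and are not taken from the paper.

```python
import numpy as np

def extract_glimpse(image, center, size=8, num_scales=3):
    """Hypothetical multi-scale glimpse around a fixation point.

    Returns an array of shape (num_scales, size, size). Scale 0 is a
    full-resolution size x size crop; each later scale covers twice the
    width of the previous one and is average-pooled back to size x size.
    Assumes a 2-D grayscale image and an even `size`.
    """
    cy, cx = center
    patches = []
    for s in range(num_scales):
        factor = 2 ** s
        half = (size * factor) // 2
        # Zero-pad so that crops near the image border stay in bounds.
        padded = np.pad(image, half, mode="constant")
        # Original pixel (cy, cx) now sits at (cy + half, cx + half),
        # so this slice is the window centred on the fixation point.
        crop = padded[cy:cy + 2 * half, cx:cx + 2 * half]
        # Average-pool factor x factor blocks down to size x size.
        pooled = crop.reshape(size, factor, size, factor).mean(axis=(1, 3))
        patches.append(pooled)
    return np.stack(patches)

image = np.arange(32 * 32, dtype=float).reshape(32, 32)
glimpse = extract_glimpse(image, (16, 16))
print(glimpse.shape)
```

A recurrent network in this style would consume one such glimpse per step and emit the next fixation point, so only a few small patches are processed instead of the whole image.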
Pages: 123-130
Page count: 7