Text Detection of Food Labels Based on Semantic Segmentation

Authors
Tian X. [1, 2]
Wang Z. [1]
Wang J. [1, 2]
Affiliations
[1] School of Information Science and Technology, Beijing Forestry University, Beijing
[2] Engineering Research Center for Forestry-oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing
Keywords
Food labels; Semantic segmentation; Text detection; Text recognition
DOI
10.6041/j.issn.1000-1298.2020.08.037
Abstract
Label text on food packaging carries information such as the production date, nutrition facts, and manufacturer. This information is an important basis for consumers' purchasing decisions, and it helps food supervision and inspection agencies uncover potential food safety problems. Text detection is the groundwork for food label recognition: it reduces the heavy workload of manual data entry and improves data processing efficiency. A food label dataset was first constructed, and a distance field model (DFM) based on semantic segmentation was then proposed. DFM comprises two tasks: pixel classification and distance field regression. The pixel classification task segments text regions from the background, and the distance field regression task predicts, for each pixel inside a text region, its normalized distance to the boundary of that region. To exploit the correlation between the two tasks, an attention module was added to DFM to optimize the model structure. In addition, the loss function was improved because the loss value of the distance field regression was too small for training to proceed smoothly. Ablation experiments showed that the attention module and the improved loss function increased the accuracy of the proposed model by 4.39 and 3.80 percentage points, respectively. Comparative experiments against other models showed that DFM performed well in detecting food label text, with a recall of 87.61% and a precision of 76.50%. © 2020, Chinese Society of Agricultural Machinery. All rights reserved.
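The distance field regression target described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `distance_field`, the city-block distance metric, the treatment of pixels outside the image as background, and the normalization by the largest distance in the image are all assumptions made here for illustration, since the abstract does not specify them.

```python
from collections import deque


def distance_field(mask):
    """Normalized distance-to-boundary map for a binary text mask.

    mask: list of rows, 1 for text pixels and 0 for background.
    Assumptions (not taken from the paper): city-block distance,
    everything outside the image is background, and values are
    normalized by the largest distance found in the image.
    """
    h, w = len(mask), len(mask[0])
    # Pad with a one-pixel background frame so that text touching
    # the image edge still has a boundary to measure against.
    padded = [[0] * (w + 2)]
    padded += [[0] + list(row) + [0] for row in mask]
    padded += [[0] * (w + 2)]

    # Background pixels start at distance 0; text pixels are unknown.
    dist = [[0 if v == 0 else None for v in row] for row in padded]
    queue = deque(
        (y, x)
        for y in range(h + 2)
        for x in range(w + 2)
        if padded[y][x] == 0
    )

    # Multi-source breadth-first search from all background pixels.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h + 2 and 0 <= nx < w + 2 and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))

    # Strip the padding and scale to [0, 1].
    inner = [row[1:-1] for row in dist[1:-1]]
    peak = max(max(row) for row in inner)
    if peak == 0:
        return [[0.0] * w for _ in range(h)]
    return [[d / peak for d in row] for row in inner]


# A 3x3 text block centered in a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
field = distance_field(mask)
```

In this toy example the center pixel of the block receives the value 1.0, the eight text pixels around it receive 0.5, and background pixels receive 0.0. A regression head trained on such a target gets a graded signal that is highest in the middle of a text region and falls off toward its boundary, whereas the pixel classification target alone only says whether a pixel is text.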
Pages: 336-343
Number of pages: 7