ISIA Food-500: A Dataset for Large-Scale Food Recognition via Stacked Global-Local Attention Network

Cited by: 56
Authors
Min, Weiqing [1, 2]
Liu, Linhu [1, 2]
Wang, Zhiling [1, 2]
Luo, Zhengdong [1, 2]
Wei, Xiaoming [3]
Wei, Xiaolin [3]
Jiang, Shuqiang [1, 2]
Affiliations
[1] Chinese Acad Sci, Key Lab Intelligent Informat Proc, Inst Comp Technol, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Meituan Dianping Grp, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Food Recognition; Food Datasets; Benchmark; Deep Learning;
DOI
10.1145/3394171.3414031
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Food recognition has received increasing attention in the multimedia community because of its real-world applications, such as diet management and self-service restaurants. A large-scale ontology of food images is urgently needed, both for developing advanced large-scale food recognition algorithms and for providing a benchmark dataset for them. To encourage further progress in food recognition, we introduce ISIA Food-500, a dataset with 500 categories taken from the list in Wikipedia and 399,726 images; it is more comprehensive than existing popular benchmark datasets in both category coverage and data volume. Furthermore, we propose a stacked global-local attention network for food recognition, which consists of two sub-networks. One sub-network first uses hybrid spatial-channel attention to extract more discriminative features, and then aggregates these multi-scale discriminative features from multiple layers into a global-level representation (e.g., texture and shape information about food). The other generates attentional regions (e.g., ingredient-relevant regions) via cascaded spatial transformers, and further aggregates these multi-scale regional features from different layers into a local-level representation. The two types of features are finally fused into a comprehensive representation for food recognition. Extensive experiments on ISIA Food-500 and two other popular benchmark datasets demonstrate the effectiveness of the proposed method, which can therefore be considered a strong baseline. The dataset, code and models can be found at http://123.57.42.89/FoodComputing-Dataset/ISIA-Food500.html.
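The two-branch idea in the abstract can be illustrated with a toy, parameter-free sketch. This is not the authors' network: the real model learns its attention weights and uses cascaded spatial transformers to pick regions, whereas the sketch below assumes fixed sigmoid gates and a hand-chosen crop. All function names (channel_attention, spatial_attention, crop, global_local_descriptor) are hypothetical and exist only for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def global_avg_pool(fmap):
    """Average each channel (an H x W list of lists) down to one number."""
    return [sum(sum(row) for row in ch) / sum(len(row) for row in ch)
            for ch in fmap]


def channel_attention(fmap):
    """Assumption: gate each channel by the sigmoid of its own global average."""
    gates = [sigmoid(m) for m in global_avg_pool(fmap)]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gates)]


def spatial_attention(fmap):
    """Assumption: gate each position by the sigmoid of its cross-channel mean."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    mask = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    return [[[fmap[k][i][j] * mask[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


def crop(fmap, top, left, size):
    """Stand-in for a learned attentional region: a fixed square crop."""
    return [[row[left:left + size] for row in ch[top:top + size]]
            for ch in fmap]


def global_local_descriptor(fmap, region):
    """Fuse a global vector (attended whole map) with a local vector (one region)."""
    attended = spatial_attention(channel_attention(fmap))
    global_vec = global_avg_pool(attended)
    local_vec = global_avg_pool(crop(fmap, *region))
    return global_vec + local_vec  # fusion by concatenation (an assumption)


if __name__ == "__main__":
    # Two channels on a 2 x 2 grid; the region is the top-left 1 x 1 cell.
    feature_map = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
    print(global_local_descriptor(feature_map, (0, 0, 1)))
```

Concatenation is used here only because it is the simplest way to fuse two vectors; the abstract does not specify the fusion operator, so treat that choice as a placeholder.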
Pages: 393-401
Number of pages: 9