Complex Object Classification: A Multi-Modal Multi-Instance Multi-Label Deep Network with Optimal Transport

Cited by: 36
Authors
Yang, Yang [1]
Wu, Yi-Feng [1]
Zhan, De-Chuan [1]
Liu, Zhi-Bin [2]
Jiang, Yuan [1]
Affiliations
[1] Nanjing Univ, Natl Key Lab Novel Software Technol, Nanjing, Jiangsu, Peoples R China
[2] Tencent WXG, Shenzhen, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Multi-modal; Multi-instance; Multi-label; Optimal Transport;
DOI
10.1145/3219819.3220012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In real-world applications, complex objects usually carry multiple labels and can be described by multiple modal representations; for example, a complex article contains both text and image information and has multiple annotations. Previous methods assume that the homogeneous multi-modal data are consistent, whereas in real applications the raw data are disordered, i.e., an article consists of a variable number of inconsistent text and image instances. Multi-modal Multi-instance Multi-label (M3) learning provides a framework for handling such tasks and has exhibited excellent performance. How to effectively exploit label correlation is a further challenge. In this paper, we propose a novel Multi-modal Multi-instance Multi-label Deep Network (M3DN), which learns the label prediction and exploits label correlation simultaneously based on Optimal Transport, by considering the consistency principle between the bag-level predictions of different modalities and the learned latent ground label metric. Experiments on benchmark datasets and the real-world WKG Game-Hub dataset validate the effectiveness of the proposed method.
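The abstract names Optimal Transport as the basis for comparing label predictions under a ground label metric. The sketch below is not the M3DN method, whose details this record does not give; it only illustrates the general idea with a standard entropy-regularized (Sinkhorn) transport cost between a predicted and a target label distribution. The function name `sinkhorn_cost`, the three-label cost matrix, and the parameters `reg` and `n_iters` are all assumptions made for illustration, and the cost matrix here is fixed by hand rather than learned.

```python
import math


def sinkhorn_cost(pred, target, cost, reg=0.1, n_iters=200):
    """Entropy-regularized transport cost between two label vectors.

    pred, target: non-negative label scores (each with a positive sum).
    cost: hypothetical ground metric, cost[i][j] = cost of moving mass
          from label i to label j (hand-picked here, not learned).
    """
    n, m = len(pred), len(target)
    # Normalize both label vectors into probability distributions.
    a = [p / sum(pred) for p in pred]
    b = [t / sum(target) for t in target]
    # Gibbs kernel derived from the ground metric.
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    # Sinkhorn iterations: alternately rescale to match each marginal.
    for _ in range(n_iters):
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
    # Transport plan and its total cost under the ground metric.
    plan = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return total, plan


# Hypothetical ground metric: labels 0 and 1 are related, label 2 is far from 0.
COST = [[0.0, 1.0, 4.0],
        [1.0, 0.0, 1.0],
        [4.0, 1.0, 0.0]]

near, _ = sinkhorn_cost([0, 1, 0], [1, 0, 0], COST)  # wrong but related label
far, _ = sinkhorn_cost([0, 0, 1], [1, 0, 0], COST)   # wrong and unrelated label
```

Under this toy metric, predicting a related label is penalized less than predicting an unrelated one, which is the kind of label-correlation information a ground metric can encode in a transport-based loss.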
Pages: 2594 - 2603
Number of pages: 10
Related papers
50 in total
  • [31] A multi-instance multi-label learning algorithm based on instance correlations
    Chanjuan Liu
    Tongtong Chen
    Xinmiao Ding
    Hailin Zou
    Yan Tong
    Multimedia Tools and Applications, 2016, 75 : 12263 - 12284
  • [32] Dynamic Programming for Instance Annotation in Multi-Instance Multi-Label Learning
    Pham, Anh T.
    Raich, Raviv
    Fern, Xiaoli Z.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2017, 39 (12) : 2381 - 2394
  • [33] SIMULTANEOUS INSTANCE ANNOTATION AND CLUSTERING IN MULTI-INSTANCE MULTI-LABEL LEARNING
    Pham, Anh T.
    Raich, Raviv
    Fern, Xiaoli Z.
    2015 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2015
  • [34] A Multi-Instance Multi-Label Scene Classification Method based on Multi-Kernel Fusion
    Chen Tong-tong
    Liu Chan-juan
    Zou Hai-lin
    Zhou Shu-sen
    Liu Ying
    Ding Xin-miao
    2015 SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS), 2015, : 782 - 787
  • [35] Multi-Instance Multi-Label Learning For Automatic Tag Recommendation
    Shen, Chen
    Jiao, Jun
    Yang, Yahui
    Wang, Bin
    2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, : 4910 - +
  • [36] Multi-instance multi-label learning for surgical image annotation
    Loukas, Constantinos
    Sgouros, Nicholas P.
    INTERNATIONAL JOURNAL OF MEDICAL ROBOTICS AND COMPUTER ASSISTED SURGERY, 2020, 16 (02)
  • [37] NOVELTY DETECTION UNDER MULTI-LABEL MULTI-INSTANCE FRAMEWORK
    Lou, Qi
    Raich, Raviv
    Briggs, Forrest
    Fern, Xiaoli Z.
    2013 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2013
  • [38] Acoustic classification of multiple simultaneous bird species: A multi-instance multi-label approach
    Briggs, Forrest
    Lakshminarayanan, Balaji
    Neal, Lawrence
    Fern, Xiaoli Z.
    Raich, Raviv
    Hadley, Sarah J. K.
    Hadley, Adam S.
    Betts, Matthew G.
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 2012, 131 (06) : 4640 - 4650
  • [39] KERNEL-BASED INSTANCE ANNOTATION IN MULTI-INSTANCE MULTI-LABEL LEARNING
    Pham, Anh T.
    Raich, Raviv
    2014 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2014
  • [40] Multi-Modal Multi-Instance Learning for Retinal Disease Recognition
    Li, Xirong
    Zhou, Yang
    Wang, Jie
    Lin, Hailan
    Zhao, Jianchun
    Ding, Dayong
    Yu, Weihong
    Chen, Youxin
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2474 - 2482