Diversity Learning Based on Multi-Latent Space for Medical Image Visual Question Generation

Cited by: 2
Authors
Zhu, He [1]
Togo, Ren [2]
Ogawa, Takahiro [2]
Haseyama, Miki [2]
Affiliations
[1] Hokkaido Univ, Grad Sch Informat Sci & Technol, N 14, W 9, Kita ku, Sapporo, Hokkaido 0600814, Japan
[2] Hokkaido Univ, Fac Informat Sci & Technol, N 14, W 9, Kita ku, Sapporo, Hokkaido 0600814, Japan
Keywords
visual question generation; medical image analysis; medical informatics; computer vision; natural language processing
DOI
10.3390/s23031057
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Auxiliary clinical diagnosis has been researched to address the uneven and insufficient distribution of clinical resources. However, auxiliary diagnosis is still dominated by human physicians, and how to involve intelligent systems more deeply in the diagnosis process is gradually becoming a concern. An interactive automated clinical diagnosis that combines a question-answering system with a question generation system can capture a patient's condition from multiple perspectives with less physician involvement, asking different questions to drive and guide the diagnosis. This clinical diagnosis process requires diverse information to evaluate a patient from different perspectives and obtain an accurate diagnosis. Recently proposed medical question generation systems have not considered diversity. Thus, we propose a diversity learning-based visual question generation model that uses a multi-latent space to generate informative question sets from medical images. The proposed method generates various questions by embedding visual and language information in different latent spaces, whose diversity is trained by our newly proposed loss. We have also added control over the categories of generated questions, making the generated questions directional. Furthermore, we use a new metric named similarity to accurately evaluate the proposed model's performance. The experimental results on the Slake and VQA-RAD datasets demonstrate that the proposed method can generate questions with diverse information. Our model can work with an answering model for interactive automated clinical diagnosis and can generate datasets to replace the annotation process, which incurs huge labor costs.
Pages: 17
Related papers
50 in total
  • [1] A Generalized Hierarchical Multi-Latent Space Model for Heterogeneous Learning
    Yang, Pei
    Davulcu, Hasan
    Zhu, Yada
    He, Jingrui
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (12) : 3154 - 3168
  • [2] Model Multiple Heterogeneity via Hierarchical Multi-Latent Space Learning
    Yang, Pei
    He, Jingrui
    KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015 : 1375 - 1384
  • [3] DW-MLSR: Unsupervised Deformable Medical Image Registration Based on Dual-Window Attention and Multi-Latent Space
    Huang, Yuxuan
    Yin, Mengxiao
    Li, Zhipan
    Yang, Feng
    ELECTRONICS, 2024, 13 (24)
  • [4] Image Search Reranking with Multi-latent Topical Graph
    Shen, Junge
    Mei, Tao
    Tian, Qi
    Gao, Xinbo
    2013 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2013 : 1 - 4
  • [5] MMQL: Multi-Question Learning for Medical Visual Question Answering
    Chen, Qishen
    Bian, Minjie
    Xu, Huahu
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2024, PT V, 2024, 15005 : 480 - 489
  • [6] Infrared Image Generation Based on Visual State Space and Contrastive Learning
    Li, Bing
    Ma, Decao
    He, Fang
    Zhang, Zhili
    Zhang, Daqiao
    Li, Shaopeng
    REMOTE SENSING, 2024, 16 (20)
  • [7] Learning consensus representations in multi-latent spaces for multi-view clustering
    Ma, Qianli
    Zheng, Jiawei
    Li, Sen
    Zheng, Zhenjing
    Cottrell, Garrison W.
    NEUROCOMPUTING, 2024, 596
  • [8] Unsupervised Multi-Latent Space RL Framework for Video Summarization in Ultrasound Imaging
    Mathews, Roshan P.
    Panicker, Mahesh Raveendranatha
    Hareendranathan, Abhilash R.
    Chen, Yale Tung
    Jaremko, Jacob L.
    Buchanan, Brian
    Narayan, Kiran Vishnu
    Kesavadas, C.
    Mathews, Greeta
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, 27 (01) : 227 - 238
  • [9] Medical Image Generation based on Latent Diffusion Models
    Song, Wenbo
    Jiang, Yan
    Fang, Yin
    Cao, Xinyu
    Wu, Peiyan
    Xing, Hanshuo
    Wu, Xinglong
    2023 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE INNOVATION, ICAII 2023, 2023 : 89 - 93
  • [10] vid-SAMGRAH: A PyTorch framework for multi-latent space reinforcement learning driven video summarization in ultrasound imaging
    Mathews, Roshan P.
    Panicker, Mahesh Raveendranatha
    Hareendranathan, Abhilash R.
    SOFTWARE IMPACTS, 2021, 10