Multi-Timestep-Ahead Prediction with Mixture of Experts for Embodied Question Answering

Cited by: 1
Authors
Suzuki, Kanata [1, 2]
Kamiwano, Yuya [1]
Chiba, Naoya [1, 3]
Mori, Hiroki [1]
Ogata, Tetsuya [1, 4, 5]
Affiliations
[1] Waseda Univ, Fac Sci & Engn, Tokyo, Japan
[2] Fujitsu Ltd, Artificial Intelligence Labs, Minato, Kanagawa, Japan
[3] OMRON SINIC X Corp, Tokyo, Japan
[4] Waseda Univ, Waseda Res Inst Sci & Engn WISE, Tokyo, Japan
[5] Natl Inst Adv Ind Sci & Technol, Tokyo, Japan
Keywords
Embodied Question Answering; Mixture of Experts; Multi-step Ahead Prediction
DOI
10.1007/978-3-031-44223-0_20
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this study, we propose a method that integrates visual field predictions at different time scales and investigate its effectiveness for embodied question answering (EQA). In EQA, it is desirable to automatically select a prediction time scale according to the situation, as the path to the target object depends on the instructions provided. However, previous studies have only investigated subtask learning with a limited prediction time scale and target. We propose a mixture-of-experts model in which multiple expert networks predict future images at different time steps, and a higher-level gating network estimates the distribution of each expert's output. By sequentially adjusting the output of the expert networks, the proposed method enables robot navigation that considers multi-timestep-ahead prediction. Comparison experiments on the EQA MP3D dataset show that the proposed method improves the prediction accuracy of the model regardless of the distance to the target.
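The gating idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the linear "experts", the prediction horizons (1, 3, 5), the feature dimension, the softmax gate, and the weighted-sum combination are all assumptions chosen for illustration, and the names `LinearExpert`, `Gate`, and `mixture_predict` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw gate scores into mixing weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LinearExpert:
    """Hypothetical stand-in for an expert network tied to one prediction horizon."""

    def __init__(self, horizon, dim, rng):
        self.horizon = horizon  # number of time steps ahead (assumed value)
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                        for _ in range(dim)]

    def predict(self, feature):
        # A single linear map; a real expert would be a learned image predictor.
        return [sum(w * x for w, x in zip(row, feature)) for row in self.weights]


class Gate:
    """Hypothetical gating network: one score per expert, normalized by softmax."""

    def __init__(self, num_experts, dim, rng):
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                        for _ in range(num_experts)]

    def mixing_weights(self, feature):
        scores = [sum(w * x for w, x in zip(row, feature)) for row in self.weights]
        return softmax(scores)


def mixture_predict(experts, gate, feature):
    """Combine expert predictions using the gate's weights (assumed weighted sum)."""
    weights = gate.mixing_weights(feature)
    predictions = [expert.predict(feature) for expert in experts]
    mixed = [sum(w * p[i] for w, p in zip(weights, predictions))
             for i in range(len(feature))]
    return mixed, weights


rng = random.Random(0)
dim = 4  # assumed size of the visual feature vector
experts = [LinearExpert(h, dim, rng) for h in (1, 3, 5)]  # assumed horizons
gate = Gate(len(experts), dim, rng)
feature = [0.5, -0.2, 0.1, 0.9]  # made-up input feature
mixed, weights = mixture_predict(experts, gate, feature)
print("gate weights:", weights)
print("mixed prediction:", mixed)
```

Because the gate recomputes its weights from the current input at every step, the relative influence of short-horizon and long-horizon experts can shift as the agent moves, which is one plausible reading of selecting the prediction time scale according to the situation.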
Pages: 243-255
Page count: 13
Related papers
16 in total
  • [1] Multi-Target Embodied Question Answering
    Yu, Licheng
    Chen, Xinlei
    Gkioxari, Georgia
    Bansal, Mohit
    Berg, Tamara L.
    Batra, Dhruv
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 6302 - 6311
  • [2] Single-dataset Experts for Multi-dataset Question Answering
    Friedman, Dan
    Dodge, Ben
    Chen, Danqi
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021: 6128 - 6137
  • [3] Unified Transformer with Cross-Modal Mixture Experts for Remote-Sensing Visual Question Answering
    Liu, Gang
    He, Jinlong
    Li, Pengfei
    Zhong, Shenjun
    Li, Hongyang
    He, Genrong
    REMOTE SENSING, 2023, 15 (19)
  • [4] Modeling Multi-hop Question Answering as Single Sequence Prediction
    Yavuz, Semih
    Hashimoto, Kazuma
    Zhou, Yingbo
    Keskar, Nitish Shirish
    Xiong, Caiming
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022: 974 - 990
  • [5] CooKie: commonsense knowledge-guided mixture-of-experts framework for fine-grained visual question answering
    Wang, Chao
    Yang, Jianming
    Zhou, Yang
    Yue, Xiaodong
    INFORMATION SCIENCES, 2025, 695
  • [6] Improved multi-gate mixture-of-experts framework for multi-step prediction of gas load
    Tong, Jianfeng
    Liu, Zhenxing
    Zhang, Yong
    Zheng, Xiujuan
    Jin, Junyang
    ENERGY, 2023, 282
  • [7] Stepwise relation prediction with dynamic reasoning network for multi-hop knowledge graph question answering
    Cui, Hai
    Peng, Tao
    Bao, Tie
    Han, Ridong
    Han, Jiayu
    Liu, Lu
    APPLIED INTELLIGENCE, 2023, 53 (10) : 12340 - 12354
  • [8] Stepwise relation prediction with dynamic reasoning network for multi-hop knowledge graph question answering
    Hai Cui
    Tao Peng
    Tie Bao
    Ridong Han
    Jiayu Han
    Lu Liu
    Applied Intelligence, 2023, 53 : 12340 - 12354
  • [9] BeamQA: Multi-hop Knowledge Graph Question Answering with Sequence-to-Sequence Prediction and Beam Search
    Atif, Farah
    El Khatib, Ola
    Difallah, Djellel
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023: 781 - 790
  • [10] A Temporal Multi-Gate Mixture-of-Experts Approach for Vehicle Trajectory and Driving Intention Prediction
    Yuan, Renteng
    Abdel-Aty, Mohamed
    Xiang, Qiaojun
    Wang, Zijin
    Gu, Xin
    IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2024, 9 (01) : 1204 - 1216