Semi-Open Set Object Detection Algorithm Leveraged by Multi-Modal Large Language Models

Cited by: 0
Authors
Wu, Kewei [1]
Wang, Yiran [1]
He, Xiaogang [1]
Yan, Jinyu [2]
Guo, Yang [2]
Jiang, Zhuqing [1]
Zhang, Xing [3]
Wang, Wei [3]
Xiong, Yongping [1]
Men, Aidong [1]
Xiao, Li [1]
Affiliations
[1] School of Artificial Intelligence, Beijing University of Posts and Telecommunications, 10 Xitucheng Rd, Beijing, 100876, China
[2] Beijing Zhuoshizhitong Technology Co., Ltd., Beijing, 100096, China
[3] China Resources Digital Co., Ltd., Beijing, 518049, China
DOI
10.3390/bdcc8120175
Related papers
50 in total
  • [31] SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models
    Lin, Ziyi
    Liu, Dongyang
    Zhang, Renrui
    Gao, Peng
    Qiu, Longtian
    Xiao, Han
    Qiu, Han
    Shao, Wenqi
    Chen, Keqin
    Han, Jiaming
    Huang, Siyuan
    Zhang, Yichi
    He, Xuming
    Qiao, Yu
    Li, Hongsheng
    COMPUTER VISION - ECCV 2024, PT LXII, 2025, 15120 : 36 - 55
  • [32] MMA: Multi-Modal Adapter for Vision-Language Models
    Yang, Lingxiao
    Zhang, Ru-Yuan
    Wang, Yanchen
    Xie, Xiaohua
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024 : 23826 - +
  • [33] Incorporating Concreteness in Multi-Modal Language Models with Curriculum Learning
    Sezerer, Erhan
    Tekir, Selma
    APPLIED SCIENCES-BASEL, 2021, 11 (17):
  • [34] An Interactive Multi-modal Query Answering System with Retrieval-Augmented Large Language Models
    Wang, Mengzhao
    Wu, Haotian
    Ke, Xiangyu
    Gao, Yunjun
    Xu, Xiaoliang
    Chen, Lu
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (12) : 4333 - 4336
  • [35] Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection
    Zhao, Xiaowei
    Liu, Xianglong
    Wang, Duorui
    Gao, Yajun
    Liu, Zhide
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024 : 16741 - 16750
  • [36] Open-Set Semi-Supervised Object Detection
    Liu, Yen-Cheng
    Ma, Chih-Yao
    Dai, Xiaoliang
    Tian, Junjiao
    Vajda, Peter
    He, Zijian
    Kira, Zsolt
    COMPUTER VISION - ECCV 2022, PT XXX, 2022, 13690 : 143 - 159
  • [37] Real-time dense small object detection algorithm based on multi-modal tea shoots
    Shuai, Luyu
    Chen, Ziao
    Li, Zhiyong
    Li, Hongdan
    Zhang, Boda
    Wang, Yuchao
    Mu, Jiong
    FRONTIERS IN PLANT SCIENCE, 2023, 14
  • [38] Multi-task Multi-modal Models for Collective Anomaly Detection
    Ide, Tsuyoshi
    Phan, Dzung T.
    Kalagnanam, Jayant
    2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017 : 177 - 186
  • [39] Multi-modal object detection and localization for high integrity driving assistance
    Rodríguez Flórez, Sergio Alberto
    Frémont, Vincent
    Bonnifait, Philippe
    Cherfaoui, Véronique
    Machine Vision and Applications, 2014, 25 : 583 - 598
  • [40] CrossFormer: Cross-guided attention for multi-modal object detection
    Lee, Seungik
    Park, Jaehyeong
    Park, Jinsun
    PATTERN RECOGNITION LETTERS, 2024, 179 : 144 - 150