Validation of LLM-Generated Object Co-Occurrence Information for Understanding Three-Dimensional Scenes

Authors
Gunji, Kenta [1]
Ohno, Kazunori [1]
Kurita, Shuhei [2]
Sakurada, Ken [3]
Bezerra, Ranulfo [1]
Kojima, Shotaro [1]
Okada, Yoshito [1]
Konyo, Masashi [1]
Tadokoro, Satoshi [1]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai 9808579, Japan
[2] Natl Inst Informat, Tokyo 1018430, Japan
[3] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
Japan Society for the Promotion of Science (JSPS)
Keywords
Robots; Three-dimensional displays; Chatbots; Standards; Knowledge graphs; Semantics; Navigation; Market research; Data models; Data mining; Semantic scene understanding; large language models; co-occurrence validation; prompt engineering;
DOI
10.1109/ACCESS.2024.3514473
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
This study examines whether object co-occurrence information generated by a large language model (LLM) can be used to improve a robot's spatial understanding of objects in the real world. Co-occurrence information is crucial in enabling robots to perceive and navigate their surroundings. An LLM can generate object co-occurrence information from the representations it acquires during training. However, whether co-occurrence gleaned from linguistic data translates to real-world object relationships has yet to be thoroughly examined. Given category information about a specific situation, this paper compares the co-occurrence degree (co-occurrence coefficient) output by gpt-4-turbo-2024-04-09 (GPT-4) against object pair data from the ScanNet v2 dataset. GPT-4 achieved a high recall of 0.78 across the situation categories annotated in ScanNet v2, although its precision was relatively low, averaging 0.29. The root mean square error of the co-occurrence coefficient was 0.31. While GPT-4 tends to output slightly higher co-occurrence coefficients, it effectively captures the overall co-occurrence patterns observed in the ScanNet v2 dataset. GPT-4 also produced co-occurrence information for more objects than are available in ScanNet v2 while covering co-occurrences among the objects within ScanNet v2. These results demonstrate that integrating co-occurrence data from different sources could enhance the ability to recognize real-world objects and potentially strengthen robot intelligence.
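The three metrics reported in the abstract (precision, recall, and root mean square error of the co-occurrence coefficient) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the record does not define the co-occurrence coefficient or state how pairs were matched, so the sketch treats a pair as "present" when its coefficient exceeds a hypothetical threshold and computes the RMSE only over pairs found in both sources. The function names (`pair_key`, `evaluate`) and the toy values are hypothetical and are not data from the paper.

```python
import math


def pair_key(a, b):
    """Order-independent key for an object pair (assumed symmetric)."""
    return tuple(sorted((a, b)))


def evaluate(predicted, reference, threshold=0.0):
    """Compare two {pair: coefficient} maps.

    Assumption: a pair counts as co-occurring when its coefficient is
    above `threshold`; RMSE is taken over pairs present in both maps.
    """
    pred_pairs = {p for p, c in predicted.items() if c > threshold}
    ref_pairs = {p for p, c in reference.items() if c > threshold}
    hits = pred_pairs & ref_pairs

    precision = len(hits) / len(pred_pairs) if pred_pairs else 0.0
    recall = len(hits) / len(ref_pairs) if ref_pairs else 0.0
    if hits:
        squared = sum((predicted[p] - reference[p]) ** 2 for p in hits)
        rmse = math.sqrt(squared / len(hits))
    else:
        rmse = float("nan")
    return precision, recall, rmse


# Toy values for illustration only (not taken from the paper).
llm_output = {
    pair_key("bed", "pillow"): 0.9,
    pair_key("bed", "lamp"): 0.6,
    pair_key("desk", "lamp"): 0.5,
    pair_key("bed", "toaster"): 0.2,
}
dataset_pairs = {
    pair_key("pillow", "bed"): 0.8,
    pair_key("lamp", "bed"): 0.3,
    pair_key("desk", "chair"): 0.7,
}

precision, recall, rmse = evaluate(llm_output, dataset_pairs)
print(f"precision={precision:.2f} recall={recall:.2f} rmse={rmse:.2f}")
```

In this toy case the model proposes more pairs than the reference contains, so recall is higher than precision, which is the same qualitative pattern the abstract describes for GPT-4 against ScanNet v2.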
Pages: 186573-186585
Page count: 13