Validation of LLM-Generated Object Co-Occurrence Information for Understanding Three-Dimensional Scenes

Authors
Gunji, Kenta [1]
Ohno, Kazunori [1]
Kurita, Shuhei [2]
Sakurada, Ken [3]
Bezerra, Ranulfo [1]
Kojima, Shotaro [1]
Okada, Yoshito [1]
Konyo, Masashi [1]
Tadokoro, Satoshi [1]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai 9808579, Japan
[2] Natl Inst Informat, Tokyo 1018430, Japan
[3] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
Japan Society for the Promotion of Science;
Keywords
Robots; Three-dimensional displays; Chatbots; Standards; Knowledge graphs; Semantics; Navigation; Market research; Data models; Data mining; Semantic scene understanding; large language models; co-occurrence validation; prompt engineering;
DOI
10.1109/ACCESS.2024.3514473
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
This study examines whether object co-occurrence information generated by a large language model (LLM) can be applied to enhance a robot's spatial understanding of objects in the real world. Co-occurrence information is crucial in enabling robots to perceive and navigate their surroundings. An LLM can generate object co-occurrence information from the representations it acquired during training. However, whether co-occurrence gleaned from linguistic data translates effectively to real-world object relationships has yet to be thoroughly examined. Given category information about a specific situation, this paper compares the degree of co-occurrence (co-occurrence coefficient) output by gpt-4-turbo-2024-04-09 (GPT-4) against object pair data from the ScanNet v2 dataset. The results show that GPT-4 achieved a high recall of 0.78 across the situation categories annotated in ScanNet v2, although its precision was relatively low, averaging 0.29. The root mean square error of the co-occurrence coefficient was 0.31. While GPT-4 tends to output slightly higher co-occurrence coefficients, it effectively captures the overall co-occurrence patterns observed in the ScanNet v2 dataset. GPT-4 produced co-occurrence information for more objects than are available in ScanNet v2 while also covering the co-occurrences among objects within ScanNet v2. These results indicate that integrating co-occurrence data from different sources could enhance the ability to recognize real-world objects and potentially strengthen robot intelligence.
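The comparison described in the abstract (precision and recall over object pairs, plus root mean square error of the co-occurrence coefficient) can be illustrated with a minimal sketch. The abstract does not define the coefficient or say how pairs are thresholded, so everything here is an assumption: the coefficient is taken to be the fraction of scenes containing both objects, a pair counts as co-occurring when its coefficient exceeds a hypothetical `threshold`, and RMSE is computed only over pairs present in both sources. The scene lists and the `predicted` values are made-up toy data, not results from the paper.

```python
from itertools import combinations
from math import sqrt


def cooccurrence(scenes):
    """Assumed coefficient: fraction of scenes that contain both objects of a pair."""
    counts = {}
    for objects in scenes:
        # Sort so each unordered pair has exactly one canonical key.
        for pair in combinations(sorted(set(objects)), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: c / len(scenes) for pair, c in counts.items()}


def compare(predicted, reference, threshold=0.0):
    """Return (precision, recall, rmse) of predicted pairs against reference pairs."""
    pred_pairs = {p for p, v in predicted.items() if v > threshold}
    ref_pairs = {p for p, v in reference.items() if v > threshold}
    hits = pred_pairs & ref_pairs
    precision = len(hits) / len(pred_pairs) if pred_pairs else 0.0
    recall = len(hits) / len(ref_pairs) if ref_pairs else 0.0
    # RMSE over pairs found in both sources (an assumption of this sketch).
    if hits:
        rmse = sqrt(sum((predicted[p] - reference[p]) ** 2 for p in hits) / len(hits))
    else:
        rmse = 0.0
    return precision, recall, rmse


# Toy "dataset" scenes standing in for annotated scans (hypothetical data).
scenes = [
    ["bed", "lamp", "desk"],
    ["bed", "lamp"],
    ["desk", "chair"],
    ["bed", "desk"],
]
reference = cooccurrence(scenes)

# Toy coefficients standing in for an LLM's output (hypothetical data).
predicted = {
    ("bed", "lamp"): 0.75,
    ("bed", "desk"): 0.5,
    ("bed", "pillow"): 0.9,
    ("chair", "desk"): 0.5,
}

precision, recall, rmse = compare(predicted, reference)
print(f"precision={precision:.2f} recall={recall:.2f} rmse={rmse:.3f}")
```

In this toy setup, a pair such as ("bed", "pillow") lowers precision because it never appears in the reference scenes, which mirrors the abstract's observation that the model proposes co-occurrences for more objects than the dataset contains.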
Pages: 186573 - 186585
Number of pages: 13