Validation of LLM-Generated Object Co-Occurrence Information for Understanding Three-Dimensional Scenes

Cited by: 0

Authors
Gunji, Kenta [1 ]
Ohno, Kazunori [1 ]
Kurita, Shuhei [2 ]
Sakurada, Ken [3 ]
Bezerra, Ranulfo [1 ]
Kojima, Shotaro [1 ]
Okada, Yoshito [1 ]
Konyo, Masashi [1 ]
Tadokoro, Satoshi [1 ]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai 9808579, Japan
[2] Natl Inst Informat, Tokyo 1018430, Japan
[3] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan
Source
IEEE ACCESS | 2024, Vol. 12
Funding
Japan Society for the Promotion of Science;
Keywords
Robots; Three-dimensional displays; Chatbots; Standards; Knowledge graphs; Semantics; Navigation; Market research; Data models; Data mining; Semantic scene understanding; large language models; co-occurrence validation; prompt engineering;
DOI
10.1109/ACCESS.2024.3514473
Chinese Library Classification (CLC) number
TP [Automation technology; computer technology];
Subject classification code
0812;
Abstract
This study verifies whether object co-occurrence information generated by a large language model (LLM) can enhance a robot's spatial understanding of objects in the real world. Co-occurrence information is crucial for enabling robots to perceive and navigate their surroundings. An LLM can generate object co-occurrence information from the representations it acquires during training; however, whether co-occurrence gleaned from linguistic data effectively translates to real-world object relationships has yet to be thoroughly examined. Given category information about a specific situation, this paper compares and evaluates the co-occurrence degree (co-occurrence coefficient) output by gpt-4-turbo-2024-04-09 (GPT-4) against object pair data from the ScanNet v2 dataset. The results reveal that GPT-4 achieved a high recall of 0.78 across the situation categories annotated in ScanNet v2, although its precision was relatively low, averaging 0.29. The root mean square error of the co-occurrence coefficient was 0.31. While GPT-4 tends to output slightly higher co-occurrence coefficients, it effectively captures the overall co-occurrence patterns observed in ScanNet v2. GPT-4 also produced co-occurrence information for more objects than are available in ScanNet v2 while still covering the co-occurrences among objects within ScanNet v2. These results demonstrate that integrating co-occurrence data from different sources could enhance the recognition of real-world objects and potentially strengthen robot intelligence.
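As a rough illustration of the evaluation described in the abstract, the sketch below compares hypothetical LLM-generated co-occurrence coefficients with dataset-derived ones and reports precision, recall, and root mean square error. The pair representation, the 0.5 decision threshold, and the toy coefficient values are assumptions made for this example only; they are not taken from the paper.

    from math import sqrt

    def evaluate_cooccurrence(llm_coeff, dataset_coeff, threshold=0.5):
        """Compare LLM co-occurrence coefficients against dataset-derived ones.

        llm_coeff / dataset_coeff: dicts mapping an unordered object pair
        (stored as a frozenset) to a coefficient in [0, 1]. The threshold
        used to binarize the LLM output into "co-occurs / does not co-occur"
        is an assumption for this sketch, not the paper's choice.
        """
        dataset_pairs = set(dataset_coeff)  # pairs observed in the dataset
        predicted_pairs = {p for p, c in llm_coeff.items() if c >= threshold}

        tp = len(predicted_pairs & dataset_pairs)  # correctly predicted pairs
        precision = tp / len(predicted_pairs) if predicted_pairs else 0.0
        recall = tp / len(dataset_pairs) if dataset_pairs else 0.0

        # RMSE of the coefficients over pairs present in both sources.
        shared = [p for p in dataset_pairs if p in llm_coeff]
        rmse = (sqrt(sum((llm_coeff[p] - dataset_coeff[p]) ** 2 for p in shared) / len(shared))
                if shared else float("nan"))
        return precision, recall, rmse

    # Toy example with hypothetical coefficients (not values from the paper).
    llm = {frozenset({"desk", "chair"}): 0.9,
           frozenset({"bed", "pillow"}): 0.8,
           frozenset({"sink", "sofa"}): 0.6}
    scannet = {frozenset({"desk", "chair"}): 0.7,
               frozenset({"bed", "pillow"}): 0.85}
    print(evaluate_cooccurrence(llm, scannet))

With the toy inputs above, the function reports a recall of 1.0, a precision of about 0.67, and an RMSE of roughly 0.15, mirroring the kind of summary statistics the paper reports for GPT-4 against ScanNet v2.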
Pages: 186573 - 186585
Number of pages: 13