Validation of LLM-Generated Object Co-Occurrence Information for Understanding Three-Dimensional Scenes

Cited by: 0

Authors:

Gunji, Kenta [1]

Ohno, Kazunori [1]

Kurita, Shuhei [2]

Sakurada, Ken [3]

Bezerra, Ranulfo [1]

Kojima, Shotaro [1]

Okada, Yoshito [1]

Konyo, Masashi [1]

Tadokoro, Satoshi [1]

Affiliations:

[1] Tohoku Univ, Grad Sch Informat Sci, Sendai 9808579, Japan

[2] Natl Inst Informat, Tokyo 1018430, Japan

[3] Kyoto Univ, Grad Sch Informat, Kyoto 6068501, Japan

Source:

IEEE Access | 2024 / Vol. 12

Funding:

Japan Society for the Promotion of Science (JSPS);

Keywords:

Robots; Three-dimensional displays; Chatbots; Standards; Knowledge graphs; Semantics; Navigation; Market research; Data models; Data mining; Semantic scene understanding; large language models; co-occurrence validation; prompt engineering;

DOI:

10.1109/ACCESS.2024.3514473

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This study verifies whether object co-occurrence information generated by a large language model (LLM) can be applied to enhance a robot's spatial understanding of objects in the real world. Co-occurrence information is crucial in enabling robots to perceive and navigate their surroundings. An LLM can generate object co-occurrence information from the representations it acquires during training. However, it remains to be thoroughly examined whether co-occurrence gleaned from linguistic data translates effectively to real-world object relationships. Given category information about a specific situation, this paper compares and evaluates the co-occurrence degree (co-occurrence coefficient) output by gpt-4-turbo-2024-04-09 (GPT-4) against the object pair data from the ScanNet v2 dataset. The results show that GPT-4 achieved a high recall of 0.78 across the situation categories annotated in ScanNet v2, although its precision was relatively low, averaging 0.29. The root mean square error of the co-occurrence coefficient was 0.31. While GPT-4 tends to output slightly higher co-occurrence coefficients, it effectively captures the overall co-occurrence patterns observed in the ScanNet v2 dataset. GPT-4 produced co-occurrence information for more objects than are available in ScanNet v2 while still covering the co-occurrences among objects within ScanNet v2. These results demonstrate that integrating co-occurrence data from different sources could enhance the ability to recognize real-world objects and potentially strengthen robot intelligence.
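The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not define the co-occurrence coefficient or how pairs are matched, so everything here is an assumption made for illustration: a Jaccard-style coefficient computed over scenes, a pair counted as present when its coefficient exceeds a threshold, and RMSE computed over pairs found in both sources. The function names `cooccurrence_from_scenes` and `compare`, the toy scenes, and the predicted values are hypothetical and are not the authors' method or data.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_from_scenes(scenes):
    """Assumed Jaccard-style coefficient per object pair:
    (# scenes containing both) / (# scenes containing either).
    Pair keys are alphabetically sorted tuples."""
    counts = {}
    pair_counts = {}
    for scene in scenes:
        labels = sorted(set(scene))  # de-duplicate labels within a scene
        for label in labels:
            counts[label] = counts.get(label, 0) + 1
        for pair in combinations(labels, 2):
            pair_counts[pair] = pair_counts.get(pair, 0) + 1
    return {
        (a, b): n / (counts[a] + counts[b] - n)
        for (a, b), n in pair_counts.items()
    }


def compare(predicted, observed, threshold=0.0):
    """Return (precision, recall, rmse) of predicted pairs against observed pairs.
    A pair counts as present when its coefficient exceeds `threshold` (assumption).
    RMSE is taken over pairs present in both sources (assumption)."""
    pred_pairs = {p for p, c in predicted.items() if c > threshold}
    obs_pairs = {p for p, c in observed.items() if c > threshold}
    hits = pred_pairs & obs_pairs
    precision = len(hits) / len(pred_pairs) if pred_pairs else 0.0
    recall = len(hits) / len(obs_pairs) if obs_pairs else 0.0
    if hits:
        rmse = sqrt(sum((predicted[p] - observed[p]) ** 2 for p in hits) / len(hits))
    else:
        rmse = 0.0
    return precision, recall, rmse


if __name__ == "__main__":
    # Toy stand-in for annotated scenes of one situation category (not real ScanNet v2 data).
    scenes = [["bed", "lamp", "pillow"], ["bed", "pillow"], ["bed", "lamp"]]
    observed = cooccurrence_from_scenes(scenes)
    # Hypothetical LLM output; keys use the same alphabetical pair ordering.
    predicted = {
        ("bed", "lamp"): 0.8,
        ("bed", "pillow"): 0.9,
        ("bed", "curtain"): 0.5,
        ("curtain", "lamp"): 0.4,
    }
    precision, recall, rmse = compare(predicted, observed)
    print(f"precision={precision:.2f} recall={recall:.2f} rmse={rmse:.2f}")
```

In this toy setup the predicted set contains pairs involving an object ("curtain") that never appears in the observed scenes, which lowers precision without affecting recall. This mirrors the abstract's observation that GPT-4 produced co-occurrence information for more objects than ScanNet v2 contains.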

Pages: 186573-186585

Page count: 13

