A Comprehensive Evaluation of Neural SPARQL Query Generation From Natural Language Questions

Authors
Diallo, Papa Abdou Karim Karou [1]
Reyd, Samuel [2]
Zouaq, Amal [1]
Affiliations
[1] Polytech Montreal, Dept Comp Engn & Software Engn, LAMA WeST Lab, Montreal, PQ H3T 1J4, Canada
[2] Telecom Paris, F-91120 Palaiseau, France
Source
IEEE ACCESS | 2024 / Vol. 12
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Annotations; Large language models; Computer architecture; Transformers; Vocabulary; Query processing; Knowledge based systems; Encoding; SPARQL query generation; knowledge base; copy mechanism; non pre-trained; pre-trained encoders-decoders;
DOI
10.1109/ACCESS.2024.3453215
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, the field of neural machine translation (NMT) for SPARQL query generation has witnessed significant growth. Incorporating the copy mechanism with traditional encoder-decoder architectures and using pre-trained encoder-decoder and large language models have set new performance benchmarks. This paper presents various experiments that replicate and expand upon recent NMT-based SPARQL generation studies, comparing pre-trained language models (PLMs), non-pre-trained language models (NPLMs), and large language models (LLMs), highlighting the impact of question annotation and the copy mechanism and testing various fine-tuning methods using LLMs. In particular, we provide a systematic error analysis of the models and test their generalization ability. Our study demonstrates that the copy mechanism yields significant performance enhancements for most PLMs and NPLMs. Annotating the data is pivotal to generating correct URIs, with the "tag-within" strategy emerging as the most effective approach. Additionally, our findings reveal that the primary source of errors stems from incorrect URIs in SPARQL queries that are sometimes replaced with hallucinated URIs when using base models. This does not happen using the copy mechanism, but it sometimes leads to selecting wrong URIs among candidates. Finally, the performance of the tested LLMs fell short of achieving the desired outcomes.
Pages: 125057-125078
Page count: 22