A novel approach to measuring the scope of patent claims based on probabilities obtained from (large) language models

Cited by: 0
Authors
Ragot, Sebastien [1 ]
Affiliations
[1] E Blum & Co Ltd, Patent & Trademark Attorneys VSP, Vorderberg 11, CH-8044 Zurich, Switzerland
Keywords
Patent scope; Patent value; Patent claims; Language models; Large language models; GPT; Information theory; Self-information
DOI
10.1016/j.wpi.2024.102321
Chinese Library Classification (CLC)
G25 [Library Science and Librarianship]; G35 [Information Science and Information Work]
Discipline Classification Code
1205; 120501
Abstract
This work proposes to measure the scope of a patent claim as the reciprocal of the self-information contained in the claim. The self-information is calculated from a probability of occurrence of the claim, where this probability is obtained from a language model. Grounded in information theory, the approach rests on the assumption that an unlikely concept is more informative than a common one, insofar as it is more surprising; in turn, the more surprising the information required to define the claim, the narrower its scope. Seven language models are considered, ranging from the simplest models (in which each word or character has an identical probability), through intermediate models (based on average word or character frequencies), to large language models (LLMs) such as GPT2 and davinci-002. Remarkably, when the simplest language models are used to compute the probabilities, the scope becomes proportional to the reciprocal of the number of words or characters in the claim, a metric already used in previous works. The approach is applied to multiple series of patent claims directed to distinct inventions, where each series consists of claims of gradually decreasing scope. The performance of the language models is then assessed through several ad hoc tests. The LLMs outperform the models based on word and character frequencies, which in turn outdo the simplest models based on word or character counts. Interestingly, however, the character count proves a more reliable indicator than the word count.
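To make the proposed measure concrete, the sketch below estimates the self-information of a claim from token probabilities returned by GPT-2 (via the Hugging Face transformers library) and takes its reciprocal as the scope. This is a minimal illustration of the definition given in the abstract, not the author's exact implementation; the sample claim text is hypothetical, and details such as tokenization, prompting, and normalization may differ in the paper.

```python
# Minimal sketch: claim scope as the reciprocal of self-information,
# with the claim probability taken from GPT-2 (Hugging Face transformers).
# The sample claim is hypothetical; the paper's exact setup may differ.
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

claim = ("A container comprising a body and a lid, wherein the lid "
         "is hinged to the body.")  # hypothetical claim text

ids = tokenizer(claim, return_tensors="pt").input_ids
with torch.no_grad():
    # With labels=input_ids, the model returns the mean negative
    # log-likelihood (natural log) over the predicted tokens.
    loss = model(ids, labels=ids).loss.item()

n_predicted = ids.size(1) - 1  # internal label shift: N-1 predictions
self_info_bits = loss * n_predicted / math.log(2)  # -log2 P(claim)
scope = 1.0 / self_info_bits  # scope = 1 / self-information

print(f"Self-information: {self_info_bits:.1f} bits, scope: {scope:.5f}")
```

Under the simplest model mentioned in the abstract, where each of V vocabulary words is equally likely, an N-word claim has probability V^(-N), so its self-information is N * log2(V) and the scope is proportional to 1/N, recovering the word-count metric used in previous works.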
Pages: 29