SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications

Cited by: 0
Authors
Zhong, Zexuan [1]
Guo, Jiaqi [2]
Yang, Wei [3]
Peng, Jian [1]
Xie, Tao [1]
Lou, Jian-Guang [4]
Liu, Ting [2]
Zhang, Dongmei [4]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Xi An Jiao Tong Univ, Xian, Peoples R China
[3] Univ Texas Dallas, Richardson, TX 75083 USA
[4] Microsoft Res Asia, Beijing, Peoples R China
Funding
National Science Foundation (USA); National Natural Science Foundation of China;
Keywords
INFERENCE;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent research proposes syntax-based approaches to address the problem of generating programs from natural language specifications. These approaches typically train a sequence-to-sequence learning model using a syntax-based objective: maximum likelihood estimation (MLE). Such syntax-based approaches do not effectively address the goal of generating semantically correct programs, because they fail to handle Program Aliasing, i.e., the fact that semantically equivalent programs may have many syntactically different forms. To address this issue, we propose a semantics-based approach named SemRegex. SemRegex provides solutions for a subtask of the program-synthesis problem: generating regular expressions from natural language. Unlike the existing syntax-based approaches, SemRegex trains the model by maximizing the expected semantic correctness of the generated regular expressions. Semantic correctness is measured using the DFA-equivalence oracle, random test cases, and distinguishing test cases. Experiments on three public datasets demonstrate the superiority of SemRegex over the existing state-of-the-art approaches.
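The Program Aliasing problem and the idea of test-case-based semantic correctness can be illustrated with a small Python sketch. This is not the SemRegex implementation: the function names (`random_strings`, `agree_on_tests`, `syntactic_reward`, `semantic_reward`), the alphabet, and the example patterns are all assumptions made for illustration. Agreement on random test strings only approximates equivalence; an exact check would compare the corresponding DFAs.

```python
import random
import re


def random_strings(alphabet, max_len, count, seed=0):
    """Generate `count` random test strings over `alphabet` (hypothetical helper)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))
        for _ in range(count)
    ]


def agree_on_tests(pattern_a, pattern_b, tests):
    """Return True if both regexes accept exactly the same test strings."""
    a = re.compile(pattern_a)
    b = re.compile(pattern_b)
    return all(
        (a.fullmatch(t) is not None) == (b.fullmatch(t) is not None)
        for t in tests
    )


def syntactic_reward(predicted, reference):
    """Syntax-based signal: 1.0 only for an exact string match."""
    return 1.0 if predicted == reference else 0.0


def semantic_reward(predicted, reference, tests):
    """Semantics-based signal (approximate): 1.0 if the regexes agree on all tests."""
    return 1.0 if agree_on_tests(predicted, reference, tests) else 0.0


tests = random_strings("ab", max_len=6, count=500)

reference = r"[ab]*a"   # strings over {a, b} that end in "a"
alias = r"(a|b)*a"      # same language, different syntax (program aliasing)
wrong = r"[ab]*b"       # different language

print(syntactic_reward(alias, reference))        # alias is penalized syntactically
print(semantic_reward(alias, reference, tests))  # but rewarded semantically
print(semantic_reward(wrong, reference, tests))  # a wrong regex is rejected by the tests
```

In this sketch the aliased pattern receives no credit under exact string matching even though it accepts the same strings as the reference, which is the mismatch a semantics-based training signal is meant to remove.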
Pages: 1608 - 1618
Page count: 11