SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications

Cited by: 0
Authors
Zhong, Zexuan [1]
Guo, Jiaqi [2]
Yang, Wei [3]
Peng, Jian [1]
Xie, Tao [1]
Lou, Jian-Guang [4]
Liu, Ting [2]
Zhang, Dongmei [4]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Xi An Jiao Tong Univ, Xian, Peoples R China
[3] Univ Texas Dallas, Richardson, TX 75083 USA
[4] Microsoft Res Asia, Beijing, Peoples R China
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
INFERENCE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent research proposes syntax-based approaches to address the problem of generating programs from natural language specifications. These approaches typically train a sequence-to-sequence learning model using a syntax-based objective: maximum likelihood estimation (MLE). Such syntax-based approaches do not effectively address the goal of generating semantically correct programs, because these approaches fail to handle Program Aliasing, i.e., semantically equivalent programs may have many syntactically different forms. To address this issue, in this paper, we propose a semantics-based approach named SemRegex. SemRegex provides solutions for a subtask of the program-synthesis problem: generating regular expressions from natural language. Different from the existing syntax-based approaches, SemRegex trains the model by maximizing the expected semantic correctness of the generated regular expressions. The semantic correctness is measured using the DFA-equivalence oracle, random test cases, and distinguishing test cases. The experiments on three public datasets demonstrate the superiority of SemRegex over the existing state-of-the-art approaches.
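
The Program Aliasing problem and a test-case-based notion of semantic correctness can be illustrated with a minimal Python sketch. This is not the authors' implementation: SemRegex is described as using a DFA-equivalence oracle together with random and distinguishing test cases, whereas the sketch below approximates equivalence by enumerating every string up to a bounded length over a small alphabet. The function names (`bounded_strings`, `distinguishing_string`, `semantic_reward`), the alphabet "ab", and the length bound are all assumptions made for illustration.

```python
import itertools
import re


def bounded_strings(alphabet, max_len):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            yield "".join(chars)


def distinguishing_string(regex_a, regex_b, alphabet="ab", max_len=6):
    """Return a string accepted by exactly one regex, or None if none is found.

    Bounded enumeration only approximates equivalence; it is a stand-in
    for the DFA-equivalence oracle named in the abstract (assumption).
    """
    pat_a, pat_b = re.compile(regex_a), re.compile(regex_b)
    for s in bounded_strings(alphabet, max_len):
        if bool(pat_a.fullmatch(s)) != bool(pat_b.fullmatch(s)):
            return s
    return None


def semantic_reward(candidate, reference, **kwargs):
    """Hypothetical reward: 1.0 if no distinguishing string is found, else 0.0."""
    found = distinguishing_string(candidate, reference, **kwargs)
    return 1.0 if found is None else 0.0


reference = "(a|b)*a"  # strings over {a, b} that end in "a"
alias = "[ab]*a"       # different syntax, same language
wrong = "(a|b)*b"      # similar syntax, different language

print(alias == reference)                       # syntactic comparison
print(semantic_reward(alias, reference))        # semantic comparison
print(semantic_reward(wrong, reference))
print(distinguishing_string(reference, wrong))  # a witness of the difference
```

A purely syntactic objective treats `alias` as a miss because its text differs from `reference`, while the semantic check accepts it and rejects `wrong`, which differs from the reference by a single character.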
Pages: 1608-1618
Page count: 11