SemRegex: A Semantics-Based Approach for Generating Regular Expressions from Natural Language Specifications

Citations: 0
Authors
Zhong, Zexuan [1]
Guo, Jiaqi [2]
Yang, Wei [3]
Peng, Jian [1]
Xie, Tao [1]
Lou, Jian-Guang [4]
Liu, Ting [2]
Zhang, Dongmei [4]
Affiliations
[1] Univ Illinois, Urbana, IL 61801 USA
[2] Xi An Jiao Tong Univ, Xian, Peoples R China
[3] Univ Texas Dallas, Richardson, TX 75083 USA
[4] Microsoft Res Asia, Beijing, Peoples R China
Funding
National Science Foundation (USA); National Natural Science Foundation of China;
Keywords
INFERENCE;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent research proposes syntax-based approaches to address the problem of generating programs from natural language specifications. These approaches typically train a sequence-to-sequence learning model using a syntax-based objective: maximum likelihood estimation (MLE). Such syntax-based approaches do not effectively address the goal of generating semantically correct programs, because these approaches fail to handle Program Aliasing, i.e., semantically equivalent programs may have many syntactically different forms. To address this issue, in this paper, we propose a semantics-based approach named SemRegex. SemRegex provides solutions for a subtask of the program-synthesis problem: generating regular expressions from natural language. Different from the existing syntax-based approaches, SemRegex trains the model by maximizing the expected semantic correctness of the generated regular expressions. The semantic correctness is measured using the DFA-equivalence oracle, random test cases, and distinguishing test cases. The experiments on three public datasets demonstrate the superiority of SemRegex over the existing state-of-the-art approaches.
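The Program Aliasing problem and the idea of checking semantic correctness with test cases can be illustrated with a minimal sketch. This is not the SemRegex implementation: the function name `find_disagreement`, its parameters, and the use of Python's `re` module in place of a DFA-equivalence oracle are all assumptions made for illustration. The sketch samples random strings and reports one on which two regular expressions disagree, i.e. a distinguishing test case.

```python
import random
import re


def find_disagreement(regex_a, regex_b, alphabet="ab", trials=2000, max_len=6, seed=0):
    """Hypothetical helper: sample random strings over `alphabet` and return
    one that exactly one of the two regexes fully matches, or None."""
    rng = random.Random(seed)
    pat_a, pat_b = re.compile(regex_a), re.compile(regex_b)
    for _ in range(trials):
        length = rng.randint(0, max_len)
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        accepts_a = pat_a.fullmatch(candidate) is not None
        accepts_b = pat_b.fullmatch(candidate) is not None
        if accepts_a != accepts_b:
            return candidate  # a distinguishing test case
    return None  # no difference observed; this is evidence, not a proof


# Program aliasing: different syntax, same accepted language.
print(find_disagreement(r"(a|b)*", r"[ab]*"))  # prints None

# Similar-looking but semantically different: a string containing "ba"
# is accepted by the first regex and rejected by the second.
print(find_disagreement(r"[ab]*", r"a*b*"))
```

A syntax-based objective would penalize `(a|b)*` when the reference is `[ab]*`, although the two accept the same strings; a semantics-based signal such as the check above would not. Random testing can only fail to find a difference, so an exact equivalence check (for example, comparing minimized DFAs) is still needed to confirm equivalence.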
Pages: 1608 - 1618
Page count: 11
Related papers
50 in total
  • [31] Towards a semantics-based approach in the development of geographic portals
    Athanasis, Nikolaos
    Kalabokidis, Kostas
    Vaitis, Michail
    Soulakellis, Nikolaos
    COMPUTERS & GEOSCIENCES, 2009, 35 (02) : 301 - 308
  • [32] Frame Semantics-based Approach to Spanish Textual Categorization
    Crespo Miguel, Mario
    Frias Delgado, Antonio
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2008, (41) : 65 - 71
  • [33] A Semantics-Based Approach to Concept Assignment in Assembly Code
    Sisco, Zachary
    Bryant, Adam
    PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON CYBER WARFARE AND SECURITY (ICCWS 2017), 2017, : 341 - 351
  • [34] A SEMANTICS-BASED APPROACH FOR THE DESIGN AND IMPLEMENTATION OF INTERACTION OBJECTS
    PATERNO, F
    LEONARDI, A
    COMPUTER GRAPHICS FORUM, 1994, 13 (03) : C195 - C204
  • [35] Language issues in generating simulations from specifications
    Toal, RJ
    Dickson, CL
    PROCEEDINGS OF THE FOURTH IASTED INTERNATIONAL CONFERENCE ON MODELLING, SIMULATION, AND OPTIMIZATION, 2004, : 281 - 286
  • [36] Towards a Trustworthy Semantics-Based Language Framework via Proof Generation
    Chen, Xiaohong
    Lin, Zhengyao
    Minh-Thai Trinh
    Rosu, Grigore
    COMPUTER AIDED VERIFICATION, PT II, CAV 2021, 2021, 12760 : 477 - 499
  • [37] Multimedia context interpretation: a semantics-based cooperative indexing approach
    Maree, Mohammed
    NEW REVIEW OF HYPERMEDIA AND MULTIMEDIA, 2020, 26 (1-2) : 24 - 54
  • [38] Automatic simplification of obfuscated JavaScript code: A semantics-based approach
    Department of Computer Science, University of Arizona, Tucson, AZ 85721, United States
    Proc. IEEE Int. Conf. Softw. Secur. Reliab., SERE, (31-40):
  • [39] Semantics-based video indexing using a stochastic modeling approach
    Wei, Yong
    Bhandarkar, Suchendra M.
    Li, Kang
    2007 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-7, 2007, : 2009 - 2012
  • [40] GeoCosm: A semantics-based approach for information integration of geospatial data
    Ram, S
    Khatri, V
    Zhang, LM
    Zeng, DD
    CONCEPTUAL MODELING FOR NEW INFORMATION SYSTEMS TECHNOLOGIES, 2002, 2465 : 152 - 165