A formal language model for parsing SGML

Cited by: 1
Authors
Matzen, RW [1]
George, KM [1]
Hedrick, GE [1]
Affiliation
[1] OKLAHOMA STATE UNIV,DEPT COMP SCI,STILLWATER,OK 74078
DOI
10.1016/0164-1212(95)00199-9
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
The Standard Generalized Markup Language (SGML) is an international standard for document definition (ISO 8879) that was adopted in 1986 and is rapidly gaining acceptance in industry and government. It is a meta-language system for document design rather than a specific scheme for document processing; almost any kind of document can be described using SGML. Productions called element declarations are used to define arbitrary elements of documents and the context in which they can occur. A finite set of element declarations, called a document type definition (DTD), defines the high-level syntax of a set of documents. DTDs are similar to context-free grammars, but the productions are more complex. The standard does not describe a formal language model for SGML, and there is little work in the literature on this topic. This article defines a formal language model for SGML: systems of finite automata constructed from systems of regular expressions. This model is applied in two ways: a parser is constructed for DTDs, and methods are shown for automatically constructing parsers for the documents defined by a DTD. These methods for parsing SGML are new, and they include features of DTDs that have not previously been included in a static language model. The model applies directly to the syntactic constructs of SGML, and thus the methods shown in this article have distinct advantages for parsing SGML over traditional context-free parsing methods. © 1997 by Elsevier Science Inc.
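The idea of treating a DTD as a system of regular expressions, one per element declaration, can be illustrated with a minimal sketch. It is not the construction from the article: the toy DTD, the names `TOY_DTD`, `compile_model`, and `validate`, and the tuple-based document representation are all assumptions made for illustration. Python's `re` engine stands in for an explicitly constructed finite automaton, and only the `,`, `|`, `*`, `+`, and `?` connectors are handled (SGML's `&` connector, inclusions, and exclusions are left out).

```python
import re

# Hypothetical toy DTD: element name -> content model over child element names.
# "," is sequence, "|" is choice; "*", "+", "?" are occurrence indicators.
TOY_DTD = {
    "memo": "(to, from, para+)",
    "to": "EMPTY",
    "from": "EMPTY",
    "para": "(emph | quote)*",
    "emph": "EMPTY",
    "quote": "EMPTY",
}


def compile_model(model):
    """Compile one content model into a regex over 'name ' tokens."""
    if model == "EMPTY":
        return re.compile("")  # matches only an empty child sequence

    def translate(match):
        token = match.group(0)
        if token[0].isalpha():
            # An element name matches itself followed by a separator space.
            return "(?:%s )" % re.escape(token)
        if token == "," or token.isspace():
            return ""  # sequence is plain concatenation in a regex
        return token  # parentheses, "|", "*", "+", "?" carry over unchanged

    pattern = re.sub(r"[A-Za-z][A-Za-z0-9]*|\s+|.", translate, model)
    return re.compile(pattern)


def validate(node, matchers):
    """Check a (name, children) tree against the per-element matchers."""
    name, children = node
    if name not in matchers:
        return False
    child_names = "".join(child[0] + " " for child in children)
    if matchers[name].fullmatch(child_names) is None:
        return False
    return all(validate(child, matchers) for child in children)


# One matcher per element declaration: the "system" of expressions.
MATCHERS = {name: compile_model(model) for name, model in TOY_DTD.items()}

good = ("memo", [("to", []), ("from", []), ("para", [("emph", [])]), ("para", [])])
bad = ("memo", [("from", []), ("to", []), ("para", [])])

print(validate(good, MATCHERS))
print(validate(bad, MATCHERS))
```

Each element declaration yields its own matcher, and a document is accepted only if every element's sequence of children is accepted by the matcher for that element, which is the sense in which a DTD behaves like a grammar whose right-hand sides are regular expressions.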
Pages: 147-166
Page count: 20
Related papers
  • [1] Formal language model for parsing SGML
    Oklahoma State Univ, Stillwater, United States
    J Syst Software, 2 (147-166):
  • [2] FORMAL PARSING SYSTEMS
    GREIBACH, SA
    COMMUNICATIONS OF THE ACM, 1964, 7 (08) : 499 - 504
  • [3] NeuralParse a neural model for parsing natural language
    Salerno, J
    SMC '97 CONFERENCE PROCEEDINGS - 1997 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5: CONFERENCE THEME: COMPUTATIONAL CYBERNETICS AND SIMULATION, 1997, : 2963 - 2968
  • [4] A formal frame for robust parsing
    Vilares, M
    Daniba, VM
    Vilares, J
    Ribadas, FJ
    THEORETICAL COMPUTER SCIENCE, 2004, 328 (1-2) : 171 - 186
  • [5] The SGML character model
    Peterson, D
    SGML '96 CONFERENCE PROCEEDINGS - CELEBRATING A DECADE OF SGML, 1996, : 681 - 685
  • [6] The SGML character model
    Peterson, D
    SGML EUROPE '97 - CONFERENCE PROCEEDINGS, 1997, : 245 - 250
  • [7] A SYNTACTIC LANGUAGE MODEL BASED ON INCREMENTAL CCG PARSING
    Hassan, Hany
    Sima'an, Khalil
    Way, Andy
    2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS, 2008, : 205 - +
  • [8] Modal Dependency Parsing via Language Model Priming
    Yao, Jiarui
    Xue, Nianwen
    Min, Bonan
    NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 2913 - 2919
  • [9] A Language Specification Tool for Model-Based Parsing
    Quesada, Luis
    Berzal, Fernando
    Cubero, Juan-Carlos
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2011, 2011, 6936 : 50 - 57
  • [10] FORMAL MODEL OF THE RUSSIAN LANGUAGE - SYNTAX
    TUZOV, VA
    CYBERNETICS, 1983, 19 (06) : 857 - 866