This is grammar-fw.info, produced by makeinfo version 4.3 from
grammar-fw.texi.

This manual documents Grammar Development with Semantic.

Copyright (C) 2004 Eric M. Ludlam
Copyright (C) 2004 David Ponce
Copyright (C) 2004 Richard Y. Kim

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.1 or any later version published by the Free Software
     Foundation; with the Invariant Sections being list their titles,
     with the Front-Cover Texts being list, and with the Back-Cover
     Texts being list. A copy of the license is included in the
     section entitled "GNU Free Documentation License".

INFO-DIR-SECTION Emacs
START-INFO-DIR-ENTRY
* Semantic Grammar Framework: (grammar-fw).
END-INFO-DIR-ENTRY


File: grammar-fw.info, Node: Top, Next: Overview, Up: (dir)

Semantic Grammar Framework Manual
*********************************

The Semantic Grammar Framework provides a consistent way to write
rule-based grammars for use in Emacs. This document describes how to
use the grammar writing environment, and how to write in the rule-based
language.

* Menu:

* Overview::
* Grammar File::
* Working with grammars::
* Adding a new grammar mode::
* GNU Free Documentation License::
* Index::


File: grammar-fw.info, Node: Overview, Next: Grammar File, Prev: Top, Up: Top

Overview
********

semantic version 2 introduced a new grammar framework to provide a
clean and consistent view into parser writing.

   All grammars are specified in a common rule-based language derived
from the input grammar language used by the GNU parser generator Bison.
The main differences are that:

   - The C-like syntax is replaced by Emacs Lisp syntax.

   - Percent declarations are specific to semantic.

   Nevertheless, for those who are familiar with Bison grammar syntax
and have some knowledge of Emacs Lisp, writing semantic grammars won't
be hard. Moreover, the grammar framework provides Emacs goodies
(indentation, syntax coloring, etc.) to help edit grammars.

   Grammars written in the common rule-based language must be
translated into Emacs Lisp code so that a semantic parser can use them.
The framework defines a reusable and flexible API that simplifies the
implementation of grammar-to-lisp translators.

   The _abstract_ major mode `semantic-grammar-mode' provides the core
functionality to edit and translate input grammars. _Concrete_ grammar
modes, derived from `semantic-grammar-mode', implement the common API
that translates the input grammar format into Emacs Lisp code
understandable by a particular parser.

   A unique file extension associates input grammars with a _concrete_
grammar mode, that is, with a particular translator and parser.

   semantic provides these _concrete_ grammar modes:

   - `bovine-grammar-mode', *Note BY grammars::.

   - `wisent-grammar-mode', *Note WY grammars::.

   The following diagram presents a global view of the framework:

           `Bovine' parser  !  `Wisent' parser  !   Other parser
                            !                   !
   /!    + - - - - - - - - -!- - - - - - - - - -!- - - - - - - - - +
   ||    !              Common Semantic Grammar Mode               !
   ||    ! ,,,,,,,,,,,,,,,, ! ,,,,,,,,,,,,,,,,  ! ,,,,,,,,,,,,,,,, !
Grammar  ! ; `BY' Grammar ; ! ; `WY' Grammar ;  ! ; `??' Grammar ; !
Framework! ;              ; ! ;              ;  ! ;              ; !
   ||    ! ; Elisp Gen.   ; ! ; Elisp Gen.   ;  ! ; Elisp Gen.   ; !
   ||    ! '''''''!'''''''' ! '''''''!''''''''  ! '''''''!'''''''' !
   !/    + - - - -|- - - - -!- - - - | - - - - -!- - - - | - - - - +
                  |         !        |          !        |
   /!       ,-----V------,  !  ,-----V------,   !  ,-----V------,
   ||       .  Language  .  !  .  Language  .   !  .  Language  .
   ||      ` Support Lib. ' ! ` Support Lib. '  ! ` Support Lib. '
   ||       `--!------!--'  !  `--!------!--'   !  `--!------!--'
   ||          |      |     !     |      |      !     |      |
   ||      +---V---+  |     ! +---V---+  |      ! +---V---+  |
   ||      | Lexer |  |     ! | Lexer |  |      ! | Lexer |  |
Parser     +-------+  |     ! +-------+  |      ! +-------+  |
Framework             |     !            |      !            |
   ||      +- -- -- --! --+ ! +- -- -- --V --+  ! +- -- -- --V --+
   ||      ! No Compiler  ! ! !LALR Compiler !  ! !Optional Comp.!
   ||      +- -- -- --! --+ ! +- -- -- --! --+  ! +- -- -- --! --+
   ||                 |     !            |      !            |
   ||      +----------V---+ ! +----------V---+  ! +----------V---+
   ||      | LL Parser    | ! | LR Parser    |  ! | Parser Engine|
   !/      +--------------+ ! +--------------+  ! +--------------+
                            !                   !


File: grammar-fw.info, Node: Grammar File, Next: Working with grammars, Prev: Overview, Up: Top

Grammar File
************

A semantic grammar file has four main sections, and looks like this:

     %{
     PROLOGUE
     %}

     DECLARATIONS

     %%
     GRAMMAR RULES
     %%

     EPILOGUE

   Comments are like Emacs Lisp ones but start with two consecutive
semicolons (`;;') instead of a single one. The single semicolon (`;')
is the end-of-rule delimiter.

* Menu:

* Terminal and Nonterminal Symbols::
* Prologue::
* Declarations::
* Grammar Rules::
* Epilogue::


File: grammar-fw.info, Node: Terminal and Nonterminal Symbols, Next: Prologue, Up: Grammar File

Terminal and Nonterminal Symbols
================================

"Symbols" in semantic grammars represent the grammatical
classifications of the language. They are valid Emacs Lisp symbols
that contain no special punctuation requiring escaping.

   A "terminal symbol" (also known as a "lexical token class")
represents a class of syntactically equivalent tokens. You use the
symbol in grammar rules to mean that a token in that class is allowed.
Bison's convention recommends it should be all upper case.

   Terminal symbols are part of the lexical tokens returned by the
function `semantic-lex-analyzer'. They indicate what kind of token has
been read. For more on lexical analysis, *note
(semantic-langdev)Writing Lexers::.

   Terminal symbols are declared using `%keyword' (*note keyword
Decl::) or `%token' (*note token Decl::) statements.

     *Please note:* Terminal symbols can be "literal characters", which
     have the same syntax used in C for character constants; for
     example, `'+'' is a "literal character token class". A character
     token class doesn't need to be declared, and can be used in
     precedence declarations like other terminal symbols (*note
     precedence Decl::).

     For now, semantic lexers don't handle literal character terminals.

   A "nonterminal symbol" stands for a class of syntactically
equivalent groupings. The symbol name is used in writing grammar
rules. Bison's convention recommends it should be all lower case.

   The symbol `error' is a terminal symbol reserved for error recovery;
you shouldn't use it for any other purpose.


File: grammar-fw.info, Node: Prologue, Next: Declarations, Prev: Terminal and Nonterminal Symbols, Up: Grammar File

Prologue
========

The PROLOGUE section contains definitions of Emacs Lisp variables,
functions, and macros that are used in the actions in the grammar
rules. These are copied to the beginning of the generated parser file
so that they precede the definition of the grammar rule actions. You
can use `require' to get the declarations from other libraries. If you
don't need any Emacs Lisp declarations, you may omit this section.

   You may have more than one PROLOGUE section, intermixed with other
DECLARATIONS. All PROLOGUE sections are copied to the generated parser
file in the order they appear in the grammar.


File: grammar-fw.info, Node: Declarations, Next: Grammar Rules, Prev: Prologue, Up: Grammar File

Declarations
============

The DECLARATIONS section contains declarations that define terminal
symbols, specify precedence, and so on. In some simple grammars you
may not need any declarations.

   Declarations of terminal symbols define the symbols used in
formulating the grammar and the type associated with categories of
tokens (*note Terminal and Nonterminal Symbols::).
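
   As a rough orientation before the individual statements are
described, a small DECLARATIONS section might look like the sketch
below. Every name in it (`example-wy', `example-mode', `expr', the
keywords and tokens, and the summary text) is hypothetical and chosen
only for illustration; each statement is documented in the nodes that
follow.

     ;; Library generated from this grammar, and the major mode it serves.
     %package example-wy
     %languagemode example-mode

     ;; Goal nonterminal.
     %start expr

     ;; Reserved words, with a help string for IF.
     %keyword IF   "if"
     %keyword ELSE "else"
     %put     IF   summary "if <condition> ... else ..."

     ;; Other terminals, grouped by lexical type.
     %type  <number>
     %token <number> NUMBER
     %type  <punctuation>
     %token <punctuation> PLUS "+"

     ;; Precedence (meaningful only in LALR grammars).
     %left PLUS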

* Menu:

* package Decl::
* languagemode Decl::
* keyword Decl::
* put Decl::
* token Decl::
* type Decl::
* precedence Decl::
* default-prec Decl::
* quotemode Decl::
* scopestart Decl::
* start Decl::
* use-macros Decl::


File: grammar-fw.info, Node: package Decl, Next: languagemode Decl, Up: Declarations

package Decl
------------

 - %-Decl: %package library-name
     Declare the Emacs Lisp library created from the grammar. This
     will generate an Emacs Lisp file named `LIBRARY-NAME.el', and
     provide the LIBRARY-NAME feature at the end of the generated file
     with:

          (provide 'LIBRARY-NAME)

     All variable and function names generated from the DECLARATIONS
     section will be prefixed by `LIBRARY-NAME-', following Emacs
     standard coding conventions.

     If there is no `%package' statement, a default LIBRARY-NAME is
     used, of the form:

          GRAMMAR_FILENAME_SANS_EXTENSION-GRAMMAR_FILENAME_EXTENSION

     For instance, the default library name for the grammar in the
     `foo.wy' file is `foo-wy'.


File: grammar-fw.info, Node: languagemode Decl, Next: keyword Decl, Prev: package Decl, Up: Declarations

languagemode Decl
-----------------

 - %-Decl: %languagemode MODE1...
     Declare in which major modes Emacs edits the sources that semantic
     parses using this grammar.

     For instance, the following declares that the grammar will be used
     to parse files edited in `c-mode' or `c++-mode' by Emacs.

          %languagemode c-mode c++-mode

     Typically, when a parser file is re-generated from a modified
     grammar, language modes are used to refresh local parser settings
     in buffers currently edited in those major modes.


File: grammar-fw.info, Node: keyword Decl, Next: put Decl, Prev: languagemode Decl, Up: Declarations

keyword Decl
------------

 - %-Decl: %keyword KEYWORD-NAME KEYWORD-VALUE
     Declare a language keyword (a reserved word).

    KEYWORD-NAME
          Is the terminal symbol used in grammar rules to represent
          this reserved word.

    KEYWORD-VALUE
          Is the actual value of the keyword as a string.

     Here is how the `if', `else', `endif' keywords might be declared:

          %keyword IF    "if"
          %keyword ELSE  "else"
          %keyword ENDIF "endif"

   Keywords have the implicit reserved type `keyword' (*note type
Decl::).

   In the generated library, keyword declarations are defined in the
constant `LIBRARY-NAME--keyword-table'. The keyword table value is an
Emacs Lisp obarray, available at run time in the parsed buffer, in the
buffer local variable `semantic-flex-keywords-obarray'. However, you
shouldn't use that variable directly. semantic provides the following
API to use with language keywords at run time.

 - Function: semantic-lex-keyword-symbol name
     Return keyword symbol with NAME or `nil' if not found.

 - Function: semantic-lex-keyword-p name
     Return non-`nil' if a keyword with NAME exists in the keyword
     table. Return `nil' otherwise.

     *Compatibility*: `semantic-lex-keyword-p' introduced in semantic
     version 2.0 supersedes `semantic-flex-keyword-p' which is now
     obsolete.

 - Function: semantic-lex-keyword-set name value
     Set value of keyword with NAME to VALUE and return VALUE.

 - Function: semantic-lex-keyword-value name
     Return value of keyword with NAME. Signal an error if a keyword
     with NAME does not exist.

 - Function: semantic-lex-map-keywords fun &optional property
     Call function FUN on every semantic keyword. If optional PROPERTY
     is non-`nil', call FUN only on every keyword which has a PROPERTY
     value. FUN receives a semantic keyword as argument.

     *Compatibility*: `semantic-lex-map-keywords' introduced in
     semantic version 2.0 supersedes `semantic-flex-map-keywords' which
     is now obsolete.

 - Function: semantic-lex-keywords &optional property
     Return a list of semantic keywords. If optional PROPERTY is
     non-`nil', return only keywords which have a PROPERTY set (*note
     put Decl::).

     *Compatibility*: `semantic-lex-keywords' introduced in semantic
     version 2.0 supersedes `semantic-flex-keywords' which is now
     obsolete.
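
   As a rough illustration of the run-time API above, the following
sketch builds a small keyword table by hand and queries it. It relies
on the helper `semantic-lex-make-keyword-table' from the semantic lexer
library, which this manual does not describe, and it binds
`semantic-flex-keywords-obarray' directly only to stay self-contained;
in a parsed buffer the generated library is expected to set that
variable for you. The keywords and summary strings are made up for the
example.

     ;; The library is named `semantic/lex' in the CEDET bundled with
     ;; Emacs; the standalone semantic 2.0 distribution is assumed to
     ;; call it `semantic-lex' instead.
     (require 'semantic/lex)

     (let ((semantic-flex-keywords-obarray
            ;; Hypothetical table: (NAME . TOKEN) pairs, followed by
            ;; (NAME PROPERTY VALUE) property specifications.
            (semantic-lex-make-keyword-table
             '(("if" . IF) ("else" . ELSE))
             '(("if" summary "if (<condition>) ... else ...")))))
       (list
        ;; Non-nil, because "if" is in the table.
        (semantic-lex-keyword-p "if")
        ;; nil, because "while" was never declared.
        (semantic-lex-keyword-p "while")
        ;; The help string attached through the `summary' property.
        (semantic-lex-keyword-get "if" 'summary)
        ;; Attach a property at run time, then list the keywords
        ;; that carry it.
        (progn
          (semantic-lex-keyword-put "else" 'summary "else ...")
          (semantic-lex-keywords 'summary))))

   Going through the accessor functions rather than the obarray itself
keeps the code independent of how the table is stored.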


File: grammar-fw.info, Node: put Decl, Next: token Decl, Prev: keyword Decl, Up: Declarations

put Decl
--------

The `%put' statement assigns properties to keywords (*note keyword
Decl::). For instance, the predefined `summary' property assigns a
help string to a keyword. The help string is used by
`semantic-idle-summary-mode' for on-the-fly help.

 - %-Decl: %put KEYWORD-NAME PROPERTY VALUE
 - %-Decl: %put KEYWORD-NAME {PROPERTY1 VALUE1 ...}
 - %-Decl: %put {KEYWORD-NAME1 ...} PROPERTY VALUE
 - %-Decl: %put {KEYWORD-NAME1 ...} {PROPERTY1 VALUE1 ...}
     Give to KEYWORD-NAME, or to several KEYWORD-NAMEs, a single
     PROPERTY with VALUE, or a set of PROPERTYs with respective VALUEs.

    KEYWORD-NAME
          Is a terminal symbol defined as a keyword.

    PROPERTY
          Is a property name, which is a valid Emacs Lisp symbol.

    VALUE
          Is a property value, a valid Emacs Lisp constant expression.

   Keyword properties are stored in the keyword table (*note keyword
Decl::). The following API can be used to handle properties at run
time.

 - Function: semantic-lex-keyword-put name property value
     For keyword with NAME, set its PROPERTY to VALUE.

     *Compatibility*: `semantic-lex-keyword-put' introduced in semantic
     version 2.0 supersedes `semantic-flex-keyword-put' which is now
     obsolete.

 - Function: semantic-lex-keyword-get name property
     For keyword with NAME, return its PROPERTY value.

     *Compatibility*: `semantic-lex-keyword-get' introduced in semantic
     version 2.0 supersedes `semantic-flex-keyword-get' which is now
     obsolete.


File: grammar-fw.info, Node: token Decl, Next: type Decl, Prev: put Decl, Up: Declarations

token Decl
----------

The `%token' statement declares a terminal symbol (a "token") which is
not a keyword.

 - %-Decl: %token [<TYPE-NAME>] TOKEN-NAME MATCH-VALUE
 - %-Decl: %token [<TYPE-NAME>] TOKEN-NAME1 ...
     Respectively declare one token with an optional type and a match
     value, or several tokens with the same optional type and no match
     value.

    TYPE-NAME
          Is an optional symbol, enclosed between `<' and `>', that
          specifies (and implicitly declares) a "type" for this token
          (*note type Decl::). If omitted, the token has no type.

    TOKEN-NAME
          Is the terminal symbol used in grammar rules to represent
          this token.

    MATCH-VALUE
          Is an optional string. Depending on TYPE-NAME properties, it
          will be interpreted as an ordinary string, a regular
          expression, or have a more elaborate meaning. If omitted,
          the match value will be `nil', which means that this token
          will be considered as the default token of its type (*note
          type Decl:: for more information).

     *Please note:* For historical compatibility, the form `%token NAME
     VALUE' actually declares a keyword, and is strictly equivalent to
     `%keyword NAME VALUE'.

     Because the former is ambiguous and could be abandoned in future
     releases of semantic, we strongly recommend using the latter to
     declare keywords!

   In the generated library, token definitions are stored in the table
of declared types (*note type Decl::).


File: grammar-fw.info, Node: type Decl, Next: precedence Decl, Prev: token Decl, Up: Declarations

type Decl
---------

 - %-Decl: %type <TYPE-NAME> [PROPERTY1 VALUE1 ...]
     Explicitly declare a lexical type, and optionally give it
     properties.

    TYPE-NAME
          Is a symbol that identifies the type.

    PROPERTY
          Is a property name, a valid Emacs Lisp symbol.

    VALUE
          Is a property value, a valid Emacs Lisp constant expression.

   Even if `%token', `%keyword', and precedence declarations can
implicitly declare types, an explicit declaration is required for every
type:

   - To assign it properties.

   - To auto-generate a lexical rule that detects tokens of this type.

   For more information, *Note Auto-generation of lexical rules::.

     *Please note:* Because the grammar framework is implemented in
     Emacs Lisp, which is a dynamically typed language, the meaning of
     "type" is notably different between semantic and Bison. In Bison
     grammars, "type" means "data type", and associates an internal
     representation with lexical tokens.
     In semantic grammars, "type" specifies how lexical analysis will
     scan tokens of this type.

   In the generated library, lexical type declarations are defined in
the constant `LIBRARY-NAME--token-table'. The table value is an Emacs
Lisp obarray, available at run time in the parsed buffer, in the buffer
local variable `semantic-lex-types-obarray'. However, you shouldn't
use that variable directly. semantic provides the following API to use
with lexical types at run time.

 - Function: semantic-lex-type-symbol type
     Return symbol with TYPE or `nil' if not found.

 - Function: semantic-lex-type-p type
     Return non-`nil' if a symbol with TYPE name exists.

 - Function: semantic-lex-type-set type value
     Set value of symbol with TYPE name to VALUE and return VALUE.

 - Function: semantic-lex-type-value type &optional noerror
     Return value of symbol with TYPE name. If optional argument
     NOERROR is non-`nil', return `nil' if a symbol with TYPE name does
     not exist. Otherwise signal an error.

 - Function: semantic-lex-type-put type property value &optional add
     For symbol with TYPE name, set its PROPERTY to VALUE. If optional
     argument ADD is non-`nil', create a new symbol with TYPE name if
     it does not already exist. Otherwise signal an error.

 - Function: semantic-lex-type-get type property &optional noerror
     For symbol with TYPE name, return its PROPERTY value. If optional
     argument NOERROR is non-`nil', return `nil' if a symbol with TYPE
     name does not exist. Otherwise signal an error.

 - Function: semantic-lex-map-types fun &optional property
     Call function FUN on every lexical type. If optional PROPERTY is
     non-`nil', call FUN only on every type symbol which has a PROPERTY
     value. FUN receives a type symbol as argument.

 - Function: semantic-lex-types &optional property
     Return a list of lexical type symbols. If optional PROPERTY is
     non-`nil', return only type symbols which have PROPERTY set.
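
   To connect the `%type' statement with the `%token' statements it
governs, here is a minimal sketch of explicit type declarations. The
token names are hypothetical, and the `syntax' pattern is only modeled
on what a punctuation type might use; the `syntax' and `matchdatatype'
properties themselves are covered in *Note Auto-generation of lexical
rules::.

     ;; Explicit declaration with properties.  The regexp and the
     ;; `string' match algorithm are illustrative assumptions.
     %type  <punctuation> syntax "\\(\\s.\\|\\s$\\|\\s'\\)+" matchdatatype string

     ;; Tokens of that type, each with its match value.
     %token <punctuation> PLUS  "+"
     %token <punctuation> MINUS "-"
     %token <punctuation> COMMA ","

     ;; A type declared without properties, so that any predefined
     ;; defaults apply, and a token with no match value.
     %type  <number>
     %token <number> NUMBER

   Because `NUMBER' has no match value, it is treated as the default
token of the `number' type.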


File: grammar-fw.info, Node: precedence Decl, Next: default-prec Decl, Prev: type Decl, Up: Declarations

precedence Decl
---------------

 - %-Decl: %left [<TYPE-NAME>] TOKEN1 ...
 - %-Decl: %right [<TYPE-NAME>] TOKEN1 ...
 - %-Decl: %nonassoc [<TYPE-NAME>] TOKEN1 ...
     Specify the precedence and associativity of grammar terminals.
     See also `%prec' in *Note Grammar Rules::. For more details,
     *note (wisent)Grammar format::, PRECEDENCE.

     *Please note:* This information has meaning only in LALR grammars.
     Defining precedence in a grammar that provides tagging information
     for semantic isn't usually necessary.


File: grammar-fw.info, Node: default-prec Decl, Next: quotemode Decl, Prev: precedence Decl, Up: Declarations

default-prec Decl
-----------------

 - %-Decl: %default-prec
 - %-Decl: %no-default-prec
     See also `%prec' in *Note Grammar Rules::.

     `%no-default-prec' declares that you must specify precedence for
     all rules that participate in precedence conflict resolution.

     The default, `%default-prec', indicates that a rule takes the
     precedence of the last terminal symbol mentioned in it.

     For more details, *note (wisent)Grammar format::, PRECEDENCE.

     *Please note:* This information has meaning only in LALR grammars.
     Defining precedence in a grammar that provides tagging information
     for semantic isn't usually necessary.


File: grammar-fw.info, Node: quotemode Decl, Next: scopestart Decl, Prev: default-prec Decl, Up: Declarations

quotemode Decl
--------------

 - %-Decl: %quotemode SYMBOL
     This is a mechanism used to specify how quoting works in optional
     lambda expressions. *note (bovine)Optional Lambda Expression::.

     *Please note:* This information has meaning only in "bovine"
     grammars.


File: grammar-fw.info, Node: scopestart Decl, Next: start Decl, Prev: quotemode Decl, Up: Declarations

scopestart Decl
---------------

 - %-Decl: %scopestart NONTERMINAL
     The `scopestart' declaration specifies the name of a nonterminal
     that is used for parsing the body of functional code.
     This is used in the local context parser to find locally defined
     variables.

     *Please note:* This information has meaning only in "bovine"
     grammars.


File: grammar-fw.info, Node: start Decl, Next: use-macros Decl, Prev: scopestart Decl, Up: Declarations

start Decl
----------

 - %-Decl: %start NONTERMINAL1 ...
     Declare the nonterminal symbols that the parser can use as its
     goal.

     *Please note:* The exact meaning of start symbols depends on the
     parser. For more information, *note (wisent)Start nonterminals::,
     and *note (bovine)Starting Rules::.


File: grammar-fw.info, Node: use-macros Decl, Prev: start Decl, Up: Declarations

use-macros Decl
---------------

 - %-Decl: %use-macros LIBRARY {MACRO-NAME1 ...}
     Declare grammar macros local to this grammar.

    LIBRARY
          Is a symbol identifying the Elisp library in which to find
          the macro expanders.

    MACRO-NAME
          Is a symbol identifying a macro. By convention it should be
          all upper case.

     For more information on macros, *Note Grammar Macros::.

   When the Lisp code generator encounters a `%use-macros' declaration,
it automatically loads (`require') the specified LIBRARY and associates
with each MACRO-NAME an expander function named LIBRARY-MACRO-NAME.

   For instance:

     %use-macros my-grammar-macros {MY-MACRO}

   tells the Lisp code generator to first `(require
'my-grammar-macros)', then to call the function
`my-grammar-macros-MY-MACRO' to expand calls to the `MY-MACRO' macro.

   Typically, the `my-grammar-macros.el' library should look like this:

     ...

     (defun my-grammar-macros-MY-MACRO (&rest args)
       "Grammar macro that passes ARGS to `format' and returns a string."
       `(format ,@args))

     ...

     ;; Don't forget to provide myself!
     (provide 'my-grammar-macros)


File: grammar-fw.info, Node: Grammar Rules, Next: Epilogue, Prev: Declarations, Up: Grammar File

Grammar Rules
=============

A grammar rule has the following general form:

     RESULT: COMPONENTS ...
           ;

   RESULT is the nonterminal symbol that this rule describes, and
COMPONENTS are the various terminal and nonterminal symbols that are
put together by this rule (*note Terminal and Nonterminal Symbols::).

   For example, this rule:

     exp: exp PLUS exp
        ;

   says that two groupings of type `exp', with a `PLUS' token in
between, can be combined into a larger grouping of type `exp'.

   Multiple rules for the same RESULT are joined with the vertical-bar
character `|' as follows:

     RESULT: RULE1-COMPONENTS...
           | RULE2-COMPONENTS...
           ...
           ;

   If COMPONENTS in a rule is empty, it means that RESULT can match the
empty string. For example, here is how to define a comma-separated
sequence of zero or more `exp' groupings:

     expseq: ;;Empty
           | expseq1
           ;

     expseq1: exp
            | expseq1 COMMA exp
            ;

   It is customary to write a comment `;;Empty' in each rule with no
components.

     *Please note:* In LALR grammars, a `%prec' modifier can be written
     after the COMPONENTS of a rule to specify a terminal symbol whose
     precedence should be used for that rule. The syntax is:

          %prec TOKEN

     It assigns the rule the precedence of TOKEN, overriding the
     precedence that would be deduced for it in the ordinary way. For
     more details, *note (wisent)Grammar format::, PRECEDENCE.

     Here is a typical example:

          ...
          %left NEG     ;; negation--unary minus
          ...
          %%
          ...
          exp:    ...
                | '-' exp %prec NEG
                  (- $2)
             ...
                ;

   Scattered among the components can be ACTIONS that determine the
semantics of the rule. An action is an Emacs Lisp list form, for
example:

     (cons $1 $2)

   To execute a sequence of forms, you can enclose them between braces
like this:

     {
       (message "$2=%s" $2)
       $2
     }

   Usually there is only one action and it follows the rule components.

   The code in an action can refer to the semantic values of the
components matched by the rule with the construct `$N', which stands
for the value of the Nth component.

   Here is a typical example:

     exp:    ...
           | exp PLUS exp
             (+ $1 $3)
        ...
           ;

   This rule constructs an `exp' from two smaller `exp' groupings
connected by a plus-sign token. In the action, `$1' and `$3' refer to
the semantic values of the two component `exp' groupings, which are the
first and third symbols on the right hand side of the rule. The sum
becomes the semantic value of the addition-expression just recognized
by the rule. If there were a useful semantic value associated with the
`PLUS' token, it could be referred to as `$2'.

   Note that the vertical-bar character `|' is really a rule separator,
and actions are attached to a single rule.

   By convention, if you don't specify an action for a rule, the value
of the first symbol in the rule becomes the value of the whole rule:
`(progn $1)'. The default value of an empty rule is `nil'.

   The exact default behavior depends on the parser. For more
information, *note (wisent)Wisent::, and *note (bovine)Bovine::,
manuals.

     *Please note:* In LALR grammars, you can have "mid-rule actions",
     that is, semantic actions put in the middle of a rule. These
     actions are written just like usual end-of-rule actions, but they
     are executed before the parser even recognizes the following
     components.

     The mid-rule action itself counts as one of the components of the
     rule. This makes a difference when there is another action later
     in the same rule (and usually there is another at the end): you
     have to count the actions along with the symbols when working out
     which number N to use in `$N'.

     The mid-rule action can also have a semantic value, and actions
     later in the rule can refer to the value using `$N'. There is no
     way to set the value of the entire rule with a mid-rule action.
     The only way to set the value for the entire rule is with an
     ordinary action at the end of the rule.

     Here is an example taken from the `semantic-grammar.wy' grammar in
     the distribution:

          nonterminal:
              SYMBOL
              ;; Mid-rule action
              (setq semantic-grammar-wy--nterm $1
                    semantic-grammar-wy--rindx 0)
              COLON rules SEMI
              ;; End-of-rule action
              (TAG $1 'nonterminal :children $4)
            ;


File: grammar-fw.info, Node: Epilogue, Prev: Grammar Rules, Up: Grammar File

Epilogue
========

The EPILOGUE section contains arbitrary Emacs Lisp code which is copied
verbatim to the end of the parser file, just as the PROLOGUE is copied
to the beginning. This is the most convenient place to put anything
that you want to have in the parser file. For example, it could be
convenient to put the definition of the lexer in the EPILOGUE,
particularly if it includes lexical rules auto-generated from
declarations in the grammar.

   If the EPILOGUE section is empty, you may omit the `%%' that
separates it from the grammar rules.

     *Please note:* You can put a "footer" comment:

          ;;; my-grammar.by ends here

     at the end of the grammar. It will not be copied to the parser
     file, to avoid confusion with the Emacs Lisp library's own footer.


File: grammar-fw.info, Node: Working with grammars, Next: Adding a new grammar mode, Prev: Grammar File, Up: Top

Working with grammars
*********************

* Menu:

* Editing grammars::
* Auto-generation of lexical rules::
* Grammar Macros::
* BY grammars::
* WY grammars::


File: grammar-fw.info, Node: Editing grammars, Next: Auto-generation of lexical rules, Up: Working with grammars

Editing grammars
================

 - Command: semantic-grammar-create-package &optional force
     Create package Lisp code from grammar in current buffer. Does
     nothing if the Lisp code seems up to date. If optional argument
     FORCE is non-`nil', unconditionally re-generate the Lisp code.

   You can run the command `semantic-grammar-create-package' with `C-c
C-c' (or `C-u C-c C-c' to unconditionally re-generate the Lisp code).
Additionally, when this command is run interactively, all open buffers
of that mode have their setup functions re-run. That way, after
compiling your grammar, all relevant buffers will be actively using
that grammar so you can test what you have done.

 - Command: semantic-grammar-indent
     Indent the current line. Use the Lisp or grammar indenter
     depending on point location.

   You can run the command `semantic-grammar-indent' with `<TAB>'.

 - Command: semantic-grammar-complete
     Attempt to complete the symbol under point. Completion is
     position sensitive. If the cursor is in a match section of a
     rule, then nonterminal symbols are scanned. If the cursor is in a
     Lisp expression then Lisp symbols are completed.

   You can run the command `semantic-grammar-complete' with `<META>
<TAB>'.

 - Command: semantic-grammar-find-macro-expander macro-name library
     Visit the Emacs Lisp library where a grammar macro is implemented.
     MACRO-NAME is a symbol that identifies a grammar macro. LIBRARY
     is the name (sans extension) of the Emacs Lisp library in which to
     start searching for the macro implementation. Included libraries
     are searched too, if necessary. Find a function tag (in current
     tags table) whose name contains MACRO-NAME. Select the buffer
     containing the tag's definition, and move point there.

     *Please note:* Enabling the `global-semanticdb-minor-mode' is
     highly recommended because this command uses the `semanticdb-find'
     feature to automatically search included libraries for a macro
     expander function. For more details, *note
     (semantic-appdev)Semantic Database::.

   You can run the command `semantic-grammar-find-macro-expander' with
`C-c m'.

   The characters `| ; % ( ) :' are "electric punctuations". Each
time you type one, the line is re-indented after the character is
inserted.

 - Command: semantic-grammar-insert-keyword name
     Insert a new `%keyword' declaration with NAME. Assumes it is
     typed in with the correct casing.

   You can run the command `semantic-grammar-insert-keyword' with `C-c
i k'.


File: grammar-fw.info, Node: Auto-generation of lexical rules, Next: Grammar Macros, Prev: Editing grammars, Up: Working with grammars

Auto-generation of lexical rules
================================

Using a `%type' statement, combined with `%keyword' and `%token' ones,
permits the declaration of a lexical type and associates it with
patterns that define how to match lexical tokens of that type.

   The grammar construction process exploits that information to
_automagically_ generate the definition of a lexical rule (aka a
"single analyzer") for each explicitly declared lexical type.

   It is then easy to put predefined and auto-generated lexical rules
together to build ad-hoc lexical analyzers.

   This chapter details the auto-generation process, and how you can
control it to produce the lexical rules needed to scan your language.

* Menu:

* The principle::
* Type properties with a special meaning::
* Predefined well-known types::


File: grammar-fw.info, Node: The principle, Next: Type properties with a special meaning, Up: Auto-generation of lexical rules

The principle
-------------


File: grammar-fw.info, Node: Type properties with a special meaning, Next: Predefined well-known types, Prev: The principle, Up: Auto-generation of lexical rules

Type properties with a special meaning
--------------------------------------

`syntax'
     The value of the `syntax' property should be a "syntactic regexp",
     that is, a regexp that matches buffer data based on the current
     Emacs "syntax table".

     For instance, to grab constituents of `symbol' or `keyword' types,
     the `syntax' property value will probably be:
     `"\\(\\sw\\|\\s_\\)+"'

     For well known types, the `syntax' property has a predefined value
     that should suit standard needs (*note Predefined well-known
     types::).

`matchdatatype'
     The value of the `matchdatatype' property is a symbol that
     specifies which algorithm to use to match the tokens of this type.
     These match algorithms are defined:

    `regexp'

    `string'

    `block'

    `sexp'

    `keyword'


File: grammar-fw.info, Node: Predefined well-known types, Prev: Type properties with a special meaning, Up: Auto-generation of lexical rules

Predefined well-known types
---------------------------

Default values are provided for well known types like <keyword>,
<symbol>, <string>, <number>, <punctuation>, and <block>. Because
those types assume that the correct patterns are provided by %keyword
and %token statements, a simple "%type <type>" declaration should
generally suffice to auto-generate a suitable lexical rule.

`keyword'

`symbol'

`string'

`number'

`punctuation'

`block'


File: grammar-fw.info, Node: Grammar Macros, Next: BY grammars, Prev: Auto-generation of lexical rules, Up: Working with grammars

Grammar Macros
==============

To keep grammars relatively independent of contextual implementations,
you can use "grammar macros" in semantic actions, in place of direct
calls to Emacs Lisp functions.

   The predefined `TAG' macro is a good example:

     (TAG $1 'nonterminal :children $4)

   It offers a common syntax to produce a semantic tag, even if the
underlying implementation is different between LALR and LL grammars.

   A grammar macro is defined by a general "macro name" (for example,
`TAG') associated with possibly several "macro expanders". A macro
expander is an Emacs Lisp function that operates on the unevaluated
expressions for the arguments, and returns a Lisp expression containing
these argument expressions or parts of them.

   Macro names are used like Lisp function names in call expressions.
To help distinguish grammar macro calls from traditional Lisp function
calls, macro names are automatically highlighted in semantic actions.
For more goodies that might help in the development of macros, *Note
Editing grammars::.

   Commonly used macros are provided by the current grammar mode.
*Note Adding a new grammar mode::.

   There are also local macros that only affect the grammar where they
are declared. *Note use-macros Decl::.
File: grammar-fw.info, Node: BY grammars, Next: WY grammars, Prev: Grammar Macros, Up: Working with grammars

BY grammars
===========

BY grammars are in files with the `.by' extension associated with the
`bovine-grammar-mode' major mode.  In that mode, grammars are converted
into an Emacs Lisp form understandable by the original semantic LL
parser.

For more details on how the LL parser works, *note the Bovine Parser
Manual: (bovine)top.


File: grammar-fw.info, Node: WY grammars, Prev: BY grammars, Up: Working with grammars

WY grammars
===========

WY grammars are in files with the `.wy' extension associated with the
`wisent-grammar-mode' major mode.  In that mode, grammars are converted
into an Emacs Lisp form understandable by the semantic Bison-like LALR
parser.

For more details on how the LALR parser works, *note the Wisent Parser
Manual: (wisent)top.


File: grammar-fw.info, Node: Adding a new grammar mode, Next: GNU Free Documentation License, Prev: Working with grammars, Up: Top

Adding a new grammar mode
*************************

* Menu:

* Specialized Implementation::
* Querying grammars::


File: grammar-fw.info, Node: Specialized Implementation, Next: Querying grammars, Up: Adding a new grammar mode

Specialized Implementation
==========================

Each specialized grammar mode is responsible for implementing the
following API.  Implementations are done with _overload methods_.
*Note (semantic-langdev)Semantic Overload Mechanism::.

   * A keyword table builder.  Can use the default implementation
     provided.

      - Function: semantic-grammar-keywordtable-builder
          Return the keyword table value.

          This function can be overridden in semantic using the symbol
          `grammar-keywordtable-builder'.

   * A token table builder.  Can use the default implementation
     provided.

      - Function: semantic-grammar-tokentable-builder
          Return the value of the table of lexical tokens.

          This function can be overridden in semantic using the symbol
          `grammar-tokentable-builder'.
   * A parser table builder.  Must be provided; there is no default
     implementation.

      - Function: semantic-grammar-parsetable-builder
          Return the parser table value.

          This function can be overridden in semantic using the symbol
          `grammar-parsetable-builder'.

   * A parser setup code builder.  Must be provided; there is no
     default implementation.

      - Function: semantic-grammar-setupcode-builder
          Return the parser setup code form.

          This function can be overridden in semantic using the symbol
          `grammar-setupcode-builder'.

   * Common grammar macros.  A set of predefined macros is provided.
     For more information on macros, *Note Grammar Macros::.

      - Variable: semantic-grammar-macros
          List of associations (MACRO-NAME . EXPANDER).


File: grammar-fw.info, Node: Querying grammars, Prev: Specialized Implementation, Up: Adding a new grammar mode

Querying grammars
=================

In order to generate the intermediate Emacs Lisp code needed by your
parser, you have to query the grammar for the code, declarations, and
rules it contains.

To obtain that information in a "context free" manner, grammar files
are parsed by the semantic LALR parser to produce tags.

To query a grammar you can use the powerful API provided by semantic.
*Note Application Development Manual: (semantic-appdev)top.

For declarations, the grammar framework provides a specialized high
level API.

* Menu:

* Grammar Tags::
* Querying Declarations::


File: grammar-fw.info, Node: Grammar Tags, Next: Querying Declarations, Up: Querying grammars

Grammar Tags
------------

This section describes the available grammar tags.
*Note (semantic-appdev)Semantic Tags::.

Each tag description has the form CLASS NAME OPTIONAL-ATTRIBUTES.

 - tag: code "prologue"
     Tag produced from a PROLOGUE section (*note Prologue::).  The
     prologue code is the text between the tag's bounds.

 - tag: code "epilogue"
     Tag produced from the EPILOGUE section (*note Epilogue::).  The
     epilogue code is the text between the tag's bounds.
 - tag: start NONTERMINAL1 REST-NONTERMINALS
     Tag produced from a `%start' declaration (*note start Decl::).

    NONTERMINAL1
          First nonterminal symbol name.

    REST-NONTERMINALS
          List of nonterminal symbol names which follow the first one.
          Associated with the attribute `:rest'.

 - tag: languagemode MODE1 REST-MODES
     Tag produced from a `%languagemode' declaration
     (*note languagemode Decl::).

    MODE1
          First mode symbol name.

    REST-MODES
          List of mode symbol names which follow the first one.
          Associated with the attribute `:rest'.

 - tag: scopestart NONTERMINAL
     Tag produced from a `%scopestart' declaration
     (*note scopestart Decl::).

    NONTERMINAL
          Nonterminal symbol name.

 - tag: quotemode SYMBOL
     Tag produced from a `%quotemode' declaration
     (*note quotemode Decl::).

    SYMBOL
          Symbol name, value of quote mode.

 - tag: assoc ASSOCIATIVITY TYPE-NAME TOKENS
     Tag produced from a `%left', `%right', or `%nonassoc' declaration
     (*note precedence Decl::).

    ASSOCIATIVITY
          "left", "right", or "nonassoc".

    TYPE-NAME
          Symbol name of the token type.  Associated with the attribute
          `:type'.

    TOKENS
          List of terminal symbol names.  Associated with the attribute
          `:value'.

 - tag: assoc "default-prec" FLAG
     Tag produced from a `%default-prec' or `%no-default-prec'
     declaration (*note default-prec Decl::).

    FLAG
          `(list "t")' or `(list "nil")' for `%default-prec' and
          `%no-default-prec' respectively.  Associated with the
          attribute `:value'.

 - tag: keyword KEYWORD-NAME KEYWORD-VALUE
     Tag produced from a `%keyword' declaration (*note keyword Decl::).

    KEYWORD-NAME
          Name of the keyword symbol.

    KEYWORD-VALUE
          Value of the keyword as a string.  Associated with the
          attribute `:value'.

 - tag: token TOKEN-NAME1 TYPE-NAME MATCH-VALUE
 - tag: token TOKEN-NAME1 TYPE-NAME REST-TOKENS
     Tag produced from a `%token' declaration (*note token Decl::).

    TOKEN-NAME1
          First or unique token symbol name.

    TYPE-NAME
          Symbol name of the token type.  Associated with the attribute
          `:type'.

    MATCH-VALUE
          Value of the token as a string.  Associated with the attribute
          `:value'.
    REST-TOKENS
          List of token symbol names which follow the first one in the
          declaration.  Associated with the attribute `:rest'.

 - tag: type TYPE-NAME PROPERTIES
     Tag produced from a `%type' declaration (*note type Decl::).

    TYPE-NAME
          Symbol name of the type.

    PROPERTIES
          List of properties.  Each property is a pair
          `(SYMBOL . VALUE)' where SYMBOL is the property symbol name,
          and VALUE is the property value as a string that contains a
          readable Lisp expression.  Associated with the attribute
          `:value'.

 - tag: put KEYWORD-NAME1 REST-KEYWORDS PROPERTIES
     Tag produced from a `%put' declaration (*note put Decl::).

    KEYWORD-NAME1
          First or unique keyword [or type] symbol name.

    REST-KEYWORDS
          List of keyword [or type] symbol names which follow the first
          one in the declaration.  Associated with the attribute
          `:rest'.

    PROPERTIES
          List of properties.  Each property is a pair
          `(SYMBOL . VALUE)' where SYMBOL is the property symbol name,
          and VALUE is the property value as a string that contains a
          readable Lisp expression.  Associated with the attribute
          `:value'.

 - tag: macro "macro" LIBRARY MACRO-NAMES
     Tag produced from a `%use-macros' declaration
     (*note use-macros Decl::).

    LIBRARY
          Library name.  Associated with the attribute `:type'.

    MACRO-NAMES
          List of macro names.  Associated with the attribute `:value'.

 - tag: nonterminal NONTERMINAL RULE-TAGS
     Tag produced from a nonterminal description
     (*note Grammar Rules::).

    NONTERMINAL
          Nonterminal symbol name.

    RULE-TAGS
          List of rule tags.  Associated with the attribute `:children'.

 - tag: rule RULE-NAME :value COMPONENTS :prec TERMINAL :expr ACTION
     Tag produced from an individual grammar rule
     (*note Grammar Rules::).

    RULE-NAME
          Generated rule name.

    COMPONENTS
          List of the rule's components.  A component is a
          nonterminal/terminal symbol name, a string that represents a
          C-like character, or a list of one string element which
          contains the readable Lisp form of a mid-rule semantic
          action.  Associated with the attribute `:value'.
    TERMINAL
          Terminal symbol name or a string that represents a C-like
          character.  Associated with the attribute `:prec'.

    ACTION
          String which contains the readable Lisp form of the main
          semantic action.  Associated with the attribute `:expr'.


File: grammar-fw.info, Node: Querying Declarations, Prev: Grammar Tags, Up: Querying grammars

Querying Declarations
---------------------

This section describes the high level API to query grammar
declarations.

 - Function: semantic-grammar-prologue
     Return grammar prologue code as a string value.  *Note Prologue::.

 - Function: semantic-grammar-epilogue
     Return grammar epilogue code as a string value.  *Note Epilogue::.

 - Function: semantic-grammar-package
     Return the `%package' value as a string.  If there is no
     `%package' statement in the grammar, return a default package name
     derived from the grammar file name.  *Note package Decl::.

 - Function: semantic-grammar-languagemode
     Return the `%languagemode' value as a list of symbols or `nil'.
     *Note languagemode Decl::.

 - Function: semantic-grammar-start
     Return the `%start' value as a list of symbols or `nil'.
     *Note start Decl::.

 - Function: semantic-grammar-scopestart
     Return the `%scopestart' value as a symbol or `nil'.
     *Note scopestart Decl::.

 - Function: semantic-grammar-quotemode
     Return the `%quotemode' value as a symbol or `nil'.
     *Note quotemode Decl::.

 - Function: semantic-grammar-keywords
     Return the language keywords.  That is an alist of
     (VALUE . TOKEN) where VALUE is the string value of the keyword and
     TOKEN is the terminal symbol identifying the keyword.
     *Note keyword Decl::.

 - Function: semantic-grammar-keyword-properties keywords
     Return the list of KEYWORDS properties.  *Note keyword Decl::,
     *Note put Decl::.

 - Function: semantic-grammar-tokens
     Return defined lexical tokens.  That is an alist (TYPE . DEFS)
     where TYPE is a `%token <type>' symbol and DEFS is an alist of
     (TOKEN . VALUE).  TOKEN is the terminal symbol identifying the
     token and VALUE is the string value of the token or `nil'.
     *Note token Decl::.

 - Function: semantic-grammar-token-properties tokens
     Return properties of declared types.  Types are explicitly
     declared by `%type' statements.  Types found in TOKENS are those
     declared implicitly by `%token' statements.  Properties can be set
     by `%put' and `%type' statements.  Properties set by `%type'
     statements take precedence over those set by `%put' statements.
     *Note token Decl::, *Note type Decl::, *Note put Decl::.

 - Function: semantic-grammar-use-macros
     Return macro definitions from `%use-macros' statements.  Also load
     the specified macro libraries.  *Note use-macros Decl::.
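The alist shapes described above for keywords and tokens can be
walked with ordinary list functions.  The sketch below does not call
the grammar framework at all: the `my-' names and the sample values
are hypothetical, and the data is written by hand to mimic the
documented (VALUE . TOKEN) and (TYPE . DEFS) layouts, with TYPE
assumed to be a symbol as stated in the description.

```elisp
;;; Walking data shaped like the documented query results.
;; The `my-' names and the sample values are hypothetical.

(defvar my-sample-keywords
  '(("if" . IF) ("else" . ELSE))
  "Sample alist of (VALUE . TOKEN), the shape described for keywords.")

(defvar my-sample-tokens
  '((punctuation (PLUS . "+") (MINUS . "-"))
    (symbol (IDENTIFIER)))
  "Sample alist of (TYPE . DEFS).
Each DEFS entry is (TOKEN . VALUE); IDENTIFIER has a nil VALUE.")

(defun my-keyword-token (value keywords)
  "Return the terminal symbol for the string VALUE in KEYWORDS, or nil."
  ;; `assoc' compares keys with `equal', which is needed for strings.
  (cdr (assoc value keywords)))

(defun my-flatten-tokens (tokens)
  "Return a list of (TOKEN TYPE VALUE) triples built from TOKENS."
  (let (result)
    (dolist (entry tokens)
      (let ((type (car entry)))
        (dolist (def (cdr entry))
          (push (list (car def) type (cdr def)) result))))
    (nreverse result)))

(my-keyword-token "else" my-sample-keywords)
;; => ELSE

(my-flatten-tokens my-sample-tokens)
;; => ((PLUS punctuation "+") (MINUS punctuation "-")
;;     (IDENTIFIER symbol nil))
```

A translator would typically run this kind of traversal over the real
query results when building its keyword and token tables.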