This is grammar-fw.info, produced by makeinfo version 4.3 from
grammar-fw.texi.

This manual documents Grammar Development with Semantic.

   Copyright (C) 2004 Eric M. Ludlam
   Copyright (C) 2004 David Ponce
   Copyright (C) 2004 Richard Y. Kim

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.1 or any later version published by the Free Software
     Foundation; with the Invariant Sections being list their titles,
     with the Front-Cover Texts being list, and with the Back-Cover
     Texts being list.  A copy of the license is included in the
     section entitled "GNU Free Documentation License".
   
INFO-DIR-SECTION Emacs
START-INFO-DIR-ENTRY
* Semantic Grammar Framework: (grammar-fw).
END-INFO-DIR-ENTRY


File: grammar-fw.info,  Node: Top,  Next: Overview,  Up: (dir)

Semantic Grammar Framework Manual
*********************************

The Semantic Grammar Framework provides a consistent way to write rule
based grammars for use in Emacs.  This document describes how to use
the grammar writing environment, and how to write in the rule based
language.

* Menu:

* Overview::
* Grammar File::
* Working with grammars::
* Adding a new grammar mode::
* GNU Free Documentation License::
* Index::


File: grammar-fw.info,  Node: Overview,  Next: Grammar File,  Prev: Top,  Up: Top

Overview
********

semantic version 2 introduced a new grammar framework to provide a
clean and consistent view into parser writing.

All grammars are specified in a common rule based language derived from
the input grammar language used by the GNU parser generator Bison.  The
main differences are that:

   - The C-like syntax is replaced by Emacs Lisp syntax.

   - Percent declarations are specific to semantic.

Nevertheless, for those who are familiar with Bison grammar syntax
and have some knowledge of Emacs Lisp, writing semantic grammars won't
be hard.  Moreover, the grammar framework provides Emacs goodies
(indentation, syntax coloring, etc.) to help edit grammars.

Grammars written in the common rule based language must be translated
into Emacs Lisp code so a semantic parser can use it.  The framework
defines a reusable and flexible API that simplifies the implementation
of grammar-to-lisp translators.

The _abstract_ major mode `semantic-grammar-mode' provides the core
functionalities to edit and translate input grammars.

_Concrete_ grammar modes, derived from `semantic-grammar-mode',
implement the common API that translates the input grammar format into
Emacs Lisp code understandable by a particular parser.  A unique file
extension associates input grammars with a _concrete_ grammar mode, that
is, with a particular translator and parser.

semantic provides these _concrete_ grammar modes:

   - `bovine-grammar-mode', *Note BY grammars::.

   - `wisent-grammar-mode', *Note WY grammars::.

The following diagram presents a global view of the framework:


           `Bovine' parser  !  `Wisent' parser  !  Other parser     
                            !                   !                   
   /!    + - - - - - - - - -!- - - - - - - - - -!- - - - - - - - - +
   ||    !              Common Semantic Grammar Mode               !
   ||    ! ,,,,,,,,,,,,,,,, ! ,,,,,,,,,,,,,,,,  ! ,,,,,,,,,,,,,,,, !
Grammar  ! ; `BY' Grammar ; ! ; `WY' Grammar ;  ! ; `??' Grammar ; !
Framework! ;              ; ! ;              ;  ! ;              ; !
   ||    ! ;  Elisp Gen.  ; ! ;  Elisp Gen.  ;  ! ;  Elisp Gen.  ; !
   ||    ! '''''''!'''''''' ! '''''''!''''''''  ! '''''''!'''''''' !
   !/    + - - - -|- - - - -!- - - - | - - - - -!- - - - | - - - - +
                  |         !        |          !        |          
   /!       ,-----V------,  !  ,-----V------,   !  ,-----V------,   
   ||      .   Language   . ! .   Language   .  ! .   Language   .  
   ||      ` Support Lib. ' ! ` Support Lib. '  ! ` Support Lib. '  
   ||       `--!------!--'  !  `--!------!--'   !  `--!------!--'   
   ||          |      |     !     |      |      !     |      |      
   ||      +---V---+  |     ! +---V---+  |      ! +---V---+  |      
   ||      | Lexer |  |     ! | Lexer |  |      ! | Lexer |  |      
Parser     +-------+  |     ! +-------+  |      ! +-------+  |      
Framework             |     !            |      !            |      
   ||      +- -- -- --! --+ ! +- -- -- --V --+  ! +- -- -- --V --+  
   ||      ! No Compiler  ! ! !LALR Compiler !  ! !Optional Comp.!  
   ||      +- -- -- --! --+ ! +- -- -- --! --+  ! +- -- -- --! --+  
   ||                 |     !            |      !            |      
   ||      +----------V---+ ! +----------V---+  ! +----------V---+  
   ||      | LL Parser    | ! | LR Parser    |  ! | Parser Engine|  
   !/      +--------------+ ! +--------------+  ! +--------------+  
                            !                   !


File: grammar-fw.info,  Node: Grammar File,  Next: Working with grammars,  Prev: Overview,  Up: Top

Grammar File
************

A semantic grammar file has four main sections, and looks like this:

       %{
         PROLOGUE
       %}
     
       DECLARATIONS
     
       %%
       GRAMMAR RULES
       %%
     
       EPILOGUE

Comments are like Emacs Lisp ones but start with two consecutive
semicolons (`;;') instead of a single one. The single semicolon (`;')
is the end of rule delimiter.
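
As a minimal sketch of those two conventions (the nonterminal `greeting'
and the tokens `HELLO' and `WORLD' are hypothetical names, assumed to be
declared elsewhere in the grammar):

     ;; A comment starts with two semicolons.
     greeting: HELLO WORLD
             | HELLO          ;; a trailing comment is also allowed
             ;; The single semicolon on the next line ends the rule.
             ;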

* Menu:

* Terminal and Nonterminal Symbols::
* Prologue::
* Declarations::
* Grammar Rules::
* Epilogue::


File: grammar-fw.info,  Node: Terminal and Nonterminal Symbols,  Next: Prologue,  Up: Grammar File

Terminal and Nonterminal Symbols
================================

"Symbols" in semantic grammars represent the grammatical
classifications of the language.  They are valid Emacs Lisp symbols
that contain no special punctuation requiring escaping.

A "terminal symbol" (also known as a "lexical token class") represents
a class of syntactically equivalent tokens.  You use the symbol in
grammar rules to mean that a token in that class is allowed.  Bison's
convention recommends it should be all upper case.

Terminal symbols are part of the lexical tokens returned by the
function `semantic-lex-analyzer'.  They indicate what kind of token has
been read.  For more on lexical analysis, *note
(semantic-langdev)Writing Lexers::.

Terminal symbols are declared using `%keyword' (*note keyword Decl::)
or `%token' (*note token Decl::) statements.

*Please note:*
     Terminal symbols can be "literal characters" which have the same
     syntax used in C for character constants; for example, `'+'' is a
     "literal character token class".  A character token class doesn't
     need to be declared, and can be used in precedence declarations
     like other terminal symbols (*note precedence Decl::).
     For now, semantic lexers don't handle literal character terminals.

A "nonterminal symbol" stands for a class of syntactically equivalent
groupings.  The symbol name is used in writing grammar rules.  Bison's
convention recommends it should be all lower case.

The symbol `error' is a terminal symbol reserved for error recovery;
you shouldn't use it for any other purpose.


File: grammar-fw.info,  Node: Prologue,  Next: Declarations,  Prev: Terminal and Nonterminal Symbols,  Up: Grammar File

Prologue
========

The PROLOGUE section contains definitions of Emacs Lisp variables,
functions, and macros that are used in the actions in the grammar rules.

These are copied to the beginning of the generated parser file so that
they precede the definitions of the grammar rule actions.  You can use
`require' to get the declarations from other libraries.  If you don't
need any Emacs Lisp declarations, you may omit this section.

You may have more than one PROLOGUE section, intermixed with other
DECLARATIONS.  All PROLOGUE sections are copied to the generated parser
file in the order they appear in the grammar.


File: grammar-fw.info,  Node: Declarations,  Next: Grammar Rules,  Prev: Prologue,  Up: Grammar File

Declarations
============

The DECLARATIONS section contains declarations that define terminal
symbols, specify precedence, and so on.  In some simple grammars you
may not need any declarations.

Declarations of terminal symbols define the symbols used in formulating
the grammar and the types associated with categories of tokens (*note
Terminal and Nonterminal Symbols::).

* Menu:

* package Decl::
* languagemode Decl::
* keyword Decl::
* put Decl::
* token Decl::
* type Decl::
* precedence Decl::
* default-prec Decl::
* quotemode Decl::
* scopestart Decl::
* start Decl::
* use-macros Decl::


File: grammar-fw.info,  Node: package Decl,  Next: languagemode Decl,  Up: Declarations

package Decl
------------

 - %-Decl: %package library-name
     Declare the Emacs Lisp library created from the grammar.

     This will generate an Emacs Lisp file named `LIBRARY-NAME.el', and
     provide the LIBRARY-NAME feature at the end of the generated file
     with:

          (provide 'LIBRARY-NAME)

     All variable and function names generated from the DECLARATIONS
     section will be prefixed by `LIBRARY-NAME-', following Emacs
     standard coding conventions.

     If there is no `%package' statement, a default LIBRARY-NAME is
     used, of the form:

          GRAMMAR_FILENAME_SANS_EXTENSION-GRAMMAR_FILENAME_EXTENSION

     For instance, the default library name for the grammar in the
     `foo.wy' file is `foo-wy'.
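
     The default naming rule can be mimicked with a few lines of Emacs
     Lisp.  This is only an illustration of the rule stated above, not
     the framework's own code, and the function name
     `my-default-library-name' is hypothetical:

          (defun my-default-library-name (grammar-file)
            "Return the default library name for GRAMMAR-FILE."
            (let ((base (file-name-nondirectory grammar-file)))
              ;; Join the base name and the extension with a dash.
              (concat (file-name-sans-extension base)
                      "-"
                      (file-name-extension base))))

          (my-default-library-name "foo.wy")   ; => "foo-wy"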


File: grammar-fw.info,  Node: languagemode Decl,  Next: keyword Decl,  Prev: package Decl,  Up: Declarations

languagemode Decl
-----------------

 - %-Decl: %languagemode MODE1...
     Declare in which major modes Emacs edits the sources that semantic
     parses using this grammar.

     For instance, the following declares that the grammar will be used
     to parse files edited in `c-mode' or `c++-mode' by Emacs.

          %languagemode c-mode c++-mode

     Typically, when a parser file is re-generated from a modified
     grammar, language modes are used to refresh local parser settings
     in buffers currently edited in those major modes.


File: grammar-fw.info,  Node: keyword Decl,  Next: put Decl,  Prev: languagemode Decl,  Up: Declarations

keyword Decl
------------

 - %-Decl: %keyword KEYWORD-NAME KEYWORD-VALUE
     Declare a language keyword (a reserved word).

    KEYWORD-NAME
          Is the terminal symbol used in grammar rules to represent this
          reserved word.

    KEYWORD-VALUE
          Is the actual value of the keyword as a string.

     Here is how the `if', `else', `endif' keywords might be declared:

          %keyword IF    "if"
          %keyword ELSE  "else"
          %keyword ENDIF "endif"

     Keywords have the implicit reserved type `keyword' (*note type
     Decl::).

In the generated library, keyword declarations are defined in the
constant `LIBRARY-NAME--keyword-table'.  The keyword table value is an
Emacs Lisp obarray, available at run time in the parsed buffer, in the
buffer local variable `semantic-flex-keywords-obarray'.

However you shouldn't use that variable directly.  semantic provides
the following API to use with language keywords at run time.

 - Function: semantic-lex-keyword-symbol name
     Return the keyword symbol with NAME, or `nil' if not found.

 - Function: semantic-lex-keyword-p name
     Return non-`nil' if a keyword with NAME exists in the keyword
     table.  Return `nil' otherwise.

     *Compatibility*: `semantic-lex-keyword-p', introduced in semantic
     version 2.0, supersedes `semantic-flex-keyword-p', which is now
     obsolete.

 - Function: semantic-lex-keyword-set name value
     Set value of keyword with NAME to VALUE and return VALUE.

 - Function: semantic-lex-keyword-value name
     Return value of keyword with NAME.  Signal an error if a keyword
     with NAME does not exist.

 - Function: semantic-lex-map-keywords fun &optional property
     Call function FUN on every semantic keyword.  If optional PROPERTY
     is non-`nil', call FUN only on every keyword which has a PROPERTY
     value.  FUN receives a semantic keyword as argument.

     *Compatibility*: `semantic-lex-map-keywords', introduced in
     semantic version 2.0, supersedes `semantic-flex-map-keywords',
     which is now obsolete.

 - Function: semantic-lex-keywords &optional property
     Return a list of semantic keywords.  If optional PROPERTY is
     non-`nil', return only keywords which have a PROPERTY set (*note
     put Decl::).

     *Compatibility*: `semantic-lex-keywords', introduced in semantic
     version 2.0, supersedes `semantic-flex-keywords', which is now
     obsolete.
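
As a sketch of how this API might be used, assuming the current buffer
is already set up by semantic with a grammar that declares the `if'
keyword shown above, and that keyword names are passed as strings (the
function name `my-describe-keyword' is hypothetical):

     (defun my-describe-keyword (name)
       "Return a short description of the keyword NAME, or nil."
       ;; Test for existence first, because
       ;; `semantic-lex-keyword-value' signals an error otherwise.
       (when (semantic-lex-keyword-p name)
         (format "%s is a keyword with value %S"
                 name (semantic-lex-keyword-value name))))

     (my-describe-keyword "if")

     ;; List only the keywords that have a `summary' property.
     (semantic-lex-keywords 'summary)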


File: grammar-fw.info,  Node: put Decl,  Next: token Decl,  Prev: keyword Decl,  Up: Declarations

put Decl
--------

The `%put' statement assigns properties to keywords (*note keyword
Decl::).  For instance, the predefined `summary' property assigns a
help string to a keyword.  The help string is used by
`semantic-idle-summary-mode' for on-the-fly help.

 - %-Decl: %put KEYWORD-NAME PROPERTY VALUE
 - %-Decl: %put KEYWORD-NAME {PROPERTY1 VALUE1 ...}
 - %-Decl: %put {KEYWORD-NAME1 ...} PROPERTY VALUE
 - %-Decl: %put {KEYWORD-NAME1 ...} {PROPERTY1 VALUE1 ...}
     Give to KEYWORD-NAME, or to several KEYWORD-NAMEs, a single
     PROPERTY with VALUE, or a set of PROPERTYs with respective VALUEs.

    KEYWORD-NAME
          Is a terminal symbol defined as a keyword.

    PROPERTY
          Is a property name, which is a valid Emacs Lisp symbol.

    VALUE
          Is a property value, a valid Emacs Lisp constant expression.
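
     For instance, assuming the `IF' and `ELSE' keywords are declared
     as shown for `%keyword' (*note keyword Decl::), the first and last
     forms could be written like this (the help strings are only
     illustrative):

          ;; One keyword, one property.
          %put IF summary "if (<condition>) <statement>"
          ;; Several keywords, a set of properties.
          %put {IF ELSE} {summary "Conditional statement keyword"}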

Keyword properties are stored in the keyword table (*note keyword
Decl::).  The following API can be used to handle properties at run
time.

 - Function: semantic-lex-keyword-put name property value
     For keyword with NAME, set its PROPERTY to VALUE.

     *Compatibility*: `semantic-lex-keyword-put', introduced in semantic
     version 2.0, supersedes `semantic-flex-keyword-put', which is now
     obsolete.

 - Function: semantic-lex-keyword-get name property
     For keyword with NAME, return its PROPERTY value.

     *Compatibility*: `semantic-lex-keyword-get', introduced in semantic
     version 2.0, supersedes `semantic-flex-keyword-get', which is now
     obsolete.


File: grammar-fw.info,  Node: token Decl,  Next: type Decl,  Prev: put Decl,  Up: Declarations

token Decl
----------

The `%token' statement declares a terminal symbol (a "token") which is
not a keyword.

 - %-Decl: %token [<TYPE-NAME>] TOKEN-NAME MATCH-VALUE
 - %-Decl: %token [<TYPE-NAME>] TOKEN-NAME1 ...
     Respectively declare one token with an optional type and a match
     value, or several tokens with the same optional type and no match
     value.

    TYPE-NAME
          Is an optional symbol, enclosed between `<' and `>', that
          specifies (and implicitly declares) a "type" for this token
          (*note type Decl::).  If omitted the token has no type.

    TOKEN-NAME
          Is the terminal symbol used in grammar rules to represent
          this token.

    MATCH-VALUE
          Is an optional string.  Depending on TYPE-NAME properties, it
          will be interpreted as an ordinary string, a regular
          expression, or have a more elaborate meaning.  If omitted the
          match value will be `nil', which means that this token will
          be considered as the default token of its type (*note type
          Decl:: for more information).
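
     As an illustration, the following declarations use both forms.
     The token names are hypothetical, and the `punctuation' and
     `number' types are assumed to be suitable for the language being
     described:

          ;; One token with a type and a match value.
          %token <punctuation> PLUS  "+"
          %token <punctuation> MINUS "-"
          ;; No match value: NUMBER is the default token of its type.
          %token <number> NUMBER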

*Please note:*
     For historical compatibility, the form `%token NAME VALUE'
     actually declares a keyword, and is strictly equivalent to
     `%keyword NAME VALUE'.  Because the former is ambiguous and could
     be abandoned in future releases of semantic, we strongly recommend
     using the latter to declare keywords.

In the generated library, token definitions are stored in the table of
declared types (*note type Decl::).


File: grammar-fw.info,  Node: type Decl,  Next: precedence Decl,  Prev: token Decl,  Up: Declarations

type Decl
---------

 - %-Decl: %type <TYPE-NAME> [PROPERTY1 VALUE1 ...]
     Explicitly declare a lexical type, and optionally give it
     properties.

    TYPE-NAME
          Is a symbol that identifies the type.

    PROPERTY
          Is a property name, a valid Emacs Lisp symbol.

    VALUE
          Is a property value, a valid Emacs Lisp constant expression.

     Although `%token', `%keyword', and precedence declarations can
     implicitly declare types, an explicit declaration is required for
     any type when you want:

        - To assign it properties.

        - To auto-generate a lexical rule that detects tokens of this
          type.  For more information, *Note Auto-generation of lexical
          rules::.
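
     For example, this sketch explicitly declares a `symbol' type and
     gives it the two special properties (*note Type properties with a
     special meaning::).  The regexp is the one suggested there;
     whether these values suit a given language is an assumption:

          %type <symbol> syntax "\\(\\sw\\|\\s_\\)+" matchdatatype regexp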

*Please note:*
     Because the grammar framework is implemented in Emacs Lisp, which
     is a dynamically typed language, the meaning of "type" is notably
     different between semantic and Bison.  In Bison grammars, "type"
     means "data type", and associates an internal representation to
     lexical tokens.  In semantic grammars, "type" specifies how
     lexical analysis will scan tokens of this type.

In the generated library, lexical type declarations are defined in the
constant `LIBRARY-NAME--token-table'.  The table value is an Emacs Lisp
obarray, available at run time in the parsed buffer, in the buffer
local variable `semantic-lex-types-obarray'.

However you shouldn't use that variable directly.  semantic provides
the following API to use with lexical types at run time.

 - Function: semantic-lex-type-symbol type
     Return symbol with TYPE or `nil' if not found.

 - Function: semantic-lex-type-p type
     Return non-`nil' if a symbol with TYPE name exists.

 - Function: semantic-lex-type-set type value
     Set value of symbol with TYPE name to VALUE and return VALUE.

 - Function: semantic-lex-type-value type &optional noerror
     Return value of symbol with TYPE name.  If optional argument
     NOERROR is non-`nil' return `nil' if a symbol with TYPE name does
     not exist.  Otherwise signal an error.

 - Function: semantic-lex-type-put type property value &optional add
     For symbol with TYPE name, set its PROPERTY to VALUE.  If optional
     argument ADD is non-`nil', create a new symbol with TYPE name if
     it does not already exist.  Otherwise signal an error.

 - Function: semantic-lex-type-get type property &optional noerror
     For symbol with TYPE name, return its PROPERTY value.  If optional
     argument NOERROR is non-`nil' return `nil' if a symbol with TYPE
     name does not exist.  Otherwise signal an error.

 - Function: semantic-lex-map-types fun &optional property
     Call function FUN on every lexical type.  If optional PROPERTY is
     non-`nil', call FUN only on every type symbol which has a PROPERTY
     value.  FUN receives a type symbol as argument.

 - Function: semantic-lex-types &optional property
     Return a list of lexical type symbols.  If optional PROPERTY is
     non-`nil', return only type symbols which have PROPERTY set.
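
The following sketch shows one way to query lexical types at run time.
It assumes a buffer already set up by semantic and passes type names as
strings; the function name `my-type-syntax' is hypothetical:

     (defun my-type-syntax (type)
       "Return the `syntax' property of lexical TYPE, or nil."
       ;; A non-nil NOERROR argument makes an unknown TYPE return nil
       ;; instead of signaling an error.
       (semantic-lex-type-get type 'syntax t))

     (my-type-syntax "punctuation")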


File: grammar-fw.info,  Node: precedence Decl,  Next: default-prec Decl,  Prev: type Decl,  Up: Declarations

precedence Decl
---------------

 - %-Decl: %left [<TYPE-NAME>] TOKEN1 ...
 - %-Decl: %right [<TYPE-NAME>] TOKEN1 ...
 - %-Decl: %nonassoc [<TYPE-NAME>] TOKEN1 ...
     See also `%prec' in *Note Grammar Rules::.

     Specify the precedence and associativity of grammar terminals.

     For more details, *note (wisent)Grammar format::, PRECEDENCE.

*Please note:*
     This information has meaning only in LALR grammars.  Defining
     precedence in a grammar that provides tagging information for
     semantic isn't usually necessary.
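
As a sketch for an LALR grammar, assuming the hypothetical operator
tokens below are declared elsewhere and that, as in Bison, operators
declared later bind more tightly:

     %left PLUS MINUS       ;; lowest precedence, left associative
     %left MULT DIV
     %right POWER           ;; highest precedence, right associative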


File: grammar-fw.info,  Node: default-prec Decl,  Next: quotemode Decl,  Prev: precedence Decl,  Up: Declarations

default-prec Decl
-----------------

 - %-Decl: %default-prec
 - %-Decl: %no-default-prec
     See also `%prec' in *Note Grammar Rules::.

     `%no-default-prec' declares that you must specify precedence for
     all rules that participate in precedence conflict resolution.  The
     default `%default-prec' indicates to use the precedence of the
     last terminal symbol mentioned in the rule.

     For more details, *note (wisent)Grammar format::, PRECEDENCE.

*Please note:*
     This information has meaning only in LALR grammars.  Defining
     precedence in a grammar that provides tagging information for
     semantic isn't usually necessary.


File: grammar-fw.info,  Node: quotemode Decl,  Next: scopestart Decl,  Prev: default-prec Decl,  Up: Declarations

quotemode Decl
--------------

 - %-Decl: %quotemode SYMBOL
     This is a mechanism used to specify how quoting works in optional
     lambda expressions.  For more information, *note (bovine)Optional
     Lambda Expression::.

*Please note:*
     This information has meaning only in "bovine" grammars.


File: grammar-fw.info,  Node: scopestart Decl,  Next: start Decl,  Prev: quotemode Decl,  Up: Declarations

scopestart Decl
---------------

 - %-Decl: %scopestart NONTERMINAL
     The `scopestart' declaration specifies the name of a nonterminal
     that is used for parsing the body of functional code.  This is used
     in the local context parser to find locally defined variables.

*Please note:*
     This information has meaning only in "bovine" grammars.


File: grammar-fw.info,  Node: start Decl,  Next: use-macros Decl,  Prev: scopestart Decl,  Up: Declarations

start Decl
----------

 - %-Decl: %start NONTERMINAL1 ...
     Declare the nonterminal symbols that the parser can use as its
     goal.

*Please note:*
     The exact meaning of start symbols depends on the parser.  For more
     information, *note (wisent)Start nonterminals::, and *note
     (bovine)Starting Rules::.


File: grammar-fw.info,  Node: use-macros Decl,  Prev: start Decl,  Up: Declarations

use-macros Decl
---------------

 - %-Decl: %use-macros LIBRARY {MACRO-NAME1 ...}
     Declare grammar macros local to this grammar.

    LIBRARY
          Is a symbol identifying the Elisp library in which to find
          the macro expanders.

    MACRO-NAME
          Is a symbol identifying a macro.  By convention it should be
          all upper case.

     For more information on macros, *Note Grammar Macros::.

When the Lisp code generator encounters a `%use-macros' declaration, it
automatically loads (`require') the specified LIBRARY and associates
with each MACRO-NAME an expander function named LIBRARY-MACRO-NAME.

For instance:

     %use-macros my-grammar-macros {MY-MACRO}

This tells the Lisp code generator to first
`(require 'my-grammar-macros)', then to call the function
`my-grammar-macros-MY-MACRO' to expand calls to the `MY-MACRO' macro.

Typically, the `my-grammar-macros.el' library should look like this:

     ...
     
     (defun my-grammar-macros-MY-MACRO (&rest args)
       "Grammar macro that passes ARGS to `format' and returns a string."
       `(format ,@args))
     
     ...
     
     ;; Don't forget to provide myself!
     (provide 'my-grammar-macros)
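
Calling the example expander directly shows the kind of result it
produces.  The sketch below repeats the definition so that it can be
evaluated on its own; the arguments are arbitrary, and how the code
generator actually invokes expanders is not shown here:

     (defun my-grammar-macros-MY-MACRO (&rest args)
       "Grammar macro that passes ARGS to `format'."
       `(format ,@args))

     ;; The expander returns a form; it does not call `format' itself.
     (my-grammar-macros-MY-MACRO "%s-%s" '$1 '$3)
     ;; => (format "%s-%s" $1 $3)

In a rule action, the corresponding macro call would presumably be
written `(MY-MACRO "%s-%s" $1 $3)'.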


File: grammar-fw.info,  Node: Grammar Rules,  Next: Epilogue,  Prev: Declarations,  Up: Grammar File

Grammar Rules
=============

A grammar rule has the following general form:

     RESULT: COMPONENTS ...
           ;

RESULT is the nonterminal symbol that this rule describes, and
COMPONENTS are the various terminal and nonterminal symbols that are
put together by this rule (*note Terminal and Nonterminal Symbols::).

For example, this rule:

     exp: exp PLUS exp
        ;

says that two groupings of type `exp', with a `PLUS' token in between,
can be combined into a larger grouping of type `exp'.

Multiple rules for the same RESULT are joined with the vertical-bar
character `|' as follows:

     RESULT: RULE1-COMPONENTS...
           | RULE2-COMPONENTS...
             ...
           ;

If COMPONENTS in a rule is empty, it means that RESULT can match the
empty string.  For example, here is how to define a comma-separated
sequence of zero or more `exp' groupings:

     expseq: ;;Empty
           | expseq1
           ;
     
     expseq1: exp
            | expseq1 COMMA exp
            ;

It is customary to write a comment `;;Empty' in each rule with no
components.

*Please note:*
     In LALR grammars, a `%prec' modifier can be written after the
     COMPONENTS of a rule to specify a terminal symbol whose precedence
     should be used for that rule.  The syntax is:

          %prec TOKEN

     It assigns the rule the precedence of TOKEN, overriding the
     precedence that would be deduced for it in the ordinary way.  For
     more details, *note (wisent)Grammar format::, PRECEDENCE.

     Here is a typical example:

          ...
          %left NEG     ;; negation--unary minus
          ...
          
          %%
          ...
          
          exp: ...
             | '-' exp %prec NEG
               (- $2)
             ...
             ;

Scattered among the components can be ACTIONS that determine the
semantics of the rule.  An action is an Emacs Lisp list form, for
example:

     (cons $1 $2)

To execute a sequence of forms, you can enclose them between braces
like this:

     {
       (message "$2=%s" $2)
       $2
     }

Usually there is only one action and it follows the rule components.

The code in an action can refer to the semantic values of the
components matched by the rule with the construct `$N', which stands
for the value of the Nth component.

Here is a typical example:

     exp: ...
        | exp PLUS exp
          (+ $1 $3)
        ...
        ;

This rule constructs an `exp' from two smaller `exp' groupings
connected by a plus-sign token.  In the action, `$1' and `$3' refer to
the semantic values of the two component `exp' groupings, which are the
first and third symbols on the right hand side of the rule.  The sum
becomes the semantic value of the addition-expression just recognized
by the rule.  If there were a useful semantic value associated with the
`PLUS' token, it could be referred to as `$2'.

Note that the vertical-bar character `|' is really a rule separator,
and actions are attached to a single rule.

By convention, if you don't specify an action for a rule, the value of
the first symbol in the rule becomes the value of the whole rule:
`(progn $1)'.  The default value of an empty rule is `nil'.

The exact default behavior depends on the parser.  For more
information, *note (wisent)Wisent::, and *note (bovine)Bovine::,
manuals.

*Please note:*
     In LALR grammars, you can have "mid-rule actions", that is
     semantic actions put in the middle of a rule.  These actions are
     written just like usual end-of-rule actions, but they are executed
     before the parser even recognizes the following components.

     The mid-rule action itself counts as one of the components of the
     rule.  This makes a difference when there is another action later
     in the same rule (and usually there is another at the end): you
     have to count the actions along with the symbols when working out
     which number N to use in `$N'.

     The mid-rule action can also have a semantic value, and actions
     later in the rule can refer to the value using `$N'.

     There is no way to set the value of the entire rule with a mid-rule
     action.  The only way to set the value for the entire rule is with
     an ordinary action at the end of the rule.

     Here is an example taken from the `semantic-grammar.wy' grammar in
     the distribution:

          nonterminal:
              SYMBOL
              ;; Mid-rule action
              (setq semantic-grammar-wy--nterm $1
                    semantic-grammar-wy--rindx 0)
              COLON rules SEMI
              ;; End-of-rule action
              (TAG $1 'nonterminal :children $4)
            ;


File: grammar-fw.info,  Node: Epilogue,  Prev: Grammar Rules,  Up: Grammar File

Epilogue
========

The EPILOGUE section contains arbitrary Emacs Lisp code which is copied
verbatim to the end of the parser file, just as the PROLOGUE is copied
to the beginning.  This is the most convenient place to put anything
that you want to have in the parser file.

For example, it could be convenient to put the definition of the lexer
in the EPILOGUE, particularly if it includes lexical rules
auto-generated from declarations in the grammar.
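
A minimal sketch of such an EPILOGUE follows.  The lexer name
`my-grammar-lexer' is hypothetical, and only analyzers predefined by
semantic are listed; auto-generated lexical rules would be added to the
same list under the names the generator gives them:

     %%

     (define-lex my-grammar-lexer
       "Lexical analyzer for a hypothetical language."
       semantic-lex-ignore-whitespace
       semantic-lex-ignore-newline
       semantic-lex-ignore-comments
       ;; Catch-all analyzer, conventionally placed last.
       semantic-lex-default-action)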

If the EPILOGUE section is empty, you may omit the `%%' that separates
it from the grammar rules.

*Please note:*
     You can put a "footer" comment:

          ;;; my-grammar.by ends here

     at the end of the grammar.  It will not be copied to the parser
     file, to avoid confusion with the Emacs Lisp library's own footer.


File: grammar-fw.info,  Node: Working with grammars,  Next: Adding a new grammar mode,  Prev: Grammar File,  Up: Top

Working with grammars
*********************

* Menu:

* Editing grammars::
* Auto-generation of lexical rules::
* Grammar Macros::
* BY grammars::
* WY grammars::


File: grammar-fw.info,  Node: Editing grammars,  Next: Auto-generation of lexical rules,  Up: Working with grammars

Editing grammars
================

 - Command: semantic-grammar-create-package &optional force
     Create package Lisp code from grammar in current buffer.  Does
     nothing if the Lisp code seems up to date.  If optional argument
     FORCE is non-`nil', unconditionally re-generate the Lisp code.

You can run the command `semantic-grammar-create-package' with
`C-c C-c' (or `C-u C-c C-c' to unconditionally re-generate the Lisp
code).

Additionally, when this command is run interactively, all open buffers
of that mode have their setup functions re-run.  That way after
compiling your grammar, all relevant buffers will be actively using
that grammar so you can test what you have done.

 - Command: semantic-grammar-indent
     Indent the current line.  Use the Lisp or grammar indenter
     depending on point location.

You can run the command `semantic-grammar-indent' with `<TAB>'.

 - Command: semantic-grammar-complete
     Attempt to complete the symbol under point.  Completion is
     position sensitive.  If the cursor is in a match section of a
     rule, then nonterminal symbols are scanned.  If the cursor is in
     a Lisp expression then Lisp symbols are completed.

You can run the command `semantic-grammar-complete' with `<META> <TAB>'.

 - Command: semantic-grammar-find-macro-expander macro-name library
     Visit the Emacs Lisp library where a grammar macro is implemented.
     MACRO-NAME is a symbol that identifies a grammar macro.  LIBRARY
     is the name (sans extension) of the Emacs Lisp library in which to
     start searching for the macro implementation.  Included libraries
     are searched if necessary.  Find a function tag (in current tags
     table) whose name contains MACRO-NAME.  Select the buffer
     containing the tag's definition, and move point there.

    *Please note:*
          Enabling the `global-semanticdb-minor-mode' is highly
          recommended because this command uses the `semanticdb-find'
          feature to automatically search for a macro expander function
          in included libraries.  For more details, *note
          (semantic-appdev)Semantic Database::.

You can run the command `semantic-grammar-find-macro-expander' with
`C-c m'.

The characters `| ; % ( ) :' are "electric punctuations".  Each time
you type one, the line is re-indented after the character is inserted.

 - Command: semantic-grammar-insert-keyword name
     Insert a new `%keyword' declaration with NAME.  Assumes it is
     typed in with the correct casing.

You can run the command `semantic-grammar-insert-keyword' with
`C-c i k'.


File: grammar-fw.info,  Node: Auto-generation of lexical rules,  Next: Grammar Macros,  Prev: Editing grammars,  Up: Working with grammars

Auto-generation of lexical rules
================================

Using a `%type' statement, combined with `%keyword' and `%token' ones,
permits the declaration of a lexical type and associates it with
patterns that define how to match lexical tokens of that type.

The grammar construction process exploits that information to
_automagically_ generate the definition of a lexical rule (aka a
"single analyzer") for each explicitly declared lexical type.

It is then easy to put predefined and auto-generated lexical rules
together to build ad-hoc lexical analyzers.
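
A minimal sketch of such an ad-hoc lexer follows.  It assumes a
grammar whose package is named `my-lang-wy' (a hypothetical name) and
which declares the <keyword>, <symbol>, and <punctuation> types.  The
names of the auto-generated analyzers are assumptions too; check the
Emacs Lisp file generated from your grammar for the actual names.

     ;; Assumes semantic's lexer library and the generated
     ;; `my-lang-wy' library are already loaded.
     (define-lex my-lang-lexer
       "Hypothetical lexical analyzer for my-lang buffers."
       ;; Predefined lexical rules.
       semantic-lex-ignore-whitespace
       semantic-lex-ignore-newline
       semantic-lex-ignore-comments
       ;; Auto-generated lexical rules (names assumed).  Keywords
       ;; are tried before plain symbols.
       my-lang-wy--<keyword>-keyword-analyzer
       my-lang-wy--<symbol>-regexp-analyzer
       my-lang-wy--<punctuation>-string-analyzer
       ;; Predefined fallback rule.
       semantic-lex-default-action)

The order matters: rules are tried in sequence, which is why the
keyword rule is listed before the more general symbol rule.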

This chapter details the auto-generation process, and how you can
control it to produce the lexical rules needed to scan your language.

* Menu:

* The principle::
* Type properties with a special meaning::
* Predefined well-known types::


File: grammar-fw.info,  Node: The principle,  Next: Type properties with a special meaning,  Up: Auto-generation of lexical rules

The principle
-------------


File: grammar-fw.info,  Node: Type properties with a special meaning,  Next: Predefined well-known types,  Prev: The principle,  Up: Auto-generation of lexical rules

Type properties with a special meaning
--------------------------------------

`syntax'
     The value of the `syntax' property should be a "syntactic regexp",
     that is a regexp that matches buffer data based on the current
     Emacs "syntax table".

     For instance, to grab constituents of `symbol' or `keyword' types,
     the `syntax' property value will probably be:

     `"\\(\\sw\\|\\s_\\)+"'

     For well-known types, the `syntax' property has a predefined value
     that should suit standard needs (*note Predefined well-known
     types::).

`matchdatatype'
     The value of the `matchdatatype' property is a symbol that
     specifies which algorithm to use to match the tokens of this type.
     These match algorithms are defined:

    `regexp'

    `string'

    `block'

    `sexp'

    `keyword'
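
For example, a `%type' declaration can spell out both properties
explicitly.  This is a sketch under the assumption that `regexp'
matching is what the language needs for its symbols; the `syntax'
value is the one quoted above:

     ;; Match tokens of type <symbol> with the `regexp' algorithm,
     ;; scanning constituents selected by the syntactic regexp.
     %type <symbol> syntax "\\(\\sw\\|\\s_\\)+" matchdatatype regexp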

File: grammar-fw.info,  Node: Predefined well-known types,  Prev: Type properties with a special meaning,  Up: Auto-generation of lexical rules

Predefined well-known types
---------------------------

Default values are provided for well-known types like <keyword>,
<symbol>, <string>, <number>, <punctuation>, and <block>.  For those
types, provided that the correct patterns are supplied by `%keyword'
and `%token' statements, a simple `%type <type>' declaration should
generally suffice to auto-generate a suitable lexical rule.

`keyword'

`symbol'

`string'

`number'

`punctuation'

`block'
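
A sketch of how little is needed for well-known types when relying on
the default property values.  The token names below are illustrative
only:

     ;; Lexical rule for keywords, using the default properties.
     %type <keyword>
     %keyword IF   "if"
     %keyword ELSE "else"

     ;; Lexical rule for punctuation, using the default properties.
     %type <punctuation>
     %token <punctuation> PLUS  "+"
     %token <punctuation> MINUS "-"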

File: grammar-fw.info,  Node: Grammar Macros,  Next: BY grammars,  Prev: Auto-generation of lexical rules,  Up: Working with grammars

Grammar Macros
==============

To keep grammars relatively independent of contextual implementations
you can use "grammar macros" in semantic actions, in place of direct
calls to Emacs Lisp functions.

The predefined `TAG' macro is a good example:

     (TAG $1 'nonterminal :children $4)

It offers a common syntax to produce a semantic tag, even if the
underlying implementation is different between LALR and LL grammars.

A grammar macro is defined by a general "macro name" (for example,
`TAG') associated with possibly several "macro expanders".

A macro expander is an Emacs Lisp function that operates on the
unevaluated expressions for the arguments, and returns a Lisp
expression containing these argument expressions or parts of them.
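
As a sketch, an expander for a `TAG'-like macro could look like the
following.  The function name `my-grammar-TAG' is hypothetical, and
the expansion shown is only one possibility, not necessarily what the
predefined expanders produce:

     (defun my-grammar-TAG (name class &rest attributes)
       "Hypothetical expander for a TAG-like grammar macro.
     NAME, CLASS and ATTRIBUTES are the unevaluated argument
     expressions.  Return a form that calls `semantic-tag'."
       `(semantic-tag ,name ,class ,@attributes))

     ;; The expander only builds a Lisp form; nothing is evaluated:
     ;; (my-grammar-TAG '$1 ''nonterminal :children '$4)
     ;;   => (semantic-tag $1 'nonterminal :children $4)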

Macro names are used like Lisp function names in call expressions.  To
help distinguish grammar macro calls from traditional Lisp function
calls, macro names are automatically highlighted in semantic actions.

For more goodies that might help in the development of macros, *note
Editing grammars::.

Commonly used macros are provided by the current grammar mode.  *Note
Adding a new grammar mode::.

There are also local macros that only affect the grammar where they are
declared.  *Note use-macros Decl::.


File: grammar-fw.info,  Node: BY grammars,  Next: WY grammars,  Prev: Grammar Macros,  Up: Working with grammars

BY grammars
===========

BY grammars live in files with the `.by' extension, which is associated
with the `bovine-grammar-mode' major mode.

In that mode, grammars are converted into an Emacs Lisp form
understandable by the original semantic LL parser.

For more details on how the LL parser works, *note the Bovine Parser
Manual: (bovine)top.


File: grammar-fw.info,  Node: WY grammars,  Prev: BY grammars,  Up: Working with grammars

WY grammars
===========

WY grammars live in files with the `.wy' extension, which is associated
with the `wisent-grammar-mode' major mode.

In that mode, grammars are converted into an Emacs Lisp form
understandable by the semantic Bison-like LALR parser.

For more details on how the LALR parser works, *note the Wisent Parser
Manual: (wisent)top.


File: grammar-fw.info,  Node: Adding a new grammar mode,  Next: GNU Free Documentation License,  Prev: Working with grammars,  Up: Top

Adding a new grammar mode
*************************

* Menu:

* Specialized Implementation::
* Querying grammars::


File: grammar-fw.info,  Node: Specialized Implementation,  Next: Querying grammars,  Up: Adding a new grammar mode

Specialized Implementation
==========================

Each specialized grammar mode is responsible for implementing the
following API.  Implementations are done with _overload methods_.
*Note (semantic-langdev)Semantic Overload Mechanism::.

   * A keyword table builder.  Can use the default implementation
     provided.

      - Function: semantic-grammar-keywordtable-builder
          Return the keyword table value.  This function can be
          overridden in semantic using the symbol
          `grammar-keywordtable-builder'.

   * A token table builder.  Can use the default implementation
     provided.

      - Function: semantic-grammar-tokentable-builder
          Return the value of the table of lexical tokens.  This
          function can be overridden in semantic using the symbol
          `grammar-tokentable-builder'.

   * A parser table builder.  Must be provided; there is no default
     implementation.

      - Function: semantic-grammar-parsetable-builder
          Return the parser table value.  This function can be
          overridden in semantic using the symbol
          `grammar-parsetable-builder'.

   * A parser setup code builder.  Must be provided; there is no
     default implementation.

      - Function: semantic-grammar-setupcode-builder
          Return the parser setup code form.  This function can be
          overridden in semantic using the symbol
          `grammar-setupcode-builder'.

   * Common grammar macros.  A set of predefined macros is provided.

     For more information on macros, *Note Grammar Macros::.

      - Variable: semantic-grammar-macros
          List of associations (MACRO-NAME . EXPANDER).
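
The following sketch shows how a specialized mode might supply these
pieces.  The mode `my-grammar-mode' and the `my-grammar-...' functions
are hypothetical, and installing the overrides with
`semantic-install-function-overrides' is an assumption about one
possible way to register them; see the overload mechanism
documentation for the method appropriate to your semantic version.

     (defun my-grammar-parsetable-builder ()
       "Hypothetical builder: return the parser table value."
       ;; A real builder would compile the grammar rules here.
       nil)

     (defun my-grammar-setupcode-builder ()
       "Hypothetical builder: return the parser setup code form."
       ;; A real builder would return the code that installs the
       ;; parser table, keyword table, and so on.
       '(progn))

     (define-derived-mode my-grammar-mode semantic-grammar-mode "MY"
       "Hypothetical major mode for editing MY grammars."
       ;; Register the two mandatory builders under the override
       ;; symbols named above.
       (semantic-install-function-overrides
        '((grammar-parsetable-builder . my-grammar-parsetable-builder)
          (grammar-setupcode-builder  . my-grammar-setupcode-builder))))

     ;; Associations (MACRO-NAME . EXPANDER).  The expander
     ;; `my-grammar-TAG' is assumed to be defined elsewhere.
     (defvar my-grammar-macros
       '((TAG . my-grammar-TAG))
       "Hypothetical value suitable for `semantic-grammar-macros'.")

The keyword and token table builders are left out on purpose, since
the default implementations can be used.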



File: grammar-fw.info,  Node: Querying grammars,  Prev: Specialized Implementation,  Up: Adding a new grammar mode

Querying grammars
=================

In order to generate the intermediate Emacs Lisp code needed by your
parser, you have to query the grammar for the code, declarations, and
rules it contains.

To obtain that information in a "context free" manner, grammar files
are parsed by the semantic LALR parser to produce tags.

To query a grammar you can use the powerful API provided by semantic.
*Note Application Development Manual: (semantic-appdev)top.

For declarations, the grammar framework provides a specialized high
level API.

* Menu:

* Grammar Tags::
* Querying Declarations::


File: grammar-fw.info,  Node: Grammar Tags,  Next: Querying Declarations,  Up: Querying grammars

Grammar Tags
------------

This section describes the available grammar tags.  *Note
(semantic-appdev)Semantic Tags::.

Each tag description has the form CLASS NAME OPTIONAL-ATTRIBUTES.

 - tag: code "prologue"
     Tag produced from a PROLOGUE section (*note Prologue::).  The
     prologue code is the text between the tag's bounds.

 - tag: code "epilogue"
     Tag produced from the EPILOGUE section (*note Epilogue::).  The
     epilogue code is the text between the tag's bounds.

 - tag: start NONTERMINAL1 REST-NONTERMINALS
     Tag produced from a `%start' declaration (*note start Decl::).

    NONTERMINAL1
          First nonterminal symbol name.

    REST-NONTERMINALS
          List of nonterminal symbol names which follow the first one.
          Associated to the attribute `:rest'.

 - tag: languagemode MODE1 REST-MODES
     Tag produced from a `%languagemode' declaration (*note
     languagemode Decl::).

    MODE1
          First mode symbol name.

    REST-MODES
          List of mode symbol names which follow the first one.
          Associated to the attribute `:rest'.

 - tag: scopestart NONTERMINAL
     Tag produced from a `%scopestart' declaration (*note scopestart
     Decl::).

    NONTERMINAL
          Nonterminal symbol name.

 - tag: quotemode SYMBOL
     Tag produced from a `%quotemode' declaration (*note quotemode
     Decl::).

    SYMBOL
          Symbol name, value of quote mode.

 - tag: assoc ASSOCIATIVITY TYPE-NAME TOKENS
     Tag produced from a `%left', `%right', or `%nonassoc' declaration
     (*note precedence Decl::).

    ASSOCIATIVITY
          "left", "right", or "nonassoc".

    TYPE-NAME
          Symbol name of the token type.  Associated to the attribute
          `:type'.

    TOKENS
          List of terminal symbol names.  Associated to the attribute
          `:value'.

 - tag: assoc "default-prec" FLAG
     Tag produced from a `%default-prec', or `%no-default-prec'
     declaration (*note default-prec Decl::).

    FLAG
          `(list "t")' or `(list "nil")' for respectively
          `%default-prec' and `%no-default-prec'.  Associated to the
          attribute `:value'.

 - tag: keyword KEYWORD-NAME KEYWORD-VALUE
     Tag produced from a `%keyword' declaration (*note keyword Decl::).

    KEYWORD-NAME
          Name of the keyword symbol.

    KEYWORD-VALUE
          Value of the keyword as a string.  Associated to the
          attribute `:value'.

 - tag: token TOKEN-NAME1 TYPE-NAME MATCH-VALUE
 - tag: token TOKEN-NAME1 TYPE-NAME REST-TOKENS
     Tag produced from a `%token' declaration (*note token Decl::).

    TOKEN-NAME1
          First or unique token symbol name.

    TYPE-NAME
          Symbol name of the token type.  Associated to the attribute
          `:type'.

    MATCH-VALUE
          Value of the token as a string.  Associated to the attribute
          `:value'.

    REST-TOKENS
          List of token symbol names which follow the first one in the
          declaration.  Associated to the attribute `:rest'.

 - tag: type TYPE-NAME PROPERTIES
     Tag produced from a `%type' declaration (*note type Decl::).

    TYPE-NAME
          Symbol name of the type.

    PROPERTIES
          List of properties.  Each property is a pair
          `(SYMBOL . VALUE)' where SYMBOL is the property symbol name,
          and VALUE the property value as string which contains a
          readable Lisp expression.  Associated to the attribute
          `:value'.

 - tag: put KEYWORD-NAME1 REST-KEYWORDS PROPERTIES
     Tag produced from a `%put' declaration (*note put Decl::).

    KEYWORD-NAME1
          First or unique keyword [or type] symbol name.

    REST-KEYWORDS
          List of keyword [or type] symbol names which follow the first
          one in the declaration.  Associated to the attribute `:rest'.

    PROPERTIES
          List of properties.  Each property is a pair
          `(SYMBOL . VALUE)' where SYMBOL is the property symbol name,
          and VALUE the property value as string which contains a
          readable Lisp expression.  Associated to the attribute
          `:value'.

 - tag: macro "macro" LIBRARY MACRO-NAMES
     Tag produced from a `%use-macros' declaration (*note use-macros
     Decl::).

    LIBRARY
          Library name.  Associated to the attribute `:type'.

    MACRO-NAMES
          List of macro names.  Associated to the attribute `:value'.

 - tag: nonterminal NONTERMINAL RULE-TAGS
     Tag produced from a nonterminal description (*note Grammar
     Rules::).

    NONTERMINAL
          Nonterminal symbol name.

    RULE-TAGS
          List of rule tags.  Associated to the attribute `:children'.

 - tag: rule RULE-NAME :value COMPONENTS :prec TERMINAL :expr ACTION
     Tag produced from an individual grammar rule (*note Grammar
     Rules::).

    RULE-NAME
          Generated rule name.

    COMPONENTS
          List of the rule's components.  A component is a
          nonterminal/terminal symbol name, a string that represents a
          C-like character, or a list of one string element which
          contains the readable Lisp form of a mid-rule semantic
          action.  Associated to the attribute `:value'.

    TERMINAL
          Terminal symbol name or a string that represents a C-like
          character.  Associated to the attribute `:prec'.

    ACTION
          String which contains the readable Lisp form of the main
          semantic action.  Associated to the attribute `:expr'.
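
For instance, the `keyword' tags of a grammar buffer can be listed
with the generic tag API.  This sketch assumes it is evaluated in a
buffer that is in a grammar mode and has been parsed; the function
name is hypothetical:

     (defun my-grammar-list-keywords ()
       "Return a list of (KEYWORD-NAME . KEYWORD-VALUE) pairs.
     Hypothetical helper; call it in a grammar buffer."
       (mapcar
        (lambda (tag)
          ;; The tag name is KEYWORD-NAME; the `:value' attribute
          ;; holds KEYWORD-VALUE, as described above.
          (cons (semantic-tag-name tag)
                (semantic-tag-get-attribute tag :value)))
        (semantic-find-tags-by-class 'keyword (current-buffer))))

The same pattern applies to the other tag classes: select tags by
class, then read the attribute documented for that class.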


File: grammar-fw.info,  Node: Querying Declarations,  Prev: Grammar Tags,  Up: Querying grammars

Querying Declarations
---------------------

This section describes the high level API to query grammar declarations.

 - Function: semantic-grammar-prologue
     Return grammar prologue code as a string value.
     *Note Prologue::.

 - Function: semantic-grammar-epilogue
     Return grammar epilogue code as a string value.
     *Note Epilogue::.

 - Function: semantic-grammar-package
     Return the `%package' value as a string.  If there is no
     `%package' statement in the grammar, return a default package name
     derived from the grammar file name.
     *Note package Decl::.

 - Function: semantic-grammar-languagemode
     Return the `%languagemode' value as a list of symbols or `nil'.
     *Note languagemode Decl::.

 - Function: semantic-grammar-start
     Return the `%start' value as a list of symbols or `nil'.
     *Note start Decl::.

 - Function: semantic-grammar-scopestart
     Return the `%scopestart' value as a symbol or `nil'.
     *Note scopestart Decl::.

 - Function: semantic-grammar-quotemode
     Return the `%quotemode' value as a symbol or `nil'.
     *Note quotemode Decl::.

 - Function: semantic-grammar-keywords
     Return the language keywords.  That is an alist of (VALUE . TOKEN)
     where VALUE is the string value of the keyword and TOKEN is the
     terminal symbol identifying the keyword.
     *Note keyword Decl::.

 - Function: semantic-grammar-keyword-properties keywords
     Return the list of properties of KEYWORDS.
     *Note keyword Decl::, *Note put Decl::.

 - Function: semantic-grammar-tokens
     Return defined lexical tokens.  That is an alist of (TYPE . DEFS)
     where TYPE is a `%token <type>' symbol and DEFS is an alist of
     (TOKEN . VALUE).  TOKEN is the terminal symbol identifying the
     token and VALUE is the string value of the token or `nil'.
     *Note token Decl::.

 - Function: semantic-grammar-token-properties tokens
     Return properties of declared types.  Types are explicitly
     declared by `%type' statements.  Types found in TOKENS are those
     declared implicitly by `%token' statements.  Properties can be set
     by `%put' and `%type' statements.  Properties set by `%type'
     statements take precedence over those set by `%put' statements.
     *Note token Decl::, *Note type Decl::, *Note put Decl::.

 - Function: semantic-grammar-use-macros
     Return macro definitions from `%use-macros' statements.  Also load
     the specified macro libraries.
     *Note use-macros Decl::.
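
A short sketch of the high level API in use.  It assumes point is in
a grammar buffer; the command name is hypothetical:

     (defun my-grammar-summary ()
       "Show a summary of the declarations in the current grammar."
       (interactive)
       (let ((keywords (semantic-grammar-keywords))
             (tokens   (semantic-grammar-tokens)))
         ;; KEYWORDS is an alist of (VALUE . TOKEN); TOKENS is an
         ;; alist of (TYPE . DEFS), as described above.
         (message "%s: %d keyword(s), %d token type(s), start %S"
                  (semantic-grammar-package)
                  (length keywords)
                  (length tokens)
                  (semantic-grammar-start))))

Going by the descriptions above, a grammar that declares
`%keyword IF "if"' would be expected to yield a keyword alist
containing the pair `("if" . IF)'.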