\input texinfo @c -*-texinfo-*-
@c %**start of header
@setfilename ../../info/bovine.info
@set TITLE Bovine parser development
@set AUTHOR Eric M. Ludlam, David Ponce, and Richard Y. Kim
@settitle @value{TITLE}
@include docstyle.texi
@c *************************************************************************
@c @ Header
@c *************************************************************************
@c Merge all indexes into a single index for now.
@c We can always separate them later into two or more as needed.
@syncodeindex vr cp
@syncodeindex fn cp
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp
@c @footnotestyle separate
@c @paragraphindent 2
@c @@smallbook
@c %**end of header
@copying
Copyright @copyright{} 1999--2004, 2012--2015 Free Software Foundation, Inc.
@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License''.
(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying
@dircategory Emacs misc features
@direntry
* Bovine: (bovine). Semantic bovine parser development.
@end direntry
@iftex
@finalout
@end iftex
@c @setchapternewpage odd
@c @setchapternewpage off
@titlepage
@sp 10
@title @value{TITLE}
@author by @value{AUTHOR}
@page
@vskip 0pt plus 1 fill
@insertcopying
@end titlepage
@page
@macro semantic{}
@i{Semantic}
@end macro
@c *************************************************************************
@c @ Document
@c *************************************************************************
@contents
@node top
@top @value{TITLE}
The @dfn{bovine} parser is the original @semantic{} parser, and is an
implementation of an @acronym{LL} parser. It is good for simple
languages, and has many conveniences that make grammar writing easy.
These conveniences make it less powerful than a Bison-like @acronym{LALR}
parser. For more information, @inforef{Top, The Wisent Parser Manual,
wisent}.

Bovine @acronym{LL} grammars are stored in files with a @file{.by}
extension. When compiled, the contents are converted into a file of
the form @file{NAME-by.el}. This, in turn, is byte-compiled.
@inforef{top, Grammar Framework Manual, grammar-fw}.
@ifnottex
@insertcopying
@end ifnottex
@menu
* Starting Rules:: The starting rules for the grammar.
* Bovine Grammar Rules:: Rules used to parse a language.
* Optional Lambda Expression:: Actions to take when a rule is matched.
* Bovine Examples:: Simple Samples.
* GNU Free Documentation License:: The license for this documentation.
@c * Index::
@end menu
@node Starting Rules
@chapter Starting Rules
In Bison, one and only one nonterminal is designated as the ``start''
symbol. In @semantic{}, one or more nonterminals can be designated as
the ``start'' symbol. They are declared following the @code{%start}
keyword, separated by spaces. @inforef{start Decl, ,grammar-fw}.

If no @code{%start} keyword is used in a grammar, then the very first
nonterminal is used. Internally the first start nonterminal is targeted by the
reserved symbol @code{bovine-toplevel}, so it can be found by the
parser harness.
To find locally defined variables, the local context handler needs to
parse the body of functional code. The @code{scopestart} declaration
specifies the name of a nonterminal used as the goal to parse a local
context, @inforef{scopestart Decl, ,grammar-fw}. Internally the
scopestart nonterminal is targeted by the reserved symbol
@code{bovine-inner-scope}, so it can be found by the parser harness.
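
As a hedged illustration of the two declarations described above, a
grammar might open as follows. The nonterminal names
@code{declaration} and @code{codeblock} are hypothetical, and would
have to be defined by rules elsewhere in the same grammar:

@example
;; Goal used when parsing a whole buffer.
%start       declaration
;; Goal used when parsing the body of functional code.
%scopestart  codeblock
@end example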
@node Bovine Grammar Rules
@chapter Bovine Grammar Rules
The rules are what allow the compiler to create tags from a language
file. Once the setup is done in the prologue, you can start writing
rules. @inforef{Grammar Rules, ,grammar-fw}.
@example
@var{result} : @var{components1} @var{optional-semantic-action1}
       | @var{components2} @var{optional-semantic-action2}
       ;
@end example
@var{result} is a nonterminal, that is, a symbol synthesized in your grammar.
@var{components} is a list of elements that are to be matched if @var{result}
is to be made. @var{optional-semantic-action} is an optional sequence
of simplified Emacs Lisp expressions for concocting the parse tree.

In Bison, each time an element of @var{components} is found, it is
@dfn{shifted} onto the parser stack (the stack of matched elements).
When all of the elements of @var{components} have been matched, they are
@dfn{reduced} to @var{result}. @xref{Algorithm,,, bison, The GNU Bison Manual}.
A particular @var{result} written into your grammar becomes
the parser's goal. It is designated by a @code{%start} statement
(@pxref{Starting Rules}). The value returned by the associated
@var{optional-semantic-action} is the parser's result. It should be
a tree of @semantic{} @dfn{tags}, @inforef{Semantic Tags, ,
semantic-appdev}.
@var{components} is made up of symbols. A symbol such as @code{FOO}
means that a syntactic token of class @code{FOO} must be matched.
@menu
* How Lexical Tokens Match::
* Grammar-to-Lisp Details::
* Order of components in rules::
@end menu
@node How Lexical Tokens Match
@section How Lexical Tokens Match
A lexical rule must be used to define how to match a lexical token.
For instance:
@example
%keyword FOO "foo"
@end example
means that @code{FOO} is a reserved language keyword, matched as such
by looking it up in a keyword table, @inforef{keyword Decl,
,grammar-fw}. This is because @code{"foo"} will be converted to
@code{FOO} in the lexical analysis stage. Thus the symbol @code{FOO}
won't be available any other way.
If we specify our token in this way:
@example
%token <symbol> FOO "foo"
@end example
then @code{FOO} will match the string @code{"foo"} explicitly, but it
won't do so at the lexical level, allowing use of the text
@code{"foo"} in other forms of regular expressions.
In that case, @code{FOO} is a @code{symbol}-type token. To match, a
@code{symbol} must first be encountered, and then it must
@code{string-match "foo"}.
@table @strong
@item Caution:
Be especially careful to remember that @code{"foo"}, and more
generally the match-value string of any @code{%token}, is a regular expression!
@end table
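
The consequence can be checked directly with @code{string-match}, the
function named above. The following sketch is plain Emacs Lisp,
independent of any grammar:

@example
(string-match "foo" "foobar")          @result{} 0
(string-match "foo" "seafood")         @result{} 3
(string-match "\\`foo\\'" "foobar")    @result{} nil
(string-match "\\`foo\\'" "foo")       @result{} 0
@end example

So, assuming the match is made as described, a token declared with the
match-value @code{"foo"} would also accept the symbols @samp{foobar}
and @samp{seafood}. Anchoring the expression, as in the last two
forms, is one way to require that the whole symbol be @samp{foo}.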
Non-symbol tokens are also allowed. For example:
@example
%token <punctuation> PERIOD "[.]"
filename : symbol PERIOD symbol
;
@end example
@code{PERIOD} is a @code{punctuation}-type token that will explicitly
match one period when used in the above rule.
@table @strong
@item Please Note:
@code{symbol}, @code{punctuation}, etc., are predefined lexical token
types, based on the @dfn{syntax class}-character associations
currently in effect.
@end table
@node Grammar-to-Lisp Details
@section Grammar-to-Lisp Details
For the bovinator, lexical token matching patterns are @emph{inlined}.
When the grammar-to-Lisp converter encounters a lexical token
declaration of the form:
@example
%token <@var{type}> @var{token-name} @var{match-value}
@end example
it substitutes every occurrence of @var{token-name} in rules with its
expanded form:
@example
@var{type} @var{match-value}
@end example
For example:
@example
%token <symbol> MOOSE "moose"
find_a_moose: MOOSE
;
@end example
will generate this pseudo-equivalent rule:
@example
find_a_moose: symbol "moose" ;; invalid syntax!
;
@end example
Thus, from the bovinator's point of view, the @var{components} part of a
rule is made up of symbols and strings. A string in the mix means
that the previous symbol must have the additional constraint of
exactly matching it, as described in @ref{How Lexical Tokens Match}.
@table @strong
@item Please Note:
For the bovinator, this task was mixed into the language definition to
simplify implementation, though Bison's technique is more efficient.
@end table
@node Order of components in rules
@section Order of components in rules
If a rule has several alternative lists of components, their order is
important. For example,
@example
headerfile : symbol PERIOD symbol
| symbol
;
@end example
would match @samp{foo.h} or the @acronym{C++} header @samp{foo}.
The bovine parser will first attempt to match the long form, and then
the short form. If they were in reverse order, then the long form
would never be tested.
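
For contrast, here is a sketch of the reversed ordering that the
paragraph above warns about. It reuses the same @code{PERIOD} token
and is shown only to illustrate the pitfall:

@example
headerfile : symbol                  ;; tried first; matches the leading symbol
           | symbol PERIOD symbol    ;; never tested
           ;
@end example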
@c @xref{Default syntactic tokens}.
@node Optional Lambda Expression
@chapter Optional Lambda Expressions
The @acronym{OLE} (@dfn{Optional Lambda Expression}) is converted into
a bovine lambda. This lambda has special short-cuts to simplify
reading the semantic action definition. An @acronym{OLE} like this:
@example
( $1 )
@end example
results in a lambda whose return value consists entirely of the string
or object found by matching the first element of the match (element
zero, counting from zero).
An @acronym{OLE} like this:
@example
( ,(foo $1) )
@end example
executes @code{foo} on the first argument, and then splices its return
value into the return list, whereas:
@example
( (foo $1) )
@end example
executes @code{foo}, and places its return value in the return list as a
single element.
Here are other things that can appear inline:
@table @code
@item $1
The first object matched.
@item ,$1
The first object spliced into the list (assuming it is a list from a
nonterminal).
@item '$1
The first object matched, placed in a list. I.e., @code{( $1 )}.
@item foo
The symbol @code{foo} (exactly as displayed).
@item (foo)
A function call to @code{foo} whose result is placed in the return list.
@item ,(foo)
A function call to @code{foo} whose result is spliced into the return list.
@item '(foo)
A function call to @code{foo} whose result is placed in the return list,
wrapped in a list.
@item (EXPAND @var{$1} @var{nonterminal} @var{depth})
A list starting with @code{EXPAND} performs a recursive parse on the
token passed to it (represented by @samp{$1} above). The
@dfn{semantic list} is a common token to expand, as there are often
interesting things in the list. The @var{nonterminal} is a symbol in
your table which the bovinator will start with when parsing.
@var{nonterminal}'s definition is the same as any other nonterminal.
@var{depth} should be at least @samp{1} when descending into a
semantic list.
@item (EXPANDFULL @var{$1} @var{nonterminal} @var{depth})
Is like @code{EXPAND}, except that the parser will iterate over
@var{nonterminal} until there are no more matches, the same way the
parser iterates over the starting rule (@pxref{Starting Rules}). This
lets you have much simpler rules in this specific case, and also lets
you have positional information in the returned tokens, and error
skipping.
@item (ASSOC @var{symbol1} @var{value1} @var{symbol2} @var{value2} @dots{})
This is used for creating an association list. Each @var{symbol} is
included in the list if the associated @var{value} is non-@code{nil}.
While the items are all listed explicitly, the created structure is an
association list of the form:
@example
((@var{symbol1} . @var{value1}) (@var{symbol2} . @var{value2}) @dots{})
@end example
@item (TAG @var{name} @var{class} [@var{attributes}])
This creates one tag in the current buffer.
@table @var
@item name
Is a string that represents the tag in the language.
@item class
Is the kind of tag being created, such as @code{function}, or
@code{variable}, though any symbol will work.
@item attributes
Is an optional set of labeled values such as @code{:constant-flag t :parent
"parenttype"}.
@end table
@item (TAG-VARIABLE @var{name} @var{type} @var{default-value} [@var{attributes}])
@itemx (TAG-FUNCTION @var{name} @var{type} @var{arg-list} [@var{attributes}])
@itemx (TAG-TYPE @var{name} @var{type} @var{members} @var{parents} [@var{attributes}])
@itemx (TAG-INCLUDE @var{name} @var{system-flag} [@var{attributes}])
@itemx (TAG-PACKAGE @var{name} @var{detail} [@var{attributes}])
@itemx (TAG-CODE @var{name} @var{detail} [@var{attributes}])
Create a tag with @var{name} of respectively the class
@code{variable}, @code{function}, @code{type}, @code{include},
@code{package}, and @code{code}.
@inforef{Creating Tags, , semantic-appdev}, for the Lisp
functions these translate into.
@end table
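
As a sketch of how these forms might appear in a rule, the following
hypothetical grammar fragment turns an assignment such as
@samp{x = y} into a variable tag, using the form names as listed in
the table above. The rule name @code{variable-assignment} is invented
for illustration, and passing @code{nil} as @var{type} is likewise an
assumption made to keep the example short:

@example
%token <punctuation> EQUAL "="
@dots{}
variable-assignment : symbol EQUAL symbol
                      (TAG-VARIABLE $1 nil $3)
                    ;
@end example

Here @code{$1} supplies @var{name} and @code{$3} supplies
@var{default-value}.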
If the declaration @code{%quotemode backquote} is specified, then use
@code{,@@} to splice a list in, and @code{,} to evaluate the expression.
This lets you send @code{$1} as a symbol into a list instead of having
it expanded inline.
@node Bovine Examples
@chapter Examples
The rule:
@example
any-symbol: symbol
;
@end example
is equivalent to
@example
any-symbol: symbol
( $1 )
;
@end example
which, if it matched the string @samp{"A"}, would return
@example
( "A" )
@end example
If this rule were used like this:
@example
%token <punctuation> EQUAL "="
@dots{}
assign: any-symbol EQUAL any-symbol
( $1 $3 )
;
@end example
it would match @samp{"A=B"}, and return
@example
( ("A") ("B") )
@end example
The letters @samp{A} and @samp{B} come back in lists because
@samp{any-symbol} is a nonterminal, not an actual lexical element.
To get a better result with nonterminals, use @samp{,} to splice lists
in, like this:
@example
%token <punctuation> EQUAL "="
@dots{}
assign: any-symbol EQUAL any-symbol
( ,$1 ,$3 )
;
@end example
which would return
@example
( "A" "B" )
@end example
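
Following the same rules, the two styles can presumably be mixed. As
a sketch, not a tested grammar:

@example
assign: any-symbol EQUAL any-symbol
        ( $1 ,$3 )
      ;
@end example

would be expected to return, for @samp{"A=B"},

@example
( ("A") "B" )
@end example

since only the third element is spliced.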
@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi
@c There is nothing to index at the moment.
@ignore
@node Index
@unnumbered Index
@printindex cp
@end ignore
@iftex
@contents
@summarycontents
@end iftex
@bye
@c Following comments are for the benefit of ispell.
@c LocalWords: bovinator inlined