12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697989910010110210310410510610710810911011111211311411511611711811912012112212312412512612712812913013113213313413513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218318418518618718818919019119219319419519619719819920020120220320420520620720820921021121221321421521621721821922022122222322422522622722822923023123223323423523623723823924024124224324424524624724824925025125225325425525625725825926026126226326426526626726826927027127227327427527627727827928028128228328428528628728828929029129229329429529629729829930030130230330430530630730830931031131231331431531631731831932032132232332432532632732832933033133233333433533633733833934034134234334434534634734834935035135235335435535635735835936036136236336436536636736836937037137237337437537637737837938038138238338438538638738838939039139239339439539639739839940040140240340440540640740840941041141241341441541641741841942042142242342442542642742842943043143243343443543643743843944044144244344444544644744844945045145245345445545645745845946046146246346446546646746846947047147247347447547647747847948048148248348448548648748848949049149249349449549649749849950050150250350450550650750850951051151251351451551651751851952052152252352452552652752852953053153253353453553653753853954054154254354454554654754854955055155255355455555655755855956056156256356456556656756856957057157257357457557657757857958058158258358458558658758858959059159259359459559659759859960060160260360460560660760860961061161261361461561661761861962062162262362462562662762862963063163263363463563663763863964064164264364464564664764864965065165265365465565665765865966066166266366466566666766866967067167267367467567667767867968068168268368468568668768868969069169269369469569669769869970070170270370470570670770870971071171271371471571671771871972072172272372472572672772872973073173273373473573673773873974074174274374474574674774874975075175275375475575675775875976076176276376476576676776876977077177277377477577677777877978078178278378478578678778878979079179279379479579679779879980080180280380480580680780880981081181281381481581681781881982082182282382482582682782882983083183283383483583683783883984084184284384484584684784884985085185285385485585685785885986086186286386486586686786886987087187287387487587687787887988088188288388488588688788888989089189289389489589689789889990090190290390490590690790890991091191291391491591691791891992092192292392492592692792892993093193293393493593693793893994094194294394494594694794894995095195295395495595695795895996096196296396496596696796896997097197297397497597697797897998098198298398498598698798898999099199299399499599699799899910001001100210031004100510061007100810091010101110121013101410151016101710181019102010211022102310241025102610271028102910301031103210331034103510361037103810391040104110421043104410451046104710481049105010511052105310541055105610571058105910601061106210631064106510661067106810691070107110721073107410751076107710781079108010811082108310841085108610871088108910901091109210931094109510961097109810991100110111021103110411051106110711081109111011111112111311141115111611171118111911201121112211231124112511261127112811291130113111321133113411351136113711381139114011411142114311441145114611471148114911501151115211531154115511561157115811591160116111621163116411651166 |
- This is Info file bison.info, produced by Makeinfo-1.55 from the input
- file ./bison.texinfo.
- This file documents the Bison parser generator.
- Copyright (C) 1988, 89, 90, 91, 92, 93, 1995 Free Software
- Foundation, Inc.
- Permission is granted to make and distribute verbatim copies of this
- manual provided the copyright notice and this permission notice are
- preserved on all copies.
- Permission is granted to copy and distribute modified versions of
- this manual under the conditions for verbatim copying, provided also
- that the sections entitled "GNU General Public License" and "Conditions
- for Using Bison" are included exactly as in the original, and provided
- that the entire resulting derived work is distributed under the terms
- of a permission notice identical to this one.
- Permission is granted to copy and distribute translations of this
- manual into another language, under the above conditions for modified
- versions, except that the sections entitled "GNU General Public
- License", "Conditions for Using Bison" and this permission notice may be
- included in translations approved by the Free Software Foundation
- instead of in the original English.
- File: bison.info, Node: Reduce/Reduce, Next: Mystery Conflicts, Prev: Parser States, Up: Algorithm
- Reduce/Reduce Conflicts
- =======================
- A reduce/reduce conflict occurs if there are two or more rules that
- apply to the same sequence of input. This usually indicates a serious
- error in the grammar.
- For example, here is an erroneous attempt to define a sequence of
- zero or more `word' groupings.
- sequence: /* empty */
- { printf ("empty sequence\n"); }
- | maybeword
- | sequence word
- { printf ("added word %s\n", $2); }
- ;
-
- maybeword: /* empty */
- { printf ("empty maybeword\n"); }
- | word
- { printf ("single word %s\n", $1); }
- ;
- The error is an ambiguity: there is more than one way to parse a single
- `word' into a `sequence'. It could be reduced to a `maybeword' and
- then into a `sequence' via the second rule. Alternatively,
- nothing-at-all could be reduced into a `sequence' via the first rule,
- and this could be combined with the `word' using the third rule for
- `sequence'.
- There is also more than one way to reduce nothing-at-all into a
- `sequence'. This can be done directly via the first rule, or
- indirectly via `maybeword' and then the second rule.
- You might think that this is a distinction without a difference,
- because it does not change whether any particular input is valid or
- not. But it does affect which actions are run. One parsing order runs
- the second rule's action; the other runs the first rule's action and
- the third rule's action. In this example, the output of the program
- changes.
- Bison resolves a reduce/reduce conflict by choosing to use the rule
- that appears first in the grammar, but it is very risky to rely on
- this. Every reduce/reduce conflict must be studied and usually
- eliminated. Here is the proper way to define `sequence':
- sequence: /* empty */
- { printf ("empty sequence\n"); }
- | sequence word
- { printf ("added word %s\n", $2); }
- ;
- Here is another common error that yields a reduce/reduce conflict:
- sequence: /* empty */
- | sequence words
- | sequence redirects
- ;
-
- words: /* empty */
- | words word
- ;
-
- redirects:/* empty */
- | redirects redirect
- ;
- The intention here is to define a sequence which can contain either
- `word' or `redirect' groupings. The individual definitions of
- `sequence', `words' and `redirects' are error-free, but the three
- together make a subtle ambiguity: even an empty input can be parsed in
- infinitely many ways!
- Consider: nothing-at-all could be a `words'. Or it could be two
- `words' in a row, or three, or any number. It could equally well be a
- `redirects', or two, or any number. Or it could be a `words' followed
- by three `redirects' and another `words'. And so on.
- Here are two ways to correct these rules. First, to make it a
- single level of sequence:
- sequence: /* empty */
- | sequence word
- | sequence redirect
- ;
- Second, to prevent either a `words' or a `redirects' from being
- empty:
- sequence: /* empty */
- | sequence words
- | sequence redirects
- ;
-
- words: word
- | words word
- ;
-
- redirects:redirect
- | redirects redirect
- ;
- File: bison.info, Node: Mystery Conflicts, Next: Stack Overflow, Prev: Reduce/Reduce, Up: Algorithm
- Mysterious Reduce/Reduce Conflicts
- ==================================
- Sometimes reduce/reduce conflicts can occur that don't look
- warranted. Here is an example:
- %token ID
-
- %%
- def: param_spec return_spec ','
- ;
- param_spec:
- type
- | name_list ':' type
- ;
- return_spec:
- type
- | name ':' type
- ;
- type: ID
- ;
- name: ID
- ;
- name_list:
- name
- | name ',' name_list
- ;
- It would seem that this grammar can be parsed with only a single
- token of look-ahead: when a `param_spec' is being read, an `ID' is a
- `name' if a comma or colon follows, or a `type' if another `ID'
- follows. In other words, this grammar is LR(1).
- However, Bison, like most parser generators, cannot actually handle
- all LR(1) grammars. In this grammar, two contexts, that after an `ID'
- at the beginning of a `param_spec' and likewise at the beginning of a
- `return_spec', are similar enough that Bison assumes they are the same.
- They appear similar because the same set of rules would be active--the
- rule for reducing to a `name' and that for reducing to a `type'. Bison
- is unable to determine at that stage of processing that the rules would
- require different look-ahead tokens in the two contexts, so it makes a
- single parser state for them both. Combining the two contexts causes a
- conflict later. In parser terminology, this occurrence means that the
- grammar is not LALR(1).
- In general, it is better to fix deficiencies than to document them.
- But this particular deficiency is intrinsically hard to fix; parser
- generators that can handle LR(1) grammars are hard to write and tend to
- produce parsers that are very large. In practice, Bison is more useful
- as it is now.
- When the problem arises, you can often fix it by identifying the two
- parser states that are being confused, and adding something to make them
- look distinct. In the above example, adding one rule to `return_spec'
- as follows makes the problem go away:
- %token BOGUS
- ...
- %%
- ...
- return_spec:
- type
- | name ':' type
- /* This rule is never used. */
- | ID BOGUS
- ;
- This corrects the problem because it introduces the possibility of an
- additional active rule in the context after the `ID' at the beginning of
- `return_spec'. This rule is not active in the corresponding context in
- a `param_spec', so the two contexts receive distinct parser states. As
- long as the token `BOGUS' is never generated by `yylex', the added rule
- cannot alter the way actual input is parsed.
- In this particular example, there is another way to solve the
- problem: rewrite the rule for `return_spec' to use `ID' directly
- instead of via `name'. This also causes the two confusing contexts to
- have different sets of active rules, because the one for `return_spec'
- activates the altered rule for `return_spec' rather than the one for
- `name'.
- param_spec:
- type
- | name_list ':' type
- ;
- return_spec:
- type
- | ID ':' type
- ;
- File: bison.info, Node: Stack Overflow, Prev: Mystery Conflicts, Up: Algorithm
- Stack Overflow, and How to Avoid It
- ===================================
- The Bison parser stack can overflow if too many tokens are shifted
- and not reduced. When this happens, the parser function `yyparse'
- returns a nonzero value, pausing only to call `yyerror' to report the
- overflow.
- By defining the macro `YYMAXDEPTH', you can control how deep the
- parser stack can become before a stack overflow occurs. Define the
- macro with a value that is an integer. This value is the maximum number
- of tokens that can be shifted (and not reduced) before overflow. It
- must be a constant expression whose value is known at compile time.
- The stack space allowed is not necessarily allocated. If you
- specify a large value for `YYMAXDEPTH', the parser actually allocates a
- small stack at first, and then makes it bigger by stages as needed.
- This increasing allocation happens automatically and silently.
- Therefore, you do not need to make `YYMAXDEPTH' painfully small merely
- to save space for ordinary inputs that do not need much stack.
- The default value of `YYMAXDEPTH', if you do not define it, is 10000.
- You can control how much stack is allocated initially by defining the
- macro `YYINITDEPTH'. This value too must be a compile-time constant
- integer. The default is 200.
- File: bison.info, Node: Error Recovery, Next: Context Dependency, Prev: Algorithm, Up: Top
- Error Recovery
- **************
- It is not usually acceptable to have a program terminate on a parse
- error. For example, a compiler should recover sufficiently to parse the
- rest of the input file and check it for errors; a calculator should
- accept another expression.
- In a simple interactive command parser where each input is one line,
- it may be sufficient to allow `yyparse' to return 1 on error and have
- the caller ignore the rest of the input line when that happens (and
- then call `yyparse' again). But this is inadequate for a compiler,
- because it forgets all the syntactic context leading up to the error.
- A syntax error deep within a function in the compiler input should not
- cause the compiler to treat the following line like the beginning of a
- source file.
- You can define how to recover from a syntax error by writing rules to
- recognize the special token `error'. This is a terminal symbol that is
- always defined (you need not declare it) and reserved for error
- handling. The Bison parser generates an `error' token whenever a
- syntax error happens; if you have provided a rule to recognize this
- token in the current context, the parse can continue.
- For example:
- stmnts: /* empty string */
- | stmnts '\n'
- | stmnts exp '\n'
- | stmnts error '\n'
- The fourth rule in this example says that an error followed by a
- newline makes a valid addition to any `stmnts'.
- What happens if a syntax error occurs in the middle of an `exp'? The
- error recovery rule, interpreted strictly, applies to the precise
- sequence of a `stmnts', an `error' and a newline. If an error occurs in
- the middle of an `exp', there will probably be some additional tokens
- and subexpressions on the stack after the last `stmnts', and there will
- be tokens to read before the next newline. So the rule is not
- applicable in the ordinary way.
- But Bison can force the situation to fit the rule, by discarding
- part of the semantic context and part of the input. First it discards
- states and objects from the stack until it gets back to a state in
- which the `error' token is acceptable. (This means that the
- subexpressions already parsed are discarded, back to the last complete
- `stmnts'.) At this point the `error' token can be shifted. Then, if
- the old look-ahead token is not acceptable to be shifted next, the
- parser reads tokens and discards them until it finds a token which is
- acceptable. In this example, Bison reads and discards input until the
- next newline so that the fourth rule can apply.
- The choice of error rules in the grammar is a choice of strategies
- for error recovery. A simple and useful strategy is simply to skip the
- rest of the current input line or current statement if an error is
- detected:
- stmnt: error ';' /* on error, skip until ';' is read */
- It is also useful to recover to the matching close-delimiter of an
- opening-delimiter that has already been parsed. Otherwise the
- close-delimiter will probably appear to be unmatched, and generate
- another, spurious error message:
- primary: '(' expr ')'
- | '(' error ')'
- ...
- ;
- Error recovery strategies are necessarily guesses. When they guess
- wrong, one syntax error often leads to another. In the above example,
- the error recovery rule guesses that an error is due to bad input
- within one `stmnt'. Suppose that instead a spurious semicolon is
- inserted in the middle of a valid `stmnt'. After the error recovery
- rule recovers from the first error, another syntax error will be found
- straightaway, since the text following the spurious semicolon is also
- an invalid `stmnt'.
- To prevent an outpouring of error messages, the parser will output
- no error message for another syntax error that happens shortly after
- the first; only after three consecutive input tokens have been
- successfully shifted will error messages resume.
- Note that rules which accept the `error' token may have actions, just
- as any other rules can.
- You can make error messages resume immediately by using the macro
- `yyerrok' in an action. If you do this in the error rule's action, no
- error messages will be suppressed. This macro requires no arguments;
- `yyerrok;' is a valid C statement.
- The previous look-ahead token is reanalyzed immediately after an
- error. If this is unacceptable, then the macro `yyclearin' may be used
- to clear this token. Write the statement `yyclearin;' in the error
- rule's action.
- For example, suppose that on a parse error, an error handling
- routine is called that advances the input stream to some point where
- parsing should once again commence. The next symbol returned by the
- lexical scanner is probably correct. The previous look-ahead token
- ought to be discarded with `yyclearin;'.
- The macro `YYRECOVERING' stands for an expression that has the value
- 1 when the parser is recovering from a syntax error, and 0 the rest of
- the time. A value of 1 indicates that error messages are currently
- suppressed for new syntax errors.
- File: bison.info, Node: Context Dependency, Next: Debugging, Prev: Error Recovery, Up: Top
- Handling Context Dependencies
- *****************************
- The Bison paradigm is to parse tokens first, then group them into
- larger syntactic units. In many languages, the meaning of a token is
- affected by its context. Although this violates the Bison paradigm,
- certain techniques (known as "kludges") may enable you to write Bison
- parsers for such languages.
- * Menu:
- * Semantic Tokens:: Token parsing can depend on the semantic context.
- * Lexical Tie-ins:: Token parsing can depend on the syntactic context.
- * Tie-in Recovery:: Lexical tie-ins have implications for how
- error recovery rules must be written.
- (Actually, "kludge" means any technique that gets its job done but is
- neither clean nor robust.)
- File: bison.info, Node: Semantic Tokens, Next: Lexical Tie-ins, Up: Context Dependency
- Semantic Info in Token Types
- ============================
- The C language has a context dependency: the way an identifier is
- used depends on what its current meaning is. For example, consider
- this:
- foo (x);
- This looks like a function call statement, but if `foo' is a typedef
- name, then this is actually a declaration of `x'. How can a Bison
- parser for C decide how to parse this input?
- The method used in GNU C is to have two different token types,
- `IDENTIFIER' and `TYPENAME'. When `yylex' finds an identifier, it
- looks up the current declaration of the identifier in order to decide
- which token type to return: `TYPENAME' if the identifier is declared as
- a typedef, `IDENTIFIER' otherwise.
- The grammar rules can then express the context dependency by the
- choice of token type to recognize. `IDENTIFIER' is accepted as an
- expression, but `TYPENAME' is not. `TYPENAME' can start a declaration,
- but `IDENTIFIER' cannot. In contexts where the meaning of the
- identifier is *not* significant, such as in declarations that can
- shadow a typedef name, either `TYPENAME' or `IDENTIFIER' is
- accepted--there is one rule for each of the two token types.
- This technique is simple to use if the decision of which kinds of
- identifiers to allow is made at a place close to where the identifier is
- parsed. But in C this is not always so: C allows a declaration to
- redeclare a typedef name provided an explicit type has been specified
- earlier:
- typedef int foo, bar, lose;
- static foo (bar); /* redeclare `bar' as static variable */
- static int foo (lose); /* redeclare `foo' as function */
- Unfortunately, the name being declared is separated from the
- declaration construct itself by a complicated syntactic structure--the
- "declarator".
- As a result, the part of Bison parser for C needs to be duplicated,
- with all the nonterminal names changed: once for parsing a declaration
- in which a typedef name can be redefined, and once for parsing a
- declaration in which that can't be done. Here is a part of the
- duplication, with actions omitted for brevity:
- initdcl:
- declarator maybeasm '='
- init
- | declarator maybeasm
- ;
-
- notype_initdcl:
- notype_declarator maybeasm '='
- init
- | notype_declarator maybeasm
- ;
- Here `initdcl' can redeclare a typedef name, but `notype_initdcl'
- cannot. The distinction between `declarator' and `notype_declarator'
- is the same sort of thing.
- There is some similarity between this technique and a lexical tie-in
- (described next), in that information which alters the lexical analysis
- is changed during parsing by other parts of the program. The
- difference is here the information is global, and is used for other
- purposes in the program. A true lexical tie-in has a special-purpose
- flag controlled by the syntactic context.
- File: bison.info, Node: Lexical Tie-ins, Next: Tie-in Recovery, Prev: Semantic Tokens, Up: Context Dependency
- Lexical Tie-ins
- ===============
- One way to handle context-dependency is the "lexical tie-in": a flag
- which is set by Bison actions, whose purpose is to alter the way tokens
- are parsed.
- For example, suppose we have a language vaguely like C, but with a
- special construct `hex (HEX-EXPR)'. After the keyword `hex' comes an
- expression in parentheses in which all integers are hexadecimal. In
- particular, the token `a1b' must be treated as an integer rather than
- as an identifier if it appears in that context. Here is how you can do
- it:
- %{
- int hexflag;
- %}
- %%
- ...
- expr: IDENTIFIER
- | constant
- | HEX '('
- { hexflag = 1; }
- expr ')'
- { hexflag = 0;
- $$ = $4; }
- | expr '+' expr
- { $$ = make_sum ($1, $3); }
- ...
- ;
-
- constant:
- INTEGER
- | STRING
- ;
- Here we assume that `yylex' looks at the value of `hexflag'; when it is
- nonzero, all integers are parsed in hexadecimal, and tokens starting
- with letters are parsed as integers if possible.
- The declaration of `hexflag' shown in the C declarations section of
- the parser file is needed to make it accessible to the actions (*note
- The C Declarations Section: C Declarations.). You must also write the
- code in `yylex' to obey the flag.
- File: bison.info, Node: Tie-in Recovery, Prev: Lexical Tie-ins, Up: Context Dependency
- Lexical Tie-ins and Error Recovery
- ==================================
- Lexical tie-ins make strict demands on any error recovery rules you
- have. *Note Error Recovery::.
- The reason for this is that the purpose of an error recovery rule is
- to abort the parsing of one construct and resume in some larger
- construct. For example, in C-like languages, a typical error recovery
- rule is to skip tokens until the next semicolon, and then start a new
- statement, like this:
- stmt: expr ';'
- | IF '(' expr ')' stmt { ... }
- ...
- error ';'
- { hexflag = 0; }
- ;
- If there is a syntax error in the middle of a `hex (EXPR)'
- construct, this error rule will apply, and then the action for the
- completed `hex (EXPR)' will never run. So `hexflag' would remain set
- for the entire rest of the input, or until the next `hex' keyword,
- causing identifiers to be misinterpreted as integers.
- To avoid this problem the error recovery rule itself clears
- `hexflag'.
- There may also be an error recovery rule that works within
- expressions. For example, there could be a rule which applies within
- parentheses and skips to the close-parenthesis:
- expr: ...
- | '(' expr ')'
- { $$ = $2; }
- | '(' error ')'
- ...
- If this rule acts within the `hex' construct, it is not going to
- abort that construct (since it applies to an inner level of parentheses
- within the construct). Therefore, it should not clear the flag: the
- rest of the `hex' construct should be parsed with the flag still in
- effect.
- What if there is an error recovery rule which might abort out of the
- `hex' construct or might not, depending on circumstances? There is no
- way you can write the action to determine whether a `hex' construct is
- being aborted or not. So if you are using a lexical tie-in, you had
- better make sure your error recovery rules are not of this kind. Each
- rule must be such that you can be sure that it always will, or always
- won't, have to clear the flag.
- File: bison.info, Node: Debugging, Next: Invocation, Prev: Context Dependency, Up: Top
- Debugging Your Parser
- *********************
- If a Bison grammar compiles properly but doesn't do what you want
- when it runs, the `yydebug' parser-trace feature can help you figure
- out why.
- To enable compilation of trace facilities, you must define the macro
- `YYDEBUG' when you compile the parser. You could use `-DYYDEBUG=1' as
- a compiler option or you could put `#define YYDEBUG 1' in the C
- declarations section of the grammar file (*note The C Declarations
- Section: C Declarations.). Alternatively, use the `-t' option when you
- run Bison (*note Invoking Bison: Invocation.). We always define
- `YYDEBUG' so that debugging is always possible.
- The trace facility uses `stderr', so you must add
- `#include <stdio.h>' to the C declarations section unless it is already
- there.
- Once you have compiled the program with trace facilities, the way to
- request a trace is to store a nonzero value in the variable `yydebug'.
- You can do this by making the C code do it (in `main', perhaps), or you
- can alter the value with a C debugger.
- Each step taken by the parser when `yydebug' is nonzero produces a
- line or two of trace information, written on `stderr'. The trace
- messages tell you these things:
- * Each time the parser calls `yylex', what kind of token was read.
- * Each time a token is shifted, the depth and complete contents of
- the state stack (*note Parser States::.).
- * Each time a rule is reduced, which rule it is, and the complete
- contents of the state stack afterward.
- To make sense of this information, it helps to refer to the listing
- file produced by the Bison `-v' option (*note Invoking Bison:
- Invocation.). This file shows the meaning of each state in terms of
- positions in various rules, and also what each state will do with each
- possible input token. As you read the successive trace messages, you
- can see that the parser is functioning according to its specification
- in the listing file. Eventually you will arrive at the place where
- something undesirable happens, and you will see which parts of the
- grammar are to blame.
- The parser file is a C program and you can use C debuggers on it,
- but it's not easy to interpret what it is doing. The parser function
- is a finite-state machine interpreter, and aside from the actions it
- executes the same code over and over. Only the values of variables
- show where in the grammar it is working.
- The debugging information normally gives the token type of each token
- read, but not its semantic value. You can optionally define a macro
- named `YYPRINT' to provide a way to print the value. If you define
- `YYPRINT', it should take three arguments. The parser will pass a
- standard I/O stream, the numeric code for the token type, and the token
- value (from `yylval').
- Here is an example of `YYPRINT' suitable for the multi-function
- calculator (*note Declarations for `mfcalc': Mfcalc Decl.):
- #define YYPRINT(file, type, value) yyprint (file, type, value)
-
- static void
- yyprint (file, type, value)
- FILE *file;
- int type;
- YYSTYPE value;
- {
- if (type == VAR)
- fprintf (file, " %s", value.tptr->name);
- else if (type == NUM)
- fprintf (file, " %d", value.val);
- }
- File: bison.info, Node: Invocation, Next: Table of Symbols, Prev: Debugging, Up: Top
- Invoking Bison
- **************
- The usual way to invoke Bison is as follows:
- bison INFILE
- Here INFILE is the grammar file name, which usually ends in `.y'.
- The parser file's name is made by replacing the `.y' with `.tab.c'.
- Thus, the `bison foo.y' filename yields `foo.tab.c', and the `bison
- hack/foo.y' filename yields `hack/foo.tab.c'.
- * Menu:
- * Bison Options:: All the options described in detail,
- in alphabetical order by short options.
- * Option Cross Key:: Alphabetical list of long options.
- * VMS Invocation:: Bison command syntax on VMS.
- File: bison.info, Node: Bison Options, Next: Option Cross Key, Up: Invocation
- Bison Options
- =============
- Bison supports both traditional single-letter options and mnemonic
- long option names. Long option names are indicated with `--' instead of
- `-'. Abbreviations for option names are allowed as long as they are
- unique. When a long option takes an argument, like `--file-prefix',
- connect the option name and the argument with `='.
- Here is a list of options that can be used with Bison, alphabetized
- by short option. It is followed by a cross key alphabetized by long
- option.
- `-b FILE-PREFIX'
- `--file-prefix=PREFIX'
- Specify a prefix to use for all Bison output file names. The
- names are chosen as if the input file were named `PREFIX.c'.
- `-d'
- `--defines'
- Write an extra output file containing macro definitions for the
- token type names defined in the grammar and the semantic value type
- `YYSTYPE', as well as a few `extern' variable declarations.
- If the parser output file is named `NAME.c' then this file is
- named `NAME.h'.
- This output file is essential if you wish to put the definition of
- `yylex' in a separate source file, because `yylex' needs to be
- able to refer to token type codes and the variable `yylval'.
- *Note Semantic Values of Tokens: Token Values.
- `-l'
- `--no-lines'
- Don't put any `#line' preprocessor commands in the parser file.
- Ordinarily Bison puts them in the parser file so that the C
- compiler and debuggers will associate errors with your source
- file, the grammar file. This option causes them to associate
- errors with the parser file, treating it an independent source
- file in its own right.
- `-o OUTFILE'
- `--output-file=OUTFILE'
- Specify the name OUTFILE for the parser file.
- The other output files' names are constructed from OUTFILE as
- described under the `-v' and `-d' switches.
- `-p PREFIX'
- `--name-prefix=PREFIX'
- Rename the external symbols used in the parser so that they start
- with PREFIX instead of `yy'. The precise list of symbols renamed
- is `yyparse', `yylex', `yyerror', `yynerrs', `yylval', `yychar'
- and `yydebug'.
- For example, if you use `-p c', the names become `cparse', `clex',
- and so on.
- *Note Multiple Parsers in the Same Program: Multiple Parsers.
- `-t'
- `--debug'
- Output a definition of the macro `YYDEBUG' into the parser file,
- so that the debugging facilities are compiled. *Note Debugging
- Your Parser: Debugging.
- `-v'
- `--verbose'
- Write an extra output file containing verbose descriptions of the
- parser states and what is done for each type of look-ahead token in
- that state.
- This file also describes all the conflicts, both those resolved by
- operator precedence and the unresolved ones.
- The file's name is made by removing `.tab.c' or `.c' from the
- parser output file name, and adding `.output' instead.
- Therefore, if the input file is `foo.y', then the parser file is
- called `foo.tab.c' by default. As a consequence, the verbose
- output file is called `foo.output'.
- `-V'
- `--version'
- Print the version number of Bison and exit.
- `-h'
- `--help'
- Print a summary of the command-line options to Bison and exit.
- `-y'
- `--yacc'
- `--fixed-output-files'
- Equivalent to `-o y.tab.c'; the parser output file is called
- `y.tab.c', and the other outputs are called `y.output' and
- `y.tab.h'. The purpose of this switch is to imitate Yacc's output
- file name conventions. Thus, the following shell script can
- substitute for Yacc:
- bison -y $*
- File: bison.info, Node: Option Cross Key, Next: VMS Invocation, Prev: Bison Options, Up: Invocation
- Option Cross Key
- ================
- Here is a list of options, alphabetized by long option, to help you
- find the corresponding short option.
- --debug -t
- --defines -d
- --file-prefix=PREFIX -b FILE-PREFIX
- --fixed-output-files --yacc -y
- --help -h
- --name-prefix -p
- --no-lines -l
- --output-file=OUTFILE -o OUTFILE
- --verbose -v
- --version -V
- File: bison.info, Node: VMS Invocation, Prev: Option Cross Key, Up: Invocation
- Invoking Bison under VMS
- ========================
- The command line syntax for Bison on VMS is a variant of the usual
- Bison command syntax--adapted to fit VMS conventions.
- To find the VMS equivalent for any Bison option, start with the long
- option, and substitute a `/' for the leading `--', and substitute a `_'
- for each `-' in the name of the long option. For example, the
- following invocation under VMS:
- bison /debug/name_prefix=bar foo.y
- is equivalent to the following command under POSIX.
- bison --debug --name-prefix=bar foo.y
- The VMS file system does not permit filenames such as `foo.tab.c'.
- In the above example, the output file would instead be named
- `foo_tab.c'.
- File: bison.info, Node: Table of Symbols, Next: Glossary, Prev: Invocation, Up: Top
- Bison Symbols
- *************
- `error'
- A token name reserved for error recovery. This token may be used
- in grammar rules so as to allow the Bison parser to recognize an
- error in the grammar without halting the process. In effect, a
- sentence containing an error may be recognized as valid. On a
- parse error, the token `error' becomes the current look-ahead
- token. Actions corresponding to `error' are then executed, and
- the look-ahead token is reset to the token that originally caused
- the violation. *Note Error Recovery::.
- `YYABORT'
- Macro to pretend that an unrecoverable syntax error has occurred,
- by making `yyparse' return 1 immediately. The error reporting
- function `yyerror' is not called. *Note The Parser Function
- `yyparse': Parser Function.
- `YYACCEPT'
- Macro to pretend that a complete utterance of the language has been
- read, by making `yyparse' return 0 immediately. *Note The Parser
- Function `yyparse': Parser Function.
- `YYBACKUP'
- Macro to discard a value from the parser stack and fake a
- look-ahead token. *Note Special Features for Use in Actions:
- Action Features.
- `YYERROR'
- Macro to pretend that a syntax error has just been detected: call
- `yyerror' and then perform normal error recovery if possible
- (*note Error Recovery::.), or (if recovery is impossible) make
- `yyparse' return 1. *Note Error Recovery::.
- `YYERROR_VERBOSE'
- Macro that you define with `#define' in the Bison declarations
- section to request verbose, specific error message strings when
- `yyerror' is called.
- `YYINITDEPTH'
- Macro for specifying the initial size of the parser stack. *Note
- Stack Overflow::.
- `YYLEX_PARAM'
- Macro for specifying an extra argument (or list of extra
- arguments) for `yyparse' to pass to `yylex'. *Note Calling
- Conventions for Pure Parsers: Pure Calling.
- `YYLTYPE'
- Macro for the data type of `yylloc'; a structure with four
- members. *Note Textual Positions of Tokens: Token Positions.
- `YYMAXDEPTH'
- Macro for specifying the maximum size of the parser stack. *Note
- Stack Overflow::.
- `YYPARSE_PARAM'
- Macro for specifying the name of a parameter that `yyparse' should
- accept. *Note Calling Conventions for Pure Parsers: Pure Calling.
- `YYRECOVERING'
- Macro whose value indicates whether the parser is recovering from a
- syntax error. *Note Special Features for Use in Actions: Action
- Features.
- `YYSTYPE'
- Macro for the data type of semantic values; `int' by default.
- *Note Data Types of Semantic Values: Value Type.
- `yychar'
- External integer variable that contains the integer value of the
- current look-ahead token. (In a pure parser, it is a local
- variable within `yyparse'.) Error-recovery rule actions may
- examine this variable. *Note Special Features for Use in Actions:
- Action Features.
- `yyclearin'
- Macro used in error-recovery rule actions. It clears the previous
- look-ahead token. *Note Error Recovery::.
- `yydebug'
- External integer variable set to zero by default. If `yydebug' is
- given a nonzero value, the parser will output information on input
- symbols and parser action. *Note Debugging Your Parser: Debugging.
- `yyerrok'
- Macro to cause parser to recover immediately to its normal mode
- after a parse error. *Note Error Recovery::.
- `yyerror'
- User-supplied function to be called by `yyparse' on error. The
- function receives one argument, a pointer to a character string
- containing an error message. *Note The Error Reporting Function
- `yyerror': Error Reporting.
- `yylex'
- User-supplied lexical analyzer function, called with no arguments
- to get the next token. *Note The Lexical Analyzer Function
- `yylex': Lexical.
- `yylval'
- External variable in which `yylex' should place the semantic value
- associated with a token. (In a pure parser, it is a local
- variable within `yyparse', and its address is passed to `yylex'.)
- *Note Semantic Values of Tokens: Token Values.
- `yylloc'
- External variable in which `yylex' should place the line and
- column numbers associated with a token. (In a pure parser, it is a
- local variable within `yyparse', and its address is passed to
- `yylex'.) You can ignore this variable if you don't use the `@'
- feature in the grammar actions. *Note Textual Positions of
- Tokens: Token Positions.
- `yynerrs'
- Global variable which Bison increments each time there is a parse
- error. (In a pure parser, it is a local variable within
- `yyparse'.) *Note The Error Reporting Function `yyerror': Error
- Reporting.
- `yyparse'
- The parser function produced by Bison; call this function to start
- parsing. *Note The Parser Function `yyparse': Parser Function.
- `%left'
- Bison declaration to assign left associativity to token(s). *Note
- Operator Precedence: Precedence Decl.
- `%nonassoc'
- Bison declaration to assign nonassociativity to token(s). *Note
- Operator Precedence: Precedence Decl.
- `%prec'
- Bison declaration to assign a precedence to a specific rule.
- *Note Context-Dependent Precedence: Contextual Precedence.
- `%pure_parser'
- Bison declaration to request a pure (reentrant) parser. *Note A
- Pure (Reentrant) Parser: Pure Decl.
- `%right'
- Bison declaration to assign right associativity to token(s).
- *Note Operator Precedence: Precedence Decl.
- `%start'
- Bison declaration to specify the start symbol. *Note The
- Start-Symbol: Start Decl.
- `%token'
- Bison declaration to declare token(s) without specifying
- precedence. *Note Token Type Names: Token Decl.
- `%type'
- Bison declaration to declare nonterminals. *Note Nonterminal
- Symbols: Type Decl.
- `%union'
- Bison declaration to specify several possible data types for
- semantic values. *Note The Collection of Value Types: Union Decl.
- These are the punctuation and delimiters used in Bison input:
- `%%'
- Delimiter used to separate the grammar rule section from the Bison
- declarations section or the additional C code section. *Note The
- Overall Layout of a Bison Grammar: Grammar Layout.
- `%{ %}'
- All code listed between `%{' and `%}' is copied directly to the
- output file uninterpreted. Such code forms the "C declarations"
- section of the input file. *Note Outline of a Bison Grammar:
- Grammar Outline.
- `/*...*/'
- Comment delimiters, as in C.
- `:'
- Separates a rule's result from its components. *Note Syntax of
- Grammar Rules: Rules.
- `;'
- Terminates a rule. *Note Syntax of Grammar Rules: Rules.
- `|'
- Separates alternate rules for the same result nonterminal. *Note
- Syntax of Grammar Rules: Rules.
- File: bison.info, Node: Glossary, Next: Index, Prev: Table of Symbols, Up: Top
- Glossary
- ********
- Backus-Naur Form (BNF)
- Formal method of specifying context-free grammars. BNF was first
- used in the `ALGOL-60' report, 1963. *Note Languages and
- Context-Free Grammars: Language and Grammar.
- Context-free grammars
- Grammars specified as rules that can be applied regardless of
- context. Thus, if there is a rule which says that an integer can
- be used as an expression, integers are allowed *anywhere* an
- expression is permitted. *Note Languages and Context-Free
- Grammars: Language and Grammar.
- Dynamic allocation
- Allocation of memory that occurs during execution, rather than at
- compile time or on entry to a function.
- Empty string
- Analogous to the empty set in set theory, the empty string is a
- character string of length zero.
- Finite-state stack machine
- A "machine" that has discrete states in which it is said to exist
- at each instant in time. As input to the machine is processed, the
- machine moves from state to state as specified by the logic of the
- machine. In the case of the parser, the input is the language
- being parsed, and the states correspond to various stages in the
- grammar rules. *Note The Bison Parser Algorithm: Algorithm.
- Grouping
- A language construct that is (in general) grammatically divisible;
- for example, `expression' or `declaration' in C. *Note Languages
- and Context-Free Grammars: Language and Grammar.
- Infix operator
- An arithmetic operator that is placed between the operands on
- which it performs some operation.
- Input stream
- A continuous flow of data between devices or programs.
- Language construct
- One of the typical usage schemas of the language. For example,
- one of the constructs of the C language is the `if' statement.
- *Note Languages and Context-Free Grammars: Language and Grammar.
- Left associativity
- Operators having left associativity are analyzed from left to
- right: `a+b+c' first computes `a+b' and then combines with `c'.
- *Note Operator Precedence: Precedence.
- Left recursion
- A rule whose result symbol is also its first component symbol; for
- example, `expseq1 : expseq1 ',' exp;'. *Note Recursive Rules:
- Recursion.
- Left-to-right parsing
- Parsing a sentence of a language by analyzing it token by token
- from left to right. *Note The Bison Parser Algorithm: Algorithm.
- Lexical analyzer (scanner)
- A function that reads an input stream and returns tokens one by
- one. *Note The Lexical Analyzer Function `yylex': Lexical.
- Lexical tie-in
- A flag, set by actions in the grammar rules, which alters the way
- tokens are parsed. *Note Lexical Tie-ins::.
- Look-ahead token
- A token already read but not yet shifted. *Note Look-Ahead
- Tokens: Look-Ahead.
- LALR(1)
- The class of context-free grammars that Bison (like most other
- parser generators) can handle; a subset of LR(1). *Note
- Mysterious Reduce/Reduce Conflicts: Mystery Conflicts.
- LR(1)
- The class of context-free grammars in which at most one token of
- look-ahead is needed to disambiguate the parsing of any piece of
- input.
- Nonterminal symbol
- A grammar symbol standing for a grammatical construct that can be
- expressed through rules in terms of smaller constructs; in other
- words, a construct that is not a token. *Note Symbols::.
- Parse error
- An error encountered during parsing of an input stream due to
- invalid syntax. *Note Error Recovery::.
- Parser
- A function that recognizes valid sentences of a language by
- analyzing the syntax structure of a set of tokens passed to it
- from a lexical analyzer.
- Postfix operator
- An arithmetic operator that is placed after the operands upon
- which it performs some operation.
- Reduction
- Replacing a string of nonterminals and/or terminals with a single
- nonterminal, according to a grammar rule. *Note The Bison Parser
- Algorithm: Algorithm.
- Reentrant
- A reentrant subprogram is a subprogram which can be in invoked any
- number of times in parallel, without interference between the
- various invocations. *Note A Pure (Reentrant) Parser: Pure Decl.
- Reverse polish notation
- A language in which all operators are postfix operators.
- Right recursion
- A rule whose result symbol is also its last component symbol; for
- example, `expseq1: exp ',' expseq1;'. *Note Recursive Rules:
- Recursion.
- Semantics
- In computer languages, the semantics are specified by the actions
- taken for each instance of the language, i.e., the meaning of each
- statement. *Note Defining Language Semantics: Semantics.
- Shift
- A parser is said to shift when it makes the choice of analyzing
- further input from the stream rather than reducing immediately some
- already-recognized rule. *Note The Bison Parser Algorithm:
- Algorithm.
- Single-character literal
- A single character that is recognized and interpreted as is.
- *Note From Formal Rules to Bison Input: Grammar in Bison.
- Start symbol
- The nonterminal symbol that stands for a complete valid utterance
- in the language being parsed. The start symbol is usually listed
- as the first nonterminal symbol in a language specification.
- *Note The Start-Symbol: Start Decl.
- Symbol table
- A data structure where symbol names and associated data are stored
- during parsing to allow for recognition and use of existing
- information in repeated uses of a symbol. *Note Multi-function
- Calc::.
- Token
- A basic, grammatically indivisible unit of a language. The symbol
- that describes a token in the grammar is a terminal symbol. The
- input of the Bison parser is a stream of tokens which comes from
- the lexical analyzer. *Note Symbols::.
- Terminal symbol
- A grammar symbol that has no rules in the grammar and therefore is
- grammatically indivisible. The piece of text it represents is a
- token. *Note Languages and Context-Free Grammars: Language and
- Grammar.
|