123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980981982983984985986987988989990991992993994995996997998999100010011002100310041005100610071008100910101011101210131014101510161017101810191020102110221023102410251026102710281029103010311032103310341035103610371038103910401041104210431044104510461047104810491050105110521053105410551056105710581059106010611062 |
- This is Info file bison.info, produced by Makeinfo-1.55 from the input
- file ./bison.texinfo.
- This file documents the Bison parser generator.
- Copyright (C) 1988, 89, 90, 91, 92, 93, 1995 Free Software
- Foundation, Inc.
- Permission is granted to make and distribute verbatim copies of this
- manual provided the copyright notice and this permission notice are
- preserved on all copies.
- Permission is granted to copy and distribute modified versions of
- this manual under the conditions for verbatim copying, provided also
- that the sections entitled "GNU General Public License" and "Conditions
- for Using Bison" are included exactly as in the original, and provided
- that the entire resulting derived work is distributed under the terms
- of a permission notice identical to this one.
- Permission is granted to copy and distribute translations of this
- manual into another language, under the above conditions for modified
- versions, except that the sections entitled "GNU General Public
- License", "Conditions for Using Bison" and this permission notice may be
- included in translations approved by the Free Software Foundation
- instead of in the original English.
- File: bison.info, Node: Top, Next: Introduction, Prev: (dir), Up: (dir)
- This manual documents version 1.24 of Bison.
- * Menu:
- * Introduction::
- * Conditions::
- * Copying:: The GNU General Public License says
- how you can copy and share Bison
- Tutorial sections:
- * Concepts:: Basic concepts for understanding Bison.
- * Examples:: Three simple explained examples of using Bison.
- Reference sections:
- * Grammar File:: Writing Bison declarations and rules.
- * Interface:: C-language interface to the parser function `yyparse'.
- * Algorithm:: How the Bison parser works at run-time.
- * Error Recovery:: Writing rules for error recovery.
- * Context Dependency:: What to do if your language syntax is too
- messy for Bison to handle straightforwardly.
- * Debugging:: Debugging Bison parsers that parse wrong.
- * Invocation:: How to run Bison (to produce the parser source file).
- * Table of Symbols:: All the keywords of the Bison language are explained.
- * Glossary:: Basic concepts are explained.
- * Index:: Cross-references to the text.
- -- The Detailed Node Listing --
- The Concepts of Bison
- * Language and Grammar:: Languages and context-free grammars,
- as mathematical ideas.
- * Grammar in Bison:: How we represent grammars for Bison's sake.
- * Semantic Values:: Each token or syntactic grouping can have
- a semantic value (the value of an integer,
- the name of an identifier, etc.).
- * Semantic Actions:: Each rule can have an action containing C code.
- * Bison Parser:: What are Bison's input and output,
- how is the output used?
- * Stages:: Stages in writing and running Bison grammars.
- * Grammar Layout:: Overall structure of a Bison grammar file.
- Examples
- * RPN Calc:: Reverse polish notation calculator;
- a first example with no operator precedence.
- * Infix Calc:: Infix (algebraic) notation calculator.
- Operator precedence is introduced.
- * Simple Error Recovery:: Continuing after syntax errors.
- * Multi-function Calc:: Calculator with memory and trig functions.
- It uses multiple data-types for semantic values.
- * Exercises:: Ideas for improving the multi-function calculator.
- Reverse Polish Notation Calculator
- * Decls: Rpcalc Decls. Bison and C declarations for rpcalc.
- * Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation.
- * Lexer: Rpcalc Lexer. The lexical analyzer.
- * Main: Rpcalc Main. The controlling function.
- * Error: Rpcalc Error. The error reporting function.
- * Gen: Rpcalc Gen. Running Bison on the grammar file.
- * Comp: Rpcalc Compile. Run the C compiler on the output code.
- Grammar Rules for `rpcalc'
- * Rpcalc Input::
- * Rpcalc Line::
- * Rpcalc Expr::
- Multi-Function Calculator: `mfcalc'
- * Decl: Mfcalc Decl. Bison declarations for multi-function calculator.
- * Rules: Mfcalc Rules. Grammar rules for the calculator.
- * Symtab: Mfcalc Symtab. Symbol table management subroutines.
- Bison Grammar Files
- * Grammar Outline:: Overall layout of the grammar file.
- * Symbols:: Terminal and nonterminal symbols.
- * Rules:: How to write grammar rules.
- * Recursion:: Writing recursive rules.
- * Semantics:: Semantic values and actions.
- * Declarations:: All kinds of Bison declarations are described here.
- * Multiple Parsers:: Putting more than one Bison parser in one program.
- Outline of a Bison Grammar
- * C Declarations:: Syntax and usage of the C declarations section.
- * Bison Declarations:: Syntax and usage of the Bison declarations section.
- * Grammar Rules:: Syntax and usage of the grammar rules section.
- * C Code:: Syntax and usage of the additional C code section.
- Defining Language Semantics
- * Value Type:: Specifying one data type for all semantic values.
- * Multiple Types:: Specifying several alternative data types.
- * Actions:: An action is the semantic definition of a grammar rule.
- * Action Types:: Specifying data types for actions to operate on.
- * Mid-Rule Actions:: Most actions go at the end of a rule.
- This says when, why and how to use the exceptional
- action in the middle of a rule.
- Bison Declarations
- * Token Decl:: Declaring terminal symbols.
- * Precedence Decl:: Declaring terminals with precedence and associativity.
- * Union Decl:: Declaring the set of all semantic value types.
- * Type Decl:: Declaring the choice of type for a nonterminal symbol.
- * Expect Decl:: Suppressing warnings about shift/reduce conflicts.
- * Start Decl:: Specifying the start symbol.
- * Pure Decl:: Requesting a reentrant parser.
- * Decl Summary:: Table of all Bison declarations.
- Parser C-Language Interface
- * Parser Function:: How to call `yyparse' and what it returns.
- * Lexical:: You must supply a function `yylex'
- which reads tokens.
- * Error Reporting:: You must supply a function `yyerror'.
- * Action Features:: Special features for use in actions.
- The Lexical Analyzer Function `yylex'
- * Calling Convention:: How `yyparse' calls `yylex'.
- * Token Values:: How `yylex' must return the semantic value
- of the token it has read.
- * Token Positions:: How `yylex' must return the text position
- (line number, etc.) of the token, if the
- actions want that.
- * Pure Calling:: How the calling convention differs
- in a pure parser (*note A Pure (Reentrant) Parser: Pure Decl.).
- The Bison Parser Algorithm
- * Look-Ahead:: Parser looks one token ahead when deciding what to do.
- * Shift/Reduce:: Conflicts: when either shifting or reduction is valid.
- * Precedence:: Operator precedence works by resolving conflicts.
- * Contextual Precedence:: When an operator's precedence depends on context.
- * Parser States:: The parser is a finite-state-machine with stack.
- * Reduce/Reduce:: When two rules are applicable in the same situation.
- * Mystery Conflicts:: Reduce/reduce conflicts that look unjustified.
- * Stack Overflow:: What happens when stack gets full. How to avoid it.
- Operator Precedence
- * Why Precedence:: An example showing why precedence is needed.
- * Using Precedence:: How to specify precedence in Bison grammars.
- * Precedence Examples:: How these features are used in the previous example.
- * How Precedence:: How they work.
- Handling Context Dependencies
- * Semantic Tokens:: Token parsing can depend on the semantic context.
- * Lexical Tie-ins:: Token parsing can depend on the syntactic context.
- * Tie-in Recovery:: Lexical tie-ins have implications for how
- error recovery rules must be written.
- Invoking Bison
- * Bison Options:: All the options described in detail,
- in alphabetical order by short options.
- * Option Cross Key:: Alphabetical list of long options.
- * VMS Invocation:: Bison command syntax on VMS.
- File: bison.info, Node: Introduction, Next: Conditions, Prev: Top, Up: Top
- Introduction
- ************
- "Bison" is a general-purpose parser generator that converts a
- grammar description for an LALR(1) context-free grammar into a C
- program to parse that grammar. Once you are proficient with Bison, you
- may use it to develop a wide range of language parsers, from those used
- in simple desk calculators to complex programming languages.
- Bison is upward compatible with Yacc: all properly-written Yacc
- grammars ought to work with Bison with no change. Anyone familiar with
- Yacc should be able to use Bison with little trouble. You need to be
- fluent in C programming in order to use Bison or to understand this
- manual.
- We begin with tutorial chapters that explain the basic concepts of
- using Bison and show three explained examples, each building on the
- last. If you don't know Bison or Yacc, start by reading these
- chapters. Reference chapters follow which describe specific aspects of
- Bison in detail.
- Bison was written primarily by Robert Corbett; Richard Stallman made
- it Yacc-compatible. This edition corresponds to version 1.24 of Bison.
- File: bison.info, Node: Conditions, Next: Copying, Prev: Introduction, Up: Top
- Conditions for Using Bison
- **************************
- As of Bison version 1.24, we have changed the distribution terms for
- `yyparse' to permit using Bison's output in non-free programs.
- Formerly, Bison parsers could be used only in programs that were free
- software.
- The other GNU programming tools, such as the GNU C compiler, have
- never had such a requirement. They could always be used for non-free
- software. The reason Bison was different was not due to a special
- policy decision; it resulted from applying the usual General Public
- License to all of the Bison source code.
- The output of the Bison utility--the Bison parser file--contains a
- verbatim copy of a sizable piece of Bison, which is the code for the
- `yyparse' function. (The actions from your grammar are inserted into
- this function at one point, but the rest of the function is not
- changed.) When we applied the GPL terms to the code for `yyparse', the
- effect was to restrict the use of Bison output to free software.
- We didn't change the terms because of sympathy for people who want to
- make software proprietary. *Software should be free.* But we
- concluded that limiting Bison's use to free software was doing little to
- encourage people to make other software free. So we decided to make the
- practical conditions for using Bison match the practical conditions for
- using the other GNU tools.
- File: bison.info, Node: Copying, Next: Concepts, Prev: Conditions, Up: Top
- GNU GENERAL PUBLIC LICENSE
- **************************
- Version 2, June 1991
- Copyright (C) 1989, 1991 Free Software Foundation, Inc.
- 675 Mass Ave, Cambridge, MA 02139, USA
-
- Everyone is permitted to copy and distribute verbatim copies
- of this license document, but changing it is not allowed.
- Preamble
- ========
- The licenses for most software are designed to take away your
- freedom to share and change it. By contrast, the GNU General Public
- License is intended to guarantee your freedom to share and change free
- software--to make sure the software is free for all its users. This
- General Public License applies to most of the Free Software
- Foundation's software and to any other program whose authors commit to
- using it. (Some other Free Software Foundation software is covered by
- the GNU Library General Public License instead.) You can apply it to
- your programs, too.
- When we speak of free software, we are referring to freedom, not
- price. Our General Public Licenses are designed to make sure that you
- have the freedom to distribute copies of free software (and charge for
- this service if you wish), that you receive source code or can get it
- if you want it, that you can change the software or use pieces of it in
- new free programs; and that you know you can do these things.
- To protect your rights, we need to make restrictions that forbid
- anyone to deny you these rights or to ask you to surrender the rights.
- These restrictions translate to certain responsibilities for you if you
- distribute copies of the software, or if you modify it.
- For example, if you distribute copies of such a program, whether
- gratis or for a fee, you must give the recipients all the rights that
- you have. You must make sure that they, too, receive or can get the
- source code. And you must show them these terms so they know their
- rights.
- We protect your rights with two steps: (1) copyright the software,
- and (2) offer you this license which gives you legal permission to copy,
- distribute and/or modify the software.
- Also, for each author's protection and ours, we want to make certain
- that everyone understands that there is no warranty for this free
- software. If the software is modified by someone else and passed on, we
- want its recipients to know that what they have is not the original, so
- that any problems introduced by others will not reflect on the original
- authors' reputations.
- Finally, any free program is threatened constantly by software
- patents. We wish to avoid the danger that redistributors of a free
- program will individually obtain patent licenses, in effect making the
- program proprietary. To prevent this, we have made it clear that any
- patent must be licensed for everyone's free use or not licensed at all.
- The precise terms and conditions for copying, distribution and
- modification follow.
- TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
- 0. This License applies to any program or other work which contains a
- notice placed by the copyright holder saying it may be distributed
- under the terms of this General Public License. The "Program",
- below, refers to any such program or work, and a "work based on
- the Program" means either the Program or any derivative work under
- copyright law: that is to say, a work containing the Program or a
- portion of it, either verbatim or with modifications and/or
- translated into another language. (Hereinafter, translation is
- included without limitation in the term "modification".) Each
- licensee is addressed as "you".
- Activities other than copying, distribution and modification are
- not covered by this License; they are outside its scope. The act
- of running the Program is not restricted, and the output from the
- Program is covered only if its contents constitute a work based on
- the Program (independent of having been made by running the
- Program). Whether that is true depends on what the Program does.
- 1. You may copy and distribute verbatim copies of the Program's
- source code as you receive it, in any medium, provided that you
- conspicuously and appropriately publish on each copy an appropriate
- copyright notice and disclaimer of warranty; keep intact all the
- notices that refer to this License and to the absence of any
- warranty; and give any other recipients of the Program a copy of
- this License along with the Program.
- You may charge a fee for the physical act of transferring a copy,
- and you may at your option offer warranty protection in exchange
- for a fee.
- 2. You may modify your copy or copies of the Program or any portion
- of it, thus forming a work based on the Program, and copy and
- distribute such modifications or work under the terms of Section 1
- above, provided that you also meet all of these conditions:
- a. You must cause the modified files to carry prominent notices
- stating that you changed the files and the date of any change.
- b. You must cause any work that you distribute or publish, that
- in whole or in part contains or is derived from the Program
- or any part thereof, to be licensed as a whole at no charge
- to all third parties under the terms of this License.
- c. If the modified program normally reads commands interactively
- when run, you must cause it, when started running for such
- interactive use in the most ordinary way, to print or display
- an announcement including an appropriate copyright notice and
- a notice that there is no warranty (or else, saying that you
- provide a warranty) and that users may redistribute the
- program under these conditions, and telling the user how to
- view a copy of this License. (Exception: if the Program
- itself is interactive but does not normally print such an
- announcement, your work based on the Program is not required
- to print an announcement.)
- These requirements apply to the modified work as a whole. If
- identifiable sections of that work are not derived from the
- Program, and can be reasonably considered independent and separate
- works in themselves, then this License, and its terms, do not
- apply to those sections when you distribute them as separate
- works. But when you distribute the same sections as part of a
- whole which is a work based on the Program, the distribution of
- the whole must be on the terms of this License, whose permissions
- for other licensees extend to the entire whole, and thus to each
- and every part regardless of who wrote it.
- Thus, it is not the intent of this section to claim rights or
- contest your rights to work written entirely by you; rather, the
- intent is to exercise the right to control the distribution of
- derivative or collective works based on the Program.
- In addition, mere aggregation of another work not based on the
- Program with the Program (or with a work based on the Program) on
- a volume of a storage or distribution medium does not bring the
- other work under the scope of this License.
- 3. You may copy and distribute the Program (or a work based on it,
- under Section 2) in object code or executable form under the terms
- of Sections 1 and 2 above provided that you also do one of the
- following:
- a. Accompany it with the complete corresponding machine-readable
- source code, which must be distributed under the terms of
- Sections 1 and 2 above on a medium customarily used for
- software interchange; or,
- b. Accompany it with a written offer, valid for at least three
- years, to give any third party, for a charge no more than your
- cost of physically performing source distribution, a complete
- machine-readable copy of the corresponding source code, to be
- distributed under the terms of Sections 1 and 2 above on a
- medium customarily used for software interchange; or,
- c. Accompany it with the information you received as to the offer
- to distribute corresponding source code. (This alternative is
- allowed only for noncommercial distribution and only if you
- received the program in object code or executable form with
- such an offer, in accord with Subsection b above.)
- The source code for a work means the preferred form of the work for
- making modifications to it. For an executable work, complete
- source code means all the source code for all modules it contains,
- plus any associated interface definition files, plus the scripts
- used to control compilation and installation of the executable.
- However, as a special exception, the source code distributed need
- not include anything that is normally distributed (in either
- source or binary form) with the major components (compiler,
- kernel, and so on) of the operating system on which the executable
- runs, unless that component itself accompanies the executable.
- If distribution of executable or object code is made by offering
- access to copy from a designated place, then offering equivalent
- access to copy the source code from the same place counts as
- distribution of the source code, even though third parties are not
- compelled to copy the source along with the object code.
- 4. You may not copy, modify, sublicense, or distribute the Program
- except as expressly provided under this License. Any attempt
- otherwise to copy, modify, sublicense or distribute the Program is
- void, and will automatically terminate your rights under this
- License. However, parties who have received copies, or rights,
- from you under this License will not have their licenses
- terminated so long as such parties remain in full compliance.
- 5. You are not required to accept this License, since you have not
- signed it. However, nothing else grants you permission to modify
- or distribute the Program or its derivative works. These actions
- are prohibited by law if you do not accept this License.
- Therefore, by modifying or distributing the Program (or any work
- based on the Program), you indicate your acceptance of this
- License to do so, and all its terms and conditions for copying,
- distributing or modifying the Program or works based on it.
- 6. Each time you redistribute the Program (or any work based on the
- Program), the recipient automatically receives a license from the
- original licensor to copy, distribute or modify the Program
- subject to these terms and conditions. You may not impose any
- further restrictions on the recipients' exercise of the rights
- granted herein. You are not responsible for enforcing compliance
- by third parties to this License.
- 7. If, as a consequence of a court judgment or allegation of patent
- infringement or for any other reason (not limited to patent
- issues), conditions are imposed on you (whether by court order,
- agreement or otherwise) that contradict the conditions of this
- License, they do not excuse you from the conditions of this
- License. If you cannot distribute so as to satisfy simultaneously
- your obligations under this License and any other pertinent
- obligations, then as a consequence you may not distribute the
- Program at all. For example, if a patent license would not permit
- royalty-free redistribution of the Program by all those who
- receive copies directly or indirectly through you, then the only
- way you could satisfy both it and this License would be to refrain
- entirely from distribution of the Program.
- If any portion of this section is held invalid or unenforceable
- under any particular circumstance, the balance of the section is
- intended to apply and the section as a whole is intended to apply
- in other circumstances.
- It is not the purpose of this section to induce you to infringe any
- patents or other property right claims or to contest validity of
- any such claims; this section has the sole purpose of protecting
- the integrity of the free software distribution system, which is
- implemented by public license practices. Many people have made
- generous contributions to the wide range of software distributed
- through that system in reliance on consistent application of that
- system; it is up to the author/donor to decide if he or she is
- willing to distribute software through any other system and a
- licensee cannot impose that choice.
- This section is intended to make thoroughly clear what is believed
- to be a consequence of the rest of this License.
- 8. If the distribution and/or use of the Program is restricted in
- certain countries either by patents or by copyrighted interfaces,
- the original copyright holder who places the Program under this
- License may add an explicit geographical distribution limitation
- excluding those countries, so that distribution is permitted only
- in or among countries not thus excluded. In such case, this
- License incorporates the limitation as if written in the body of
- this License.
- 9. The Free Software Foundation may publish revised and/or new
- versions of the General Public License from time to time. Such
- new versions will be similar in spirit to the present version, but
- may differ in detail to address new problems or concerns.
- Each version is given a distinguishing version number. If the
- Program specifies a version number of this License which applies
- to it and "any later version", you have the option of following
- the terms and conditions either of that version or of any later
- version published by the Free Software Foundation. If the Program
- does not specify a version number of this License, you may choose
- any version ever published by the Free Software Foundation.
- 10. If you wish to incorporate parts of the Program into other free
- programs whose distribution conditions are different, write to the
- author to ask for permission. For software which is copyrighted
- by the Free Software Foundation, write to the Free Software
- Foundation; we sometimes make exceptions for this. Our decision
- will be guided by the two goals of preserving the free status of
- all derivatives of our free software and of promoting the sharing
- and reuse of software generally.
- NO WARRANTY
- 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO
- WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE
- LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
- HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT
- WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT
- NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
- FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE
- QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
- PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY
- SERVICING, REPAIR OR CORRECTION.
- 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
- WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY
- MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE
- LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL,
- INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR
- INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
- DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU
- OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY
- OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN
- ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
- END OF TERMS AND CONDITIONS
- How to Apply These Terms to Your New Programs
- =============================================
- If you develop a new program, and you want it to be of the greatest
- possible use to the public, the best way to achieve this is to make it
- free software which everyone can redistribute and change under these
- terms.
- To do so, attach the following notices to the program. It is safest
- to attach them to the start of each source file to most effectively
- convey the exclusion of warranty; and each file should have at least
- the "copyright" line and a pointer to where the full notice is found.
- ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
- Copyright (C) 19YY NAME OF AUTHOR
-
- This program is free software; you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation; either version 2 of the License, or
- (at your option) any later version.
-
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
-
- You should have received a copy of the GNU General Public License
- along with this program; if not, write to the Free Software
- Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- Also add information on how to contact you by electronic and paper
- mail.
- If the program is interactive, make it output a short notice like
- this when it starts in an interactive mode:
- Gnomovision version 69, Copyright (C) 19YY NAME OF AUTHOR
- Gnomovision comes with ABSOLUTELY NO WARRANTY; for details
- type `show w'.
- This is free software, and you are welcome to redistribute it
- under certain conditions; type `show c' for details.
- The hypothetical commands `show w' and `show c' should show the
- appropriate parts of the General Public License. Of course, the
- commands you use may be called something other than `show w' and `show
- c'; they could even be mouse-clicks or menu items--whatever suits your
- program.
- You should also get your employer (if you work as a programmer) or
- your school, if any, to sign a "copyright disclaimer" for the program,
- if necessary. Here is a sample; alter the names:
- Yoyodyne, Inc., hereby disclaims all copyright interest in the program
- `Gnomovision' (which makes passes at compilers) written by James Hacker.
-
- SIGNATURE OF TY COON, 1 April 1989
- Ty Coon, President of Vice
- This General Public License does not permit incorporating your
- program into proprietary programs. If your program is a subroutine
- library, you may consider it more useful to permit linking proprietary
- applications with the library. If this is what you want to do, use the
- GNU Library General Public License instead of this License.
- File: bison.info, Node: Concepts, Next: Examples, Prev: Copying, Up: Top
- The Concepts of Bison
- *********************
- This chapter introduces many of the basic concepts without which the
- details of Bison will not make sense. If you do not already know how to
- use Bison or Yacc, we suggest you start by reading this chapter
- carefully.
- * Menu:
- * Language and Grammar:: Languages and context-free grammars,
- as mathematical ideas.
- * Grammar in Bison:: How we represent grammars for Bison's sake.
- * Semantic Values:: Each token or syntactic grouping can have
- a semantic value (the value of an integer,
- the name of an identifier, etc.).
- * Semantic Actions:: Each rule can have an action containing C code.
- * Bison Parser:: What are Bison's input and output,
- how is the output used?
- * Stages:: Stages in writing and running Bison grammars.
- * Grammar Layout:: Overall structure of a Bison grammar file.
- File: bison.info, Node: Language and Grammar, Next: Grammar in Bison, Up: Concepts
- Languages and Context-Free Grammars
- ===================================
- In order for Bison to parse a language, it must be described by a
- "context-free grammar". This means that you specify one or more
- "syntactic groupings" and give rules for constructing them from their
- parts. For example, in the C language, one kind of grouping is called
- an `expression'. One rule for making an expression might be, "An
- expression can be made of a minus sign and another expression".
- Another would be, "An expression can be an integer". As you can see,
- rules are often recursive, but there must be at least one rule which
- leads out of the recursion.
- The most common formal system for presenting such rules for humans
- to read is "Backus-Naur Form" or "BNF", which was developed in order to
- specify the language Algol 60. Any grammar expressed in BNF is a
- context-free grammar. The input to Bison is essentially
- machine-readable BNF.
- Not all context-free languages can be handled by Bison, only those
- that are LALR(1). In brief, this means that it must be possible to
- tell how to parse any portion of an input string with just a single
- token of look-ahead. Strictly speaking, that is a description of an
- LR(1) grammar, and LALR(1) involves additional restrictions that are
- hard to explain simply; but it is rare in actual practice to find an
- LR(1) grammar that fails to be LALR(1). *Note Mysterious Reduce/Reduce
- Conflicts: Mystery Conflicts, for more information on this.
- In the formal grammatical rules for a language, each kind of
- syntactic unit or grouping is named by a "symbol". Those which are
- built by grouping smaller constructs according to grammatical rules are
- called "nonterminal symbols"; those which can't be subdivided are called
- "terminal symbols" or "token types". We call a piece of input
- corresponding to a single terminal symbol a "token", and a piece
- corresponding to a single nonterminal symbol a "grouping".
- We can use the C language as an example of what symbols, terminal and
- nonterminal, mean. The tokens of C are identifiers, constants (numeric
- and string), and the various keywords, arithmetic operators and
- punctuation marks. So the terminal symbols of a grammar for C include
- `identifier', `number', `string', plus one symbol for each keyword,
- operator or punctuation mark: `if', `return', `const', `static', `int',
- `char', `plus-sign', `open-brace', `close-brace', `comma' and many
- more. (These tokens can be subdivided into characters, but that is a
- matter of lexicography, not grammar.)
- Here is a simple C function subdivided into tokens:
- int /* keyword `int' */
- square (x) /* identifier, open-paren, */
- /* identifier, close-paren */
- int x; /* keyword `int', identifier, semicolon */
- { /* open-brace */
- return x * x; /* keyword `return', identifier, */
- /* asterisk, identifier, semicolon */
- } /* close-brace */
- The syntactic groupings of C include the expression, the statement,
- the declaration, and the function definition. These are represented in
- the grammar of C by nonterminal symbols `expression', `statement',
- `declaration' and `function definition'. The full grammar uses dozens
- of additional language constructs, each with its own nonterminal
- symbol, in order to express the meanings of these four. The example
- above is a function definition; it contains one declaration, and one
- statement. In the statement, each `x' is an expression and so is `x *
- x'.
- Each nonterminal symbol must have grammatical rules showing how it
- is made out of simpler constructs. For example, one kind of C
- statement is the `return' statement; this would be described with a
- grammar rule which reads informally as follows:
- A `statement' can be made of a `return' keyword, an `expression'
- and a `semicolon'.
- There would be many other rules for `statement', one for each kind of
- statement in C.
- One nonterminal symbol must be distinguished as the special one which
- defines a complete utterance in the language. It is called the "start
- symbol". In a compiler, this means a complete input program. In the C
- language, the nonterminal symbol `sequence of definitions and
- declarations' plays this role.
- For example, `1 + 2' is a valid C expression--a valid part of a C
- program--but it is not valid as an *entire* C program. In the
- context-free grammar of C, this follows from the fact that `expression'
- is not the start symbol.
- The Bison parser reads a sequence of tokens as its input, and groups
- the tokens using the grammar rules. If the input is valid, the end
- result is that the entire token sequence reduces to a single grouping
- whose symbol is the grammar's start symbol. If we use a grammar for C,
- the entire input must be a `sequence of definitions and declarations'.
- If not, the parser reports a syntax error.
- File: bison.info, Node: Grammar in Bison, Next: Semantic Values, Prev: Language and Grammar, Up: Concepts
- From Formal Rules to Bison Input
- ================================
- A formal grammar is a mathematical construct. To define the language
- for Bison, you must write a file expressing the grammar in Bison syntax:
- a "Bison grammar" file. *Note Bison Grammar Files: Grammar File.
- A nonterminal symbol in the formal grammar is represented in Bison
- input as an identifier, like an identifier in C. By convention, it
- should be in lower case, such as `expr', `stmt' or `declaration'.
- The Bison representation for a terminal symbol is also called a
- "token type". Token types as well can be represented as C-like
- identifiers. By convention, these identifiers should be upper case to
- distinguish them from nonterminals: for example, `INTEGER',
- `IDENTIFIER', `IF' or `RETURN'. A terminal symbol that stands for a
- particular keyword in the language should be named after that keyword
- converted to upper case. The terminal symbol `error' is reserved for
- error recovery. *Note Symbols::.
- A terminal symbol can also be represented as a character literal,
- just like a C character constant. You should do this whenever a token
- is just a single character (parenthesis, plus-sign, etc.): use that
- same character in a literal as the terminal symbol for that token.
- The grammar rules also have an expression in Bison syntax. For
- example, here is the Bison rule for a C `return' statement. The
- semicolon in quotes is a literal character token, representing part of
- the C syntax for the statement; the naked semicolon, and the colon, are
- Bison punctuation used in every rule.
- stmt: RETURN expr ';'
- ;
- *Note Syntax of Grammar Rules: Rules.
- File: bison.info, Node: Semantic Values, Next: Semantic Actions, Prev: Grammar in Bison, Up: Concepts
- Semantic Values
- ===============
- A formal grammar selects tokens only by their classifications: for
- example, if a rule mentions the terminal symbol `integer constant', it
- means that *any* integer constant is grammatically valid in that
- position. The precise value of the constant is irrelevant to how to
- parse the input: if `x+4' is grammatical then `x+1' or `x+3989' is
- equally grammatical.
- But the precise value is very important for what the input means
- once it is parsed. A compiler is useless if it fails to distinguish
- between 4, 1 and 3989 as constants in the program! Therefore, each
- token in a Bison grammar has both a token type and a "semantic value".
- *Note Defining Language Semantics: Semantics, for details.
- The token type is a terminal symbol defined in the grammar, such as
- `INTEGER', `IDENTIFIER' or `',''. It tells everything you need to know
- to decide where the token may validly appear and how to group it with
- other tokens. The grammar rules know nothing about tokens except their
- types.
- The semantic value has all the rest of the information about the
- meaning of the token, such as the value of an integer, or the name of an
- identifier. (A token such as `','' which is just punctuation doesn't
- need to have any semantic value.)
- For example, an input token might be classified as token type
- `INTEGER' and have the semantic value 4. Another input token might
- have the same token type `INTEGER' but value 3989. When a grammar rule
- says that `INTEGER' is allowed, either of these tokens is acceptable
- because each is an `INTEGER'. When the parser accepts the token, it
- keeps track of the token's semantic value.
- Each grouping can also have a semantic value as well as its
- nonterminal symbol. For example, in a calculator, an expression
- typically has a semantic value that is a number. In a compiler for a
- programming language, an expression typically has a semantic value that
- is a tree structure describing the meaning of the expression.
- File: bison.info, Node: Semantic Actions, Next: Bison Parser, Prev: Semantic Values, Up: Concepts
- Semantic Actions
- ================
- In order to be useful, a program must do more than parse input; it
- must also produce some output based on the input. In a Bison grammar,
- a grammar rule can have an "action" made up of C statements. Each time
- the parser recognizes a match for that rule, the action is executed.
- *Note Actions::.
- Most of the time, the purpose of an action is to compute the
- semantic value of the whole construct from the semantic values of its
- parts. For example, suppose we have a rule which says an expression
- can be the sum of two expressions. When the parser recognizes such a
- sum, each of the subexpressions has a semantic value which describes
- how it was built up. The action for this rule should create a similar
- sort of value for the newly recognized larger expression.
- For example, here is a rule that says an expression can be the sum of
- two subexpressions:
- expr: expr '+' expr { $$ = $1 + $3; }
- ;
- The action says how to produce the semantic value of the sum expression
- from the values of the two subexpressions.
- File: bison.info, Node: Bison Parser, Next: Stages, Prev: Semantic Actions, Up: Concepts
- Bison Output: the Parser File
- =============================
- When you run Bison, you give it a Bison grammar file as input. The
- output is a C source file that parses the language described by the
- grammar. This file is called a "Bison parser". Keep in mind that the
- Bison utility and the Bison parser are two distinct programs: the Bison
- utility is a program whose output is the Bison parser that becomes part
- of your program.
- The job of the Bison parser is to group tokens into groupings
- according to the grammar rules--for example, to build identifiers and
- operators into expressions. As it does this, it runs the actions for
- the grammar rules it uses.
- The tokens come from a function called the "lexical analyzer" that
- you must supply in some fashion (such as by writing it in C). The
- Bison parser calls the lexical analyzer each time it wants a new token.
- It doesn't know what is "inside" the tokens (though their semantic
- values may reflect this). Typically the lexical analyzer makes the
- tokens by parsing characters of text, but Bison does not depend on
- this. *Note The Lexical Analyzer Function `yylex': Lexical.
- The Bison parser file is C code which defines a function named
- `yyparse' which implements that grammar. This function does not make a
- complete C program: you must supply some additional functions. One is
- the lexical analyzer. Another is an error-reporting function which the
- parser calls to report an error. In addition, a complete C program must
- start with a function called `main'; you have to provide this, and
- arrange for it to call `yyparse' or the parser will never run. *Note
- Parser C-Language Interface: Interface.
- Aside from the token type names and the symbols in the actions you
- write, all variable and function names used in the Bison parser file
- begin with `yy' or `YY'. This includes interface functions such as the
- lexical analyzer function `yylex', the error reporting function
- `yyerror' and the parser function `yyparse' itself. This also includes
- numerous identifiers used for internal purposes. Therefore, you should
- avoid using C identifiers starting with `yy' or `YY' in the Bison
- grammar file except for the ones defined in this manual.
- File: bison.info, Node: Stages, Next: Grammar Layout, Prev: Bison Parser, Up: Concepts
- Stages in Using Bison
- =====================
- The actual language-design process using Bison, from grammar
- specification to a working compiler or interpreter, has these parts:
- 1. Formally specify the grammar in a form recognized by Bison (*note
- Bison Grammar Files: Grammar File.). For each grammatical rule in
- the language, describe the action that is to be taken when an
- instance of that rule is recognized. The action is described by a
- sequence of C statements.
- 2. Write a lexical analyzer to process input and pass tokens to the
- parser. The lexical analyzer may be written by hand in C (*note
- The Lexical Analyzer Function `yylex': Lexical.). It could also
- be produced using Lex, but the use of Lex is not discussed in this
- manual.
- 3. Write a controlling function that calls the Bison-produced parser.
- 4. Write error-reporting routines.
- To turn this source code as written into a runnable program, you
- must follow these steps:
- 1. Run Bison on the grammar to produce the parser.
- 2. Compile the code output by Bison, as well as any other source
- files.
- 3. Link the object files to produce the finished product.
- File: bison.info, Node: Grammar Layout, Prev: Stages, Up: Concepts
- The Overall Layout of a Bison Grammar
- =====================================
- The input file for the Bison utility is a "Bison grammar file". The
- general form of a Bison grammar file is as follows:
- %{
- C DECLARATIONS
- %}
-
- BISON DECLARATIONS
-
- %%
- GRAMMAR RULES
- %%
- ADDITIONAL C CODE
- The `%%', `%{' and `%}' are punctuation that appears in every Bison
- grammar file to separate the sections.
- The C declarations may define types and variables used in the
- actions. You can also use preprocessor commands to define macros used
- there, and use `#include' to include header files that do any of these
- things.
- The Bison declarations declare the names of the terminal and
- nonterminal symbols, and may also describe operator precedence and the
- data types of semantic values of various symbols.
- The grammar rules define how to construct each nonterminal symbol
- from its parts.
- The additional C code can contain any C code you want to use. Often
- the definition of the lexical analyzer `yylex' goes here, plus
- subroutines called by the actions in the grammar rules. In a simple
- program, all the rest of the program can go here.
- File: bison.info, Node: Examples, Next: Grammar File, Prev: Concepts, Up: Top
- Examples
- ********
- Now we show and explain three sample programs written using Bison: a
- reverse polish notation calculator, an algebraic (infix) notation
- calculator, and a multi-function calculator. All three have been tested
- under BSD Unix 4.3; each produces a usable, though limited, interactive
- desk-top calculator.
- These examples are simple, but Bison grammars for real programming
- languages are written the same way. You can copy these examples out of
- the Info file and into a source file to try them.
- * Menu:
- * RPN Calc:: Reverse polish notation calculator;
- a first example with no operator precedence.
- * Infix Calc:: Infix (algebraic) notation calculator.
- Operator precedence is introduced.
- * Simple Error Recovery:: Continuing after syntax errors.
- * Multi-function Calc:: Calculator with memory and trig functions.
- It uses multiple data-types for semantic values.
- * Exercises:: Ideas for improving the multi-function calculator.
- File: bison.info, Node: RPN Calc, Next: Infix Calc, Up: Examples
- Reverse Polish Notation Calculator
- ==================================
- The first example is that of a simple double-precision "reverse
- polish notation" calculator (a calculator using postfix operators).
- This example provides a good starting point, since operator precedence
- is not an issue. The second example will illustrate how operator
- precedence is handled.
- The source code for this calculator is named `rpcalc.y'. The `.y'
- extension is a convention used for Bison input files.
- * Menu:
- * Decls: Rpcalc Decls. Bison and C declarations for rpcalc.
- * Rules: Rpcalc Rules. Grammar Rules for rpcalc, with explanation.
- * Lexer: Rpcalc Lexer. The lexical analyzer.
- * Main: Rpcalc Main. The controlling function.
- * Error: Rpcalc Error. The error reporting function.
- * Gen: Rpcalc Gen. Running Bison on the grammar file.
- * Comp: Rpcalc Compile. Run the C compiler on the output code.
- File: bison.info, Node: Rpcalc Decls, Next: Rpcalc Rules, Up: RPN Calc
- Declarations for `rpcalc'
- -------------------------
- Here are the C and Bison declarations for the reverse polish notation
- calculator. As in C, comments are placed between `/*...*/'.
- /* Reverse polish notation calculator. */
-
- %{
- #define YYSTYPE double
- #include <math.h>
- %}
-
- %token NUM
-
- %% /* Grammar rules and actions follow */
- The C declarations section (*note The C Declarations Section: C
- Declarations.) contains two preprocessor directives.
- The `#define' directive defines the macro `YYSTYPE', thus specifying
- the C data type for semantic values of both tokens and groupings (*note
- Data Types of Semantic Values: Value Type.). The Bison parser will use
- whatever type `YYSTYPE' is defined as; if you don't define it, `int' is
- the default. Because we specify `double', each token and each
- expression has an associated value, which is a floating point number.
- The `#include' directive is used to declare the exponentiation
- function `pow'.
- The second section, Bison declarations, provides information to
- Bison about the token types (*note The Bison Declarations Section:
- Bison Declarations.). Each terminal symbol that is not a
- single-character literal must be declared here. (Single-character
- literals normally don't need to be declared.) In this example, all the
- arithmetic operators are designated by single-character literals, so the
- only terminal symbol that needs to be declared is `NUM', the token type
- for numeric constants.
|