Introduction
============

LLnextgen is a (partial) reimplementation of the LLgen Extended-LL(1) parser
generator [http://www.cs.vu.nl/~ceriel/LLgen.html] created by D. Grune and
C.J.H. Jacobs, which is part of the Amsterdam Compiler Kit (ACK). LLnextgen is
licensed under the GNU General Public License version 3. See the file COPYING
for details. Alternatively, see <http://www.gnu.org/licenses/>.

Note: To add to the confusion, there exists or existed another program called
LLgen, which is an LL(1) parser generator. It was created by Fischer and
LeBlanc.

Motivation
==========

I like the ideas embodied in the LLgen program and I find its way of
specifying grammars easy and intuitive. However, it turns out that LLgen
contains a number of bugs, some minor and some serious, that make it annoying
to work with.

One option, of course, was to fix the LLgen program, but it turned out that it
was not written with maintainability in mind. Furthermore, it was written at
a time when memory was expensive and therefore limited. This resulted in a
number of hacks that complicate maintenance even further. Thus, I decided to
do a rewrite. The rewrite also allowed several features to be implemented
which LLgen was missing (in my opinion anyway).

Compatibility (issues)
======================

At this time the basic LLgen functionality is implemented. This includes
everything apart from the extended user error-handling with the %onerror
directive and the non-correcting error-recovery.

Although I've tried to copy the behaviour of LLgen accurately, I have
implemented some aspects slightly differently. The following is a list of the
differences in behaviour between LLgen and LLnextgen:

- LLgen generated both K&R style C code and ANSI C code. LLnextgen only
  supports generation of ANSI C code.
- There is a minor difference in the determination of the default choices.
  LLnextgen simply chooses the first production with the shortest possible
  terminal production, while LLgen also takes the complexity in terms of
  non-terminals and terms into account. There is also a minor difference when
  there is more than one shortest alternative and some of them are marked with
  %avoid. Neither difference is very important, as the user can specify
  which alternative should be the default, thereby circumventing the
  differences in the algorithms.
- The default behaviour of generating one output C file per input, plus
  Lpars.c and Lpars.h, has been changed in favour of generating one .c file
  and one .h file. The rationale given for creating multiple output files in
  the first place was that it would reduce the compilation time for the
  generated parser. As computation power has become much more abundant this
  feature is no longer necessary, and the difficult interaction with the make
  program makes it undesirable. The LLgen behaviour is still supported
  through a command-line switch.
- In LLgen one could have a parser and a %first macro with the same name.
  LLnextgen forbids this, as it leads to name collisions in the new file
  naming scheme. With the old LLgen file naming scheme it could also easily
  lead to name collisions, although they could be circumvented by not
  mentioning the parser in any of the C code in the .g files.
- LLgen names the labels it generates L_X, where X is a number. LLnextgen
  names these LL_X.
- LLgen parsers are always reentrant. As this feature is hardly ever used,
  LLnextgen parsers are non-reentrant unless the option --reentrant is used.

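The default-choice rule described in the list above ("the first production
with the shortest possible terminal production") can be pictured with the
small C sketch below. It is only an illustration and not code taken from
LLnextgen: the alternative structure, its shortest_len field and the
pick_default function are hypothetical names, and the handling of %avoid is
left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one alternative of a rule (not an LLnextgen type). */
typedef struct {
    const char *name;       /* label used only for illustration */
    unsigned shortest_len;  /* length of the shortest terminal string it can produce */
} alternative;

/* Return the index of the first alternative whose shortest terminal
 * production is minimal. Ties are resolved in favour of the earliest
 * alternative, because only a strictly shorter one replaces the current best. */
size_t pick_default(const alternative *alts, size_t count)
{
    size_t best = 0;
    size_t i;

    assert(count > 0);
    for (i = 1; i < count; i++) {
        if (alts[i].shortest_len < alts[best].shortest_len)
            best = i;
    }
    return best;
}
```

With three alternatives whose shortest productions have lengths 3, 1 and 1,
this sketch selects the second alternative, since it is the first one that
reaches the minimum.
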
Extra features
==============

LLnextgen incorporates a number of features that were not available in the
LLgen program:

- Tracing of conflicts. LLgen can only indicate where a conflict is detected,
  but not where it is caused. As the cause may be in a seemingly unrelated
  rule, conflicts can be very hard to find. LLnextgen can trace the cause of
  conflicts, making it much easier to resolve them.
- Automatic token buffering. LLgen and LLnextgen require that the token last
  retrieved from the lexical analyser is returned again after a parse error
  was detected. Most lexical analysers do not provide this feature, and LLgen
  users are required to do this themselves. As this almost always leads to the
  same code, LLnextgen can provide this code itself, or can be asked to print
  the default code to standard output as a basis for modifications.
- A symbol table can be auto-generated if the needed information is supplied.
- A default LLmessage routine can be generated (if the auto-generated symbol
  table is used), or alternatively sent to the standard output.
- The limitation of the maximum file-name length in LLgen has been removed.
- A command-line switch is provided that makes LLnextgen as compatible with
  LLgen as possible, as well as a number of switches that can turn on separate
  compatibility aspects.
- Separating parameters in non-terminal headers can now be done with commas.
  LLnextgen will issue a warning about using a semi-colon to separate
  parameters. This warning can be suppressed with a command-line switch.
- File inclusion is possible through the %include directive. Dependency
  information can be generated for use in Makefiles.
- Command line options can be set in the grammar itself through %options.
- The parser can be stopped through the LLabort() call, if it has been enabled.
- Thread-safe parsers.
- Return values for non-terminals.
- An extra repetition operator to specify that the last element in a
  repeating term is optional for the last repetition of that term.

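To give an idea of what the token buffering mentioned in the list above
involves, here is a minimal C sketch of a wrapper around a lexical analyser
that can hand out the last token a second time. This is not the code that
LLnextgen generates; all names (token_buffer, token_buffer_get,
token_buffer_reissue, array_source) are hypothetical and only serve the
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token source: any function returning the next token code. */
typedef int (*token_source_fn)(void *state);

/* Wrapper that remembers the most recent token so it can be returned again. */
typedef struct {
    token_source_fn next;  /* the real lexical analyser */
    void *state;           /* opaque state handed to the lexer */
    int last_token;        /* token most recently returned */
    int have_last;         /* non-zero once at least one token has been read */
    int reissue;           /* non-zero if last_token must be returned again */
} token_buffer;

void token_buffer_init(token_buffer *tb, token_source_fn next, void *state)
{
    tb->next = next;
    tb->state = state;
    tb->last_token = 0;
    tb->have_last = 0;
    tb->reissue = 0;
}

/* Return the next token, or the previous one if a reissue was requested. */
int token_buffer_get(token_buffer *tb)
{
    if (tb->reissue) {
        tb->reissue = 0;
        return tb->last_token;
    }
    tb->last_token = tb->next(tb->state);
    tb->have_last = 1;
    return tb->last_token;
}

/* Ask for the last token to be returned once more, e.g. after a parse error. */
void token_buffer_reissue(token_buffer *tb)
{
    assert(tb->have_last);  /* there must be a token to return again */
    tb->reissue = 1;
}

/* Demonstration source: walks a zero-terminated array of token codes. */
typedef struct {
    const int *tokens;
    size_t pos;
} array_source;

int array_source_next(void *state)
{
    array_source *src = state;
    int tok = src->tokens[src->pos];

    if (tok != 0)
        src->pos++;  /* stay on the terminating 0, which stands for end of input */
    return tok;
}
```

A parser would call token_buffer_get wherever it would otherwise call the
lexical analyser directly, and call token_buffer_reissue from its error
handling path when the offending token has to be seen again.
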
Several other features are planned. See the file TODO for details.

Prerequisites and installation
==============================

LLnextgen is written in pure ANSI C, so most C compilers should have little
trouble compiling it. From version 0.3.0, LLnextgen has optional support for
regular expression matching. As this is not part of the ANSI C specification,
a mechanism has been introduced to allow automatic testing for POSIX regular
expression availability. Therefore, there are three different ways to compile
LLnextgen:

Using the configure script:
---
$ ./configure
or
$ ./configure --prefix=/usr
(see ./configure --help for more tuning options)
$ make all
$ make install
(assumes working install program)

Manually editing the Makefile to suit your installation:
---
$ cp Makefile.in Makefile
Edit the values for REGEX, REGEXLIBS and prefix
$ make all
$ make install
(assumes working install program)

Manually compiling LLnextgen:
---
$ cd src
$ cp lexer.c.dist lexer.c
$ cp grammar.c.dist grammar.c
$ cp grammar.h.dist grammar.h
$ cc -o LLnextgen *.c
or, to compile with regular expression support, add
-DREGEX={POSIX,OLDPOSIX,PCRE} and if required -l{regex,pcreposix} to your
compiler command line (see Makefile.in for details on the values). After this
the LLnextgen executable is built, and all that is left to do is to install
it, and the documentation, into the target directories.

Remarks:
---
LLnextgen is known to compile and work on several flavours of Un*x, on both
32 and 64 bit platforms, and on Windows. For compilation on Windows with MS
Visual C++, the Makefile.win32 file is provided for use with nmake.exe.

Reporting bugs
==============

If you think you have found a bug, please check that you are using the latest
version of LLnextgen [http://os.ghalkes.nl/LLnextgen]. When reporting bugs,
please include a minimal grammar that demonstrates the problem.

Author
======

Gertjan Halkes <llnextgen@ghalkes.nl>