Parser Library
==============

This library allows you to easily implement recursive-descent parsers.

Installation
------------

You can install this library through Composer:

.. code-block:: bash

    composer require jms/parser-lib

or add it to your ``composer.json`` file directly.

Example
-------

Let's assume that you would like to write a parser for a calculator. For simplicity's
sake, the parser returns the result of the calculation directly. For the input
``1 + 1``, we would expect ``2`` as the result.

The first step is to create a lexer which breaks the input string up into
individual tokens which can then be consumed by the parser. This library provides
a convenient class for simple problems which we will use::

    $lexer = new \JMS\Parser\SimpleLexer(
        '/
            # Numbers
            ([0-9]+)

            # Do not surround with () because whitespace is not meaningful for
            # our purposes.
            |\s+

            # Operators; we support only + and -
            |(\+)|(-)
        /x', // The x modifier tells PCRE to ignore whitespace in the regex above.

        // This maps token types to a human readable name.
        array(0 => 'T_UNKNOWN', 1 => 'T_INT', 2 => 'T_PLUS', 3 => 'T_MINUS'),

        // This function tells the lexer which type a token has. It returns an
        // array: the first element is an integer from the map above, the second
        // element is the normalized value.
        function($value) {
            if ('+' === $value) {
                return array(2, '+');
            }
            if ('-' === $value) {
                return array(3, '-');
            }
            if (is_numeric($value)) {
                return array(1, (int) $value);
            }

            return array(0, $value);
        }
    );

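
To see what such a token pattern does to an input, the following plain-PHP
sketch splits a string with the same regular expression and classifies each
piece with the same rules. It uses only PHP's built-in ``preg_split`` and is not
meant to describe how ``SimpleLexer`` works internally; the ``tokenize`` helper
is a hypothetical name introduced for illustration:

.. code-block:: php

    <?php

    /**
     * Hypothetical helper: split $input into array(type, value) pairs.
     */
    function tokenize(string $input): array
    {
        $regex = '/([0-9]+)|\s+|(\+)|(-)/';

        // DELIM_CAPTURE returns the captured groups (numbers and operators),
        // NO_EMPTY drops empty pieces. Whitespace is not captured, so it vanishes.
        // Text that matches no alternative comes back as a piece of its own.
        $flags = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;
        $pieces = preg_split($regex, $input, -1, $flags);

        $tokens = array();
        foreach ($pieces as $value) {
            if ('+' === $value) {
                $tokens[] = array(2, '+');          // T_PLUS
            } elseif ('-' === $value) {
                $tokens[] = array(3, '-');          // T_MINUS
            } elseif (is_numeric($value)) {
                $tokens[] = array(1, (int) $value); // T_INT
            } else {
                $tokens[] = array(0, $value);       // T_UNKNOWN
            }
        }

        return $tokens;
    }

    print_r(tokenize('5 + 10 - 4'));

For ``5 + 10 - 4`` this yields five tokens: the integers 5, 10 and 4 typed as
``T_INT``, with one ``T_PLUS`` and one ``T_MINUS`` between them.
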
The second step is to create the parser, which consumes the tokens once
the lexer has split the input::

    class MyParser extends \JMS\Parser\AbstractParser
    {
        const T_UNKNOWN = 0;
        const T_INT = 1;
        const T_PLUS = 2;
        const T_MINUS = 3;

        public function parseInternal()
        {
            $result = $this->match(self::T_INT);

            while ($this->lexer->isNextAny(array(self::T_PLUS, self::T_MINUS))) {
                if ($this->lexer->isNext(self::T_PLUS)) {
                    $this->lexer->moveNext();
                    $result += $this->match(self::T_INT);
                } else if ($this->lexer->isNext(self::T_MINUS)) {
                    $this->lexer->moveNext();
                    $result -= $this->match(self::T_INT);
                } else {
                    throw new \LogicException('Previous ifs were exhaustive.');
                }
            }

            return $result;
        }
    }

    $parser = new MyParser($lexer);
    $parser->parse('1 + 1'); // int(2)
    $parser->parse('5 + 10 - 4'); // int(11)

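
The control flow of ``parseInternal`` can also be traced without the library.
The sketch below runs the same loop over a hand-written token list. Both
``TokenType`` and ``evaluateTokens`` are hypothetical names, and the error
handling is an assumption rather than the behaviour of ``AbstractParser``:

.. code-block:: php

    <?php

    // Hypothetical constants-only class mirroring the parser's token types.
    final class TokenType
    {
        const T_INT = 1;
        const T_PLUS = 2;
        const T_MINUS = 3;
    }

    function evaluateTokens(array $tokens): int
    {
        $pos = 0;

        // Consume the next token if it is an integer, fail otherwise.
        $matchInt = function () use ($tokens, &$pos): int {
            if (!isset($tokens[$pos]) || TokenType::T_INT !== $tokens[$pos][0]) {
                throw new \RuntimeException('Expected T_INT at token ' . $pos);
            }

            return $tokens[$pos++][1];
        };

        $operators = array(TokenType::T_PLUS, TokenType::T_MINUS);
        $result = $matchInt();

        // Keep folding "operator integer" pairs into the running result.
        while (isset($tokens[$pos]) && in_array($tokens[$pos][0], $operators, true)) {
            $operator = $tokens[$pos++][0];
            if (TokenType::T_PLUS === $operator) {
                $result += $matchInt();
            } else {
                $result -= $matchInt();
            }
        }

        return $result;
    }

    // Tokens for "5 + 10 - 4"
    echo evaluateTokens(array(
        array(TokenType::T_INT, 5),
        array(TokenType::T_PLUS, '+'),
        array(TokenType::T_INT, 10),
        array(TokenType::T_MINUS, '-'),
        array(TokenType::T_INT, 4),
    ));

The call prints ``11``. Note that this sketch simply stops at the first token
that is not an operator; it does not check that the whole input was consumed.
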
That's it. You can now evaluate basic additions and subtractions. If you like,
you can also replace the hard-coded integers in the lexer with the class
constants of the parser.

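
As a sketch of that replacement, the type map and the callback could refer to
named constants instead of bare integers. So that the snippet runs on its own,
it declares a constants-only stand-in class, ``CalculatorTokens`` (a
hypothetical name); with the example above you would write ``MyParser::T_INT``
and so on instead:

.. code-block:: php

    <?php

    // Stand-in for the parser class; only the constants matter here.
    final class CalculatorTokens
    {
        const T_UNKNOWN = 0;
        const T_INT = 1;
        const T_PLUS = 2;
        const T_MINUS = 3;
    }

    // Token type => human readable name.
    $tokenNames = array(
        CalculatorTokens::T_UNKNOWN => 'T_UNKNOWN',
        CalculatorTokens::T_INT     => 'T_INT',
        CalculatorTokens::T_PLUS    => 'T_PLUS',
        CalculatorTokens::T_MINUS   => 'T_MINUS',
    );

    // Same classification rules as before, without magic numbers.
    $classify = function ($value) {
        if ('+' === $value) {
            return array(CalculatorTokens::T_PLUS, '+');
        }
        if ('-' === $value) {
            return array(CalculatorTokens::T_MINUS, '-');
        }
        if (is_numeric($value)) {
            return array(CalculatorTokens::T_INT, (int) $value);
        }

        return array(CalculatorTokens::T_UNKNOWN, $value);
    };

``$tokenNames`` and ``$classify`` would then take the place of the second and
third constructor arguments of ``SimpleLexer`` in the example.
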
License
-------

The code is released under the business-friendly `Apache2 license`_.

Documentation is subject to the `Attribution-NonCommercial-NoDerivs 3.0 Unported
license`_.

.. _Apache2 license: http://www.apache.org/licenses/LICENSE-2.0.html
.. _Attribution-NonCommercial-NoDerivs 3.0 Unported license: http://creativecommons.org/licenses/by-nc-nd/3.0/