lexical-structure.lagda.rst 10 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334
  1. ..
  2. ::
  3. module language.lexical-structure where
  4. open import Agda.Builtin.String
  5. .. _lexical-structure:
  6. *****************
  7. Lexical Structure
  8. *****************
  9. Agda code is written in UTF-8 encoded plain text files with the
  10. extension ``.agda``. Most unicode characters can be used in
  11. identifiers and whitespace is important, see :ref:`names` and
  12. :ref:`lexical-structure-layout` below.
  13. Tokens
  14. ------
  15. .. _keywords-and-special-symbols:
  16. Keywords and special symbols
  17. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  18. Most non-whitespace unicode can be used as part of an Agda name, but there are
  19. two kinds of exceptions:
  20. special symbols
  21. Characters with special meaning that cannot appear at all in a name. These are
  22. ``.;{}()@"``.
  23. keywords
  24. Reserved words that cannot appear as a :ref:`name part <names>`, but
  25. can appear in a name together with other characters.
  26. ``=`` ``|`` ``->`` ``→`` ``:`` ``?`` ``\`` ``λ``
  27. :ref:`∀ <notational-conventions>` ``..`` ``...``
  28. ``abstract``
  29. ``coinductive``
  30. ``constructor``
  31. ``data``
  32. :ref:`do <do-notation>`
  33. ``eta-equality``
  34. ``field``
  35. :ref:`forall <notational-conventions>`
  36. ``import``
  37. ``in``
  38. ``inductive``
  39. ``infix``
  40. ``infixl``
  41. ``infixr``
  42. ``instance``
  43. ``interleaved``
  44. ``let``
  45. :ref:`macro <macros>`
  46. ``module``
  47. ``mutual``
  48. ``no-eta-equality``
  49. ``open``
  50. :ref:`overlap <instance-fields>`
  51. ``pattern``
  52. ``postulate``
  53. ``primitive``
  54. ``private``
  55. :ref:`quote <reflection>`
  56. :ref:`quoteTerm <macros>`
  57. ``record``
  58. ``rewrite``
  59. ``Set``
  60. ``syntax``
  61. ``tactic``
  62. :ref:`unquote <macros>`
  63. :ref:`unquoteDecl <unquoting-declarations>`
  64. :ref:`unquoteDef <unquoting-declarations>`
  65. :ref:`variable <generalization-of-declared-variables>`
  66. ``where``
  67. ``with``
  68. The ``Set`` keyword can appear with a natural number suffix, optionally
  69. subscripted (see :ref:`sort-system`). For instance ``Set42`` and
  70. ``Set₄₂`` are both keywords.
  71. keywords in import directives
  72. The following words are only reserved in import directives
  73. (in connection with ``import`` or ``open``):
  74. ``public``
  75. ``using``
  76. ``hiding``
  77. ``renaming``
  78. ``to``
  79. .. _names:
  80. Names
  81. ~~~~~
  82. A `qualified name` is a non-empty sequence of `names` separated by
  83. dots (``.``). A `name` is an alternating sequence of `name parts` and
  84. underscores (``_``), containing at least one name part. A `name part`
  85. is a non-empty sequence of unicode characters, excluding whitespace,
  86. ``_``, and :ref:`special symbols <keywords-and-special-symbols>`. A
  87. name part cannot be one of the
  88. :ref:`keywords <keywords-and-special-symbols>` above, and cannot start
  89. with a single quote, ``'`` (which are used for character literals, see
  90. Literals_ below).
  91. Examples
  92. - Valid: ``data?``, ``::``, ``if_then_else_``, ``0b``, ``_⊢_∈_``, ``x=y``
  93. - Invalid: ``data_?``, ``foo__bar``, ``_``, ``a;b``, ``[_.._]``
  94. The underscores in a name indicate where the arguments go when the name is used
  95. as an operator. For instance, the application ``_+_ 1 2`` can be written as ``1
  96. + 2``. See :ref:`mixfix-operators` for more information. Since most sequences
  97. of characters are valid names, whitespace is more important than in other
  98. languages. In the example above the whitespace around ``+`` is required, since
  99. ``1+2`` is a valid name.
  100. Qualified names are used to refer to entities defined in other modules. For
  101. instance ``Prelude.Bool.true`` refers to the name ``true`` defined in the
  102. module ``Prelude.Bool``. See :ref:`module-system` for more information.
  103. .. _lexical-structure-literals:
  104. Literals
  105. ~~~~~~~~
  106. There are four types of literal values: integers, floats, characters, and
  107. strings. See :ref:`built-ins` for the corresponding types, and
  108. :ref:`literal-overloading` for how to support literals for user-defined types.
  109. .. _lexical-structure-int-literals:
  110. Integers
  111. Integer values in decimal, hexadecimal (prefixed by ``0x``), or binary
  112. (prefixed by ``0b``) notation. The character `_` can be used to separate
  113. groups of digits. Non-negative numbers map by default to :ref:`built-in
  114. natural numbers <built-in-nat>`, but can be overloaded. Negative numbers have
  115. no default interpretation and can only be used through :ref:`overloading
  116. <literal-overloading>`.
  117. Examples: ``123``, ``0xF0F080``, ``-42``, ``-0xF``, ``0b11001001``,
  118. ``1_000_000_000``, ``0b01001000_01001001``.
  119. .. _lexical-structure-float-literals:
  120. Floats
  121. Floating point numbers in the standard notation (with square brackets
  122. denoting optional parts):
  123. .. code-block:: none
  124. float ::= [-] decimal . decimal [exponent]
  125. | [-] decimal exponent
  126. exponent ::= (e | E) [+ | -] decimal
  127. These map to :ref:`built-in floats <built-in-float>` and cannot be overloaded.
  128. Examples: ``1.0``, ``-5.0e+12``, ``1.01e-16``, ``4.2E9``, ``50e3``.
  129. .. _characters:
  130. .. _lexical-structure-char-literals:
  131. Characters
  132. Character literals are enclosed in single quotes (``'``). They can be a
  133. single (unicode) character, other than ``'`` or ``\``, or an escaped
  134. character. Escaped characters start with a backslash ``\`` followed by an
  135. escape code. Escape codes are natural numbers in decimal or hexadecimal
  136. (prefixed by ``x``) between ``0`` and ``0x10ffff`` (``1114111``), or one of
  137. the following special escape codes:
  138. ======== ======== ======== ======== ======== ======== ======== ========
  139. Code ASCII Code ASCII Code ASCII Code ASCII
  140. ======== ======== ======== ======== ======== ======== ======== ========
  141. ``a`` 7 ``b`` 8 ``t`` 9 ``n`` 10
  142. ``v`` 11 ``f`` 12 ``\`` ``\`` ``'`` ``'``
  143. ``"`` ``"`` ``NUL`` 0 ``SOH`` 1 ``STX`` 2
  144. ``ETX`` 3 ``EOT`` 4 ``ENQ`` 5 ``ACK`` 6
  145. ``BEL`` 7 ``BS`` 8 ``HT`` 9 ``LF`` 10
  146. ``VT`` 11 ``FF`` 12 ``CR`` 13 ``SO`` 14
  147. ``SI`` 15 ``DLE`` 16 ``DC1`` 17 ``DC2`` 18
  148. ``DC3`` 19 ``DC4`` 20 ``NAK`` 21 ``SYN`` 22
  149. ``ETB`` 23 ``CAN`` 24 ``EM`` 25 ``SUB`` 26
  150. ``ESC`` 27 ``FS`` 28 ``GS`` 29 ``RS`` 30
  151. ``US`` 31 ``SP`` 32 ``DEL`` 127
  152. ======== ======== ======== ======== ======== ======== ======== ========
  153. Character literals map to the :ref:`built-in character type <built-in-char>` and
  154. cannot be overloaded.
  155. Examples: ``'A'``, ``'∀'``, ``'\x2200'``, ``'\ESC'``, ``'\32'``, ``'\n'``,
  156. ``'\''``, ``'"'``.
  157. .. _lexical-structure-string-literals:
  158. Strings
  159. String literals are sequences of, possibly escaped, characters
  160. enclosed in double quotes ``"``. They follow the same rules as
  161. :ref:`character literals <characters>` except that double quotes
  162. ``"`` need to be escaped rather than single quotes ``'``. String
  163. literals map to the :ref:`built-in string type <built-in-string>` by
  164. default, but can be :ref:`overloaded <overloaded-strings>`.
  165. Example: ``"Привет \"мир\"\n"``.
  166. Holes
  167. ~~~~~
  168. Holes are an integral part of the interactive development supported by the
  169. :any:`Emacs mode <emacs-mode>`. Any text enclosed in ``{!`` and ``!}`` is a
  170. hole and may contain nested holes. A hole with no contents can be written
  171. ``?``. There are a number of Emacs commands that operate on the contents of a
  172. hole. The type checker ignores the contents of a hole and treats it as an
  173. unknown (see :ref:`implicit-arguments`).
  174. Example: ``{! f {!x!} 5 !}``
  175. Comments
  176. ~~~~~~~~
  177. Single-line comments are written with a double dash ``--`` followed by
  178. arbitrary text. Multi-line comments are enclosed in ``{-`` and ``-}``
  179. and can be nested. Comments cannot appear in :ref:`string
  180. literals <lexical-structure-string-literals>`.
  181. Example::
  182. {- Here is a {- nested -}
  183. comment -}
  184. s : String --line comment {-
  185. s = "{- not a comment -}"
  186. Pragmas
  187. ~~~~~~~
  188. Pragmas are special comments enclosed in ``{-#`` and ``#-}`` that have special
  189. meaning to the system. See :ref:`pragmas` for a full list of pragmas.
  190. .. _lexical-structure-layout:
  191. Layout
  192. ------
  193. Agda is layout sensitive using similar rules as Haskell, with the exception
  194. that layout is mandatory: you cannot use explicit ``{``, ``}`` and ``;`` to
  195. avoid it.
  196. A layout block contains a sequence of `statements` and is started by one of the
  197. layout keywords:
  198. .. code-block:: none
  199. abstract
  200. constructor
  201. do
  202. field
  203. instance
  204. let
  205. macro
  206. mutual
  207. postulate
  208. primitive
  209. private
  210. variable
  211. where
  212. The first token after the layout keyword decides the indentation of the block.
  213. Any token indented more than this is part of the previous statement, a token at
  214. the same level starts a new statement, and a token indented less lies outside
  215. the block.
  216. ::
  217. data Nat : Set where -- starts a layout block
  218. -- comments are not tokens
  219. zero : Nat -- statement 1
  220. suc : Nat → -- statement 2
  221. Nat -- also statement 2
  222. one : Nat -- outside the layout block
  223. one = suc zero
  224. Note that the indentation of the layout keyword does not matter.
  225. If several layout blocks are started by layout keywords without line
  226. break in between (where line breaks inside block comments do not
  227. count), then those blocks indented *more* than the last block go
  228. passive, meaning they cannot be further extended by new statements::
  229. private module M where postulate
  230. A : Set -- module-block goes passive
  231. B : Set -- postulate-block can still be extended
  232. module N where -- private-block can still be extended
  233. An Agda file contains one top-level layout block, with the special rule that
  234. the contents of the top-level module need not be indented.
  235. ::
  236. module Example where
  237. NotIndented : Set₁
  238. NotIndented = Set
  239. Literate Agda
  240. -------------
  241. Agda supports `literate programming <literate_>`_ with multiple typesetting
  242. tools like LaTeX, Markdown and reStructuredText. For instance, with LaTeX,
  243. everything in a file is a comment unless enclosed in ``\begin{code}``,
  244. ``\end{code}``. Literate Agda files have special file extensions, like
  245. ``.lagda`` and ``.lagda.tex`` for LaTeX, ``.lagda.md`` for Markdown,
  246. ``.lagda.rst`` for reStructuredText instead of ``.agda``. The main use case
  247. for literate Agda is to generate LaTeX documents from Agda code. See
  248. :any:`generating-html` and :any:`generating-latex` for more information.
  249. .. code-block:: latex
  250. \documentclass{article}
  251. % some preamble stuff
  252. \begin{document}
  253. Introduction usually goes here
  254. \begin{code}
  255. module MyPaper where
  256. open import Prelude
  257. five : Nat
  258. five = 2 + 3
  259. \end{code}
  260. Now, conclusions!
  261. \end{document}
  262. .. _literate: https://en.wikipedia.org/wiki/Literate_programming