snusp-1.0-spec-wd1.tex 18 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554
  1. \documentclass[a4paper]{article}
  2. \makeatletter
  3. %\newenvironment{dashdescription} {\list{}{\labelwidth\z@
  4. %\itemindent-\leftmargin \let\makelabel\dashdescriptionlabel}} {\endlist}
  5. %
  6. %\newcommand*\dashdescriptionlabel[1]{\hspace\labelsep \normalfont\bfseries
  7. %#1---}
  8. %
  9. %\makebox[\textwidth]{\hrulefill}
  10. \newcommand\comment[2]{\begin{description} \item[#1] #2 \end{description}}
  11. \newcommand\rationale[1]{\comment{Rationale:}{#1}}
  12. \newcommand\note[1]{\comment{Note:}{#1}}
  13. \newcommand\issue[1]{{\comment{\textit{Issue:}}{\it #1}}}
  14. \newcommand\todo[1]{{\comment{\textit{Todo:}}{\it #1}}}
  15. \makeatother
  16. \title{\textsc{SNUSP} 1.0 Language Specification\\Working Draft 1}
  17. \author{Daniel Brockman}
  18. \begin{document}
  19. \maketitle
  20. \tableofcontents
  21. %=============================================================================
  22. \section{Introduction}
  23. The \textsc{SNUSP} language was created in September, 2003 to develop a
  24. complete and utter fucking waste of time. (The name \textsc{SNUSP} is a
  25. recursive acronym for ``\textsc{SNUSP}'s Not \textsc{Unix}, but Structured
  26. \textsc{Path}.'') We are currently evaluating the possibilities of developing
  27. a \textsc{SNUSP} operating system kernel. However, variants of the
  28. \textsc{SNUSP} system, which use the \textsc{Linux} kernel, are already in
  29. use; though these systems are often referred to as ``\textsc{Linux},'' they
  30. are more accurately called ``\textsc{SNUSP}/\textsc{Linux} systems.''
  31. \issue{This is not even funny. How do you write an introduction to something
  32. like this?}
  33. \subsection{History}
  34. One rainy night in August, 2003, Francis Rogers was sitting in his apartment
  35. in [where his lives] experimenting with \textsc{C}. Inspired by the
  36. remarkably beautiful and symmetrical eight-instruction classic
  37. \textsc{Brainfuck}, as well as the crazy multi-dimensional stack-shuffling
  38. language \textsc{Befunge}, he was writing an interpreter for a language he
  39. would later come to call \textsc{Path}. Borrowing the basic instructions and
  40. linear memory model from \textsc{Brainfuck}, and the two-dimensional code
  41. space from \textsc{Befunge}, he created a language both simple to understand
  42. and simple to use. Once Rogers realized that he had created something
  43. interesting---that is, once he got the interpreter to run a
  44. spectacular bell-emmitting program---he immediately posted the source code and
  45. a quick rundown on the language to the Something Awful
  46. Forums\footnote{\texttt{<http://forums.somethingawful.com/>}} for peer review.
  47. \textsc{Path} was highly appreciated as a respectable middle-ground by
  48. everyone who adored \textsc{Brainfuck} but was scared by \textsc{Befunge} (or
  49. vice versa), and a few who seemed new to programming but decided to pick up
  50. \textsc{Path} because it looked so cute. Not very surprisingly, everyone else
  51. thought the language looked horribly obfuscated, and immediately started to
  52. question the sanity of everyone involved with its development. Nevertheless,
  53. several \textsc{Path} tools created by enthusiasts popped up over the course
  54. of a week: interpreters written in \textsc{C}, \textsc{C++}, \textsc{Perl},
  55. and \textsc{Java}; debuggers for \textsc{Tk}, \textsc{Windows}, and
  56. \textsc{Swing}; and a simple web-based interpreter interface written in
  57. \textsc{PHP}.
  58. The original \textsc{Path} was not perfect, however, and as more suggestions
  59. for improving \textsc{Path} were made and implemented by the interpreter
  60. writers, the language borders started to blur. Every \textsc{Path} coder had
  61. his own flavour---\textsc{SNUSP} was one of the more well-defined ones---but
  62. noone could say what \textsc{Path} really was anymore. To sort out this mess,
  63. Rogers announced that he wished to keep the name ``\textsc{Path}'' for his
  64. original version of the language, and asked everybody who wanted changes to
  65. fork off under a new name (no pun intended). This opened the door for the
  66. \textsc{SNUSP} project to begin serious work on defining a completely
  67. independent new language derived from traditional \textsc{Path}.
  68. \subsection{Goals}
  69. The \textsc{SNUSP} language, with its roots in \textsc{Path}, is intended to
  70. be an \ae sthetically pleasing, modular language with an orthogonal
  71. instruction set and a bright future. This specification defines three
  72. increasingly sophisticated levels of the \textsc{SNUSP} language:
  73. \begin{description}
  74. \item[\textsc{Core SNUSP}] is---like traditional \textsc{Path}---essentially a
  75. modification of \textsc{Brainfuck} to use a two-di\-men\-sion\-al code space;
  76. \item[\textsc{Modular SNUSP}] is an extension of \textsc{Core SNUSP}, adding a
  77. subroutine mechanism; finally,
  78. \item[\textsc{Bloated SNUSP}] is an extension of \textsc{Modular SNUSP},
  79. adding support for indeterminism, concurrency, and a second data memory
  80. dimension.
  81. \end{description}
  82. The first and second levels are theoretically complete; it it unlikely that
  83. future versions of this specification will alter them. The third level, on
  84. the other hand, is specifically designated for new features---particularly
  85. ones that add bloat.
  86. Plans exist on developing a standard library in \textsc{Modular SNUSP}, with
  87. the goal of increasing the viability of \textsc{SNUSP} as a development
  88. platform for mission-critical applications. It will factor out certain basic
  89. building blocks and provide subroutines for mathematical functions, string
  90. manipulation, etc.
  91. %=============================================================================
  92. \section{Memory}
  93. There are three kinds of run-time memory in \textsc{SNUSP}:
  94. \begin{description}
  95. \item[code space] contains run-time representations of program source;
  96. \item[data memory] (or simply ``memory'') contains integers that are
  97. accessed and modified by \textsc{SNUSP} programs when carrying out their task;
  98. finally,
  99. \item[the call stack] (used in \textsc{Modular SNUSP}) is, in familiar terms,
  100. a FILO queue storing the return addresses of subroutine calls, i.e.,
  101. \textbf{enter} instructions.
  102. \end{description}
  103. \subsection{Memory Units}
  104. Code space and data memory are both two-dimensional and made up of units
  105. called, respectively, \emph{code cells} and \emph{data cells}. The call stack
  106. is one-dimensional and made up of \emph{stack frames}.
  107. \note{The second data memory dimension can be exploited only by programs
  108. written in \textsc{Bloated SNUSP}. In lower levels of \textsc{SNUSP}, data
  109. memory is effectively one-dimensional, since the data pointer can only move in
  110. two opposite directions---\textbf{left} and \textbf{right}.}
  111. \note{The term ``stack frame'' normally refers to both the return address and
  112. the local data of a subroutine. However, in \textsc{SNUSP} there is no such
  113. thing as ``local data,'' and return addresses are completely separated from
  114. data memory. As a practical convention, most subroutines guarantee the
  115. invariance of previous memory; but since the language does not actually define
  116. subroutines, there is nothing to enforce this.}
  117. \subsection{Accessibility}
  118. Unlike in \textsc{Befunge}, code space is completely inaccessible for
  119. inspection or change by \textsc{SNUSP} programs; it is only used internally by
  120. the interpreter. Thus, once the interpreter has loaded a program, code space
  121. does not change until another program is loaded.
  122. Data memory, on the other hand, is completely accessible to \textsc{SNUSP}
  123. programs as mutable working storage---just like in \textsc{Brainfuck}.
  124. The call stack is accessible to \textsc{SNUSP} programs as a side-effect of
  125. the \textbf{enter} and \textbf{leave} instructions. However, it cannot be
  126. randomly accessed.
  127. \subsection{Limitations}
  128. The following limitations apply to the three memory sections:
  129. \begin{itemize}
  130. \item Code space is bounded in all directions, and it is impossible for the
  131. instruction pointer to point outside it.
  132. \item Data memory can grow as large as physical memory restrictions allow it
  133. to. However, it is bounded in both dimensions: If at any point the number of
  134. times the data pointer has been moved to the left exceeds the number of times
  135. it has been moved to the right, the resulting behavior is undefined. The
  136. equivalent is true for the orthogonal dimension: The number of moves upwards
  137. must not exceed the number of moves downwards.
  138. \rationale{This does not practically impose a limit on normal \textsc{SNUSP}
  139. programs, but simplifies the implementation of interpreters.}
  140. \issue{This is the most obvious irregulatity that I know about in the SNUSP
  141. language. Should we define what happens if the data pointer falls off? We
  142. have three choices:
  143. \begin{itemize}
  144. \item Leave it undefined. This leaves a hole in the language, but maybe this
  145. is the way it should be.
  146. \item Define the behavior. Terminating the process seems to be the only
  147. reasonable choice here, but it is not elegant.
  148. \item Remove the boundaries altogether, eliminating the issue. This seems
  149. to be the most elegant solution. Can you live with this, interpreter writers?
  150. \end{itemize}}
  151. \item The call stack is unbounded and can grow as high as physical memory
  152. limitations allow it to.
  153. \end{itemize}
  154. %=============================================================================
  155. \section{Syntax}
  156. \textsc{SNUSP} source files are read and transplanted into code space one line
  157. at a time. A conforming \textsc{SNUSP} interpreter is required to recognize
  158. all of the following character sequences as end-of-line indicators:
  159. \begin{itemize}
  160. \item carriage return (13), line feed (10)
  161. \item carriage return (13)
  162. \item line feed (10)
  163. \end{itemize}
  164. Further, when loading a source file, conforming interpreters must behave as if
  165. all lines were padded to the right with spaces (32), so as to make all lines
  166. equally long.
  167. \subsection{Instruction Characters}
  168. When each line is read into code memory from the source file, the source
  169. characters are translated to instructions according to the following table:
  170. \begin{center}\begin{tabular}{|ccc|}
  171. \hline
  172. \textsc{ASCII} & Glyph & Instruction \\
  173. \hline \hline
  174. \multicolumn{3}{|c|}{\textsc{Bloated SNUSP}} \\
  175. 37 & \verb"%" & \textbf{rand} \\
  176. 38 & \verb"&" & \textbf{split} \\
  177. 59 & \verb";" & \textbf{down} \\
  178. 58 & \verb":" & \textbf{up} \\
  179. \hline
  180. \multicolumn{3}{|c|}{\textsc{Modular SNUSP}} \\
  181. 64 & \verb"@" & \textbf{enter} \\
  182. 35 & \verb"#" & \textbf{leave} \\
  183. \hline
  184. \multicolumn{3}{|c|}{\textsc{Core SNUSP}} \\
  185. 62 & \verb">" & \textbf{right} \\
  186. 60 & \verb"<" & \textbf{left} \\
  187. 43 & \verb"+" & \textbf{incr} \\
  188. 45 & \verb"-" & \textbf{decr} \\
  189. 44 & \verb"," & \textbf{read} \\
  190. 46 & \verb"." & \textbf{write} \\
  191. 47 & \verb"/" & \textbf{ruld} \\
  192. 92 & \verb"\" & \textbf{lurd} \\
  193. 33 & \verb"!" & \textbf{skip} \\
  194. 63 & \verb"?" & \textbf{skipz} \\
  195. \hline
  196. 32 & \verb" " & \textbf{noop} \\
  197. 61 & \verb"=" & \textbf{noop} \\
  198. 124 & \verb"|" & \textbf{noop} \\
  199. \hline
  200. \end{tabular}\end{center}
  201. All other characters translate to \textbf{noop} instructions.
  202. \subsection{The Starting Indicator}
  203. The \emph{starting indicator} tells the interpreter where to begin execution.
  204. If the source file contains any dollar signs (36), the first one to appear is
  205. the starting indicator; otherwise, the first character---whatever it may
  206. be---is the starting indicator.
  207. %=============================================================================
  208. \section{Execution}
  209. A \textsc{SNUSP} program may be executed indirectly through an interpreter, or
  210. directly as a stand-alone process with a built-in interpreter. In any case,
  211. when a \textsc{SNUSP} program is invoked, there is no way to pass arguments to
  212. it; the only way to give it input it is through the standard input stream.
  213. The program, however, can give output---apart from through the standard output
  214. stream---via the process exit code.
  215. \subsection{Variables}
  216. During execution three variables are used to keep track of the program state,
  217. apart from the various kinds of memory:
  218. \begin{description}
  219. \item[the instruction pointer] that points to an instruction in code space
  220. called the \emph{current instruction},
  221. \item[the data pointer] that points to a cell in data memory called the
  222. \emph{current data cell}, and
  223. \item[the current direction] that indicates direction in which the instruction
  224. pointer is moving.
  225. \end{description}
  226. \todo{Maybe add a section about threads here.}
  227. \subsection{Ticks and Turns}
  228. At the start of execution, a thread is created, its instruction pointer is set
  229. to point to the cell that contains the starting indicator, and its current
  230. direction is set to \textbf{right}. Its call stack starts out empty and the
  231. data memory originally contains nothing but zeroes.
  232. Execution of a \textsc{SNUSP} program is then carried out in small steps
  233. called \emph{ticks}. Each thread gets one \emph{turn} per tick, but the order
  234. in which the turns are taken is undefined. The thread that is currently
  235. taking its turn is called the \emph{active thread}. A turn proceeds as
  236. follows:
  237. \begin{enumerate}
  238. \item The current instruction is carried out.
  239. \item The instruction pointer is moved one step in the current direction
  240. unless this would cause the instruction pointer to point outside code space,
  241. in which case the active thread is \emph{stopped}.
  242. \end{enumerate}
  243. When a thread is stopped, all its resources are released and it ceases taking
  244. turns. When all threads are stopped, the process terminates with the exit
  245. code set to the value of the current memory cell of the last thread to take a
  246. turn.
  247. %=============================================================================
  248. \section{Instructions}
  249. All instructions in \textsc{SNUSP} are atomic, in the sense that there are no
  250. real syntactic or semantic restrictions on how they are to be combined. Some
  251. instructions access and/or mutate the current memory cell, but no other parts
  252. of data memory are ever touched.
  253. The \textbf{noop} instruction is special, as it actually denotes \emph{lack}
  254. of any instruction at all:
  255. \begin{description}
  256. \item[noop] (\verb" ", \verb"|", \verb"=") Do nothing.
  257. \end{description}
  258. \subsection{\textsc{Core SNUSP}}
  259. The first six instructions in this set---\textbf{left}, \textbf{right},
  260. \textbf{incr}, \textbf{decr}, \textbf{read}, and \textbf{write}---are
  261. identical to their \textsc{Brainfuck} counterparts. The remaining
  262. four---\textbf{ruld}, \textbf{lurd}, \textbf{skip}, and
  263. \textbf{skipz}---replace the pair of looping instructions found in
  264. \textsc{Brainfuck}---\verb"[" and \verb"]"---as general-purpose flow control
  265. instructions that can be combined to create loops and similar code structures.
  266. \begin{description}
  267. \item[left] (\verb">") Move the data pointer one cell to the left.
  268. \item[right] (\verb"<") Move the data pointer one cell to the right.
  269. \item[incr] (\verb"+") If the value of the current data cell is less than the
  270. maximum allowed value, increment it; otherwise, set it to zero.
  271. \item[decr] (\verb"-") If the value of the current data cell is greater than
  272. zero, decrement it; otherwise, set it to the maximum allowed value.
  273. \item[read] (\verb",") Read a byte from standard input and put it in the
  274. current data cell. If the input stream is exhausted, block until more data
  275. becomes available.
  276. \item[write] (\verb".") If the value of the current data cell is
  277. representable by a single byte, write this byte to standard output.
  278. Otherwise, the behavior is implementation-defined.
  279. \issue{Ruling run-time errors out, there are a number of different methods for
  280. squeezing a 32-bit value into a byte: \begin{itemize} \item doing it modulo
  281. the maximum value, \item outputting zero, and \item outputting the maximum
  282. value. \end{itemize} Should we choose one of these?}
  283. \item[ruld] (\verb"\") If the current direction is \begin{itemize} \item
  284. \textbf{left}, change it to \textbf{up} \item \textbf{right}, change it to
  285. \textbf{down}, \end{itemize} and vice versa. (Mnemonic:
  286. right$\Longleftrightarrow$up, left$\Longleftrightarrow$down)
  287. \item[lurd] (\verb"/") If the current direction is \begin{itemize} \item
  288. \textbf{left}, change it to \textbf{up} \item \textbf{right}, change it to
  289. \textbf{down}, \end{itemize} and vice versa. (Mnemonic:
  290. left$\Longleftrightarrow$up, right$\Longleftrightarrow$down)
  291. \item[skip] (\verb"!") Move the instruction pointer one step in the current
  292. direction.
  293. \item[skipz] (\verb"?") If the value of the current data cell is zero, move the
  294. instruction pointer one step in the current direction; otherwise, do nothing.
  295. \end{description}
  296. \subsection{\textsc{Modular SNUSP}}
  297. This level adds two additional instructions, which provide the means for
  298. implementing subroutines in \textsc{SNUSP}.
  299. \begin{description}
  300. \item[enter] (\verb"@") Push the current direction and instruction pointer to
  301. the call stack.
  302. \item[leave] (\verb"#") If the call stack is empty, stop the active thread;
  303. otherwise, pop the topmost stackframe, set the current direction and
  304. instruction pointer to the values recieved from the stack, and move the
  305. instruction pointer one step in the current direction.
  306. \end{description}
  307. The following example demonstrates how to implement a subroutine called
  308. \verb"ECHO", using the \textbf{enter} and \textbf{leave} instructions, and how
  309. to call it twice from the main program execution path:
  310. \begin{verbatim}
  311. /==!/======ECHO==,==.==#
  312. | |
  313. $==>==@/==@/==<==#
  314. \end{verbatim}
  315. \subsection{\textsc{Bloated SNUSP}}
  316. This level adds four new instructions, for a grand total of sixteen
  317. \textsc{SNUSP} instructions. The first two simply add ways of moving through
  318. the second data memory dimension; this is particularly useful in the context
  319. of concurrency, which is provided by another instruction for starting new
  320. threads. The last instruction provides a way to obtain random numbers in
  321. arbitrary ranges.
  322. \begin{description}
  323. \item[up] (\verb":") Move the data pointer one cell upwards.
  324. \item[down] (\verb";") Move the data pointer one cell downwards.
  325. \item[split] (\verb"&") Create a new thread, and move the instruction pointer
  326. of the old thread one step in the current direction.
  327. \item[rand] (\verb"%") Set the value of the current data cell to a random
  328. number between zero and the current value of the cell, inclusive.
  329. \end{description}
  330. All threads share a single code space and a single data memory; however, each
  331. thread has its own instruction pointer, direction, memory pointer, and call
  332. stack. Upon thread creation, the instruction pointer, direction, and memory
  333. pointer is copied from the creating thread; the call stack, on the other hand,
  334. is created empty.
  335. \todo{Some or all of the above should be moved.}
  336. The following example demonstrates how to print ``\verb"!"'' until a key is
  337. pressed, using two concurrent threads:
  338. \begin{verbatim}
  339. /==.==<==\
  340. | |
  341. /+++++++++++==&\==>===?!/==<<==#
  342. \+++++++++++\ |
  343. $==>==+++++++++++/ \==>==,==#
  344. \end{verbatim}
  345. \end{document}