\documentclass[a4paper]{article}

\makeatletter

%\newenvironment{dashdescription} {\list{}{\labelwidth\z@
%\itemindent-\leftmargin \let\makelabel\dashdescriptionlabel}} {\endlist}
%
%\newcommand*\dashdescriptionlabel[1]{\hspace\labelsep \normalfont\bfseries
%#1---}
%
%\makebox[\textwidth]{\hrulefill}

\newcommand\comment[2]{\begin{description} \item[#1] #2 \end{description}}

\newcommand\rationale[1]{\comment{Rationale:}{#1}}

\newcommand\note[1]{\comment{Note:}{#1}}

\newcommand\issue[1]{{\comment{\textit{Issue:}}{\it #1}}}

\newcommand\todo[1]{{\comment{\textit{Todo:}}{\it #1}}}

\makeatother

\title{\textsc{SNUSP} 1.0 Language Specification\\Working Draft 1}

\author{Daniel Brockman}

\begin{document}

\maketitle

\tableofcontents


%=============================================================================


\section{Introduction}

The \textsc{SNUSP} language was created in September, 2003 to develop a
complete and utter fucking waste of time.  (The name \textsc{SNUSP} is a
recursive acronym for ``\textsc{SNUSP}'s Not \textsc{Unix}, but Structured
\textsc{Path}.'')  We are currently evaluating the possibilities of developing
a \textsc{SNUSP} operating system kernel.  However, variants of the
\textsc{SNUSP} system, which use the \textsc{Linux} kernel, are already in
use; though these systems are often referred to as ``\textsc{Linux},'' they
are more accurately called ``\textsc{SNUSP}/\textsc{Linux} systems.''

\issue{This is not even funny.  How do you write an introduction to something
like this?}


\subsection{History}

One rainy night in August, 2003, Francis Rogers was sitting in his apartment
in [where he lives] experimenting with \textsc{C}.  Inspired by the
remarkably beautiful and symmetrical eight-instruction classic
\textsc{Brainfuck}, as well as the crazy multi-dimensional stack-shuffling
language \textsc{Befunge}, he was writing an interpreter for a language he
would later come to call \textsc{Path}.  Borrowing the basic instructions and
linear memory model from \textsc{Brainfuck}, and the two-dimensional code
space from \textsc{Befunge}, he created a language both simple to understand
and simple to use.  Once Rogers realized that he had created something
interesting---that is, once he got the interpreter to run a
spectacular bell-emitting program---he immediately posted the source code and
a quick rundown on the language to the Something Awful
Forums\footnote{\texttt{<http://forums.somethingawful.com/>}} for peer review.

\textsc{Path} was highly appreciated as a respectable middle-ground by
everyone who adored \textsc{Brainfuck} but was scared by \textsc{Befunge} (or
vice versa), and a few who seemed new to programming but decided to pick up
\textsc{Path} because it looked so cute.  Not very surprisingly, everyone else
thought the language looked horribly obfuscated, and immediately started to
question the sanity of everyone involved with its development.  Nevertheless,
several \textsc{Path} tools created by enthusiasts popped up over the course
of a week: interpreters written in \textsc{C}, \textsc{C++}, \textsc{Perl},
and \textsc{Java}; debuggers for \textsc{Tk}, \textsc{Windows}, and
\textsc{Swing}; and a simple web-based interpreter interface written in
\textsc{PHP}.

The original \textsc{Path} was not perfect, however, and as more suggestions
for improving \textsc{Path} were made and implemented by the interpreter
writers, the language borders started to blur.  Every \textsc{Path} coder had
his own flavour---\textsc{SNUSP} was one of the more well-defined ones---but
no one could say what \textsc{Path} really was anymore.  To sort out this mess,
Rogers announced that he wished to keep the name ``\textsc{Path}'' for his
original version of the language, and asked everybody who wanted changes to
fork off under a new name (no pun intended).  This opened the door for the
\textsc{SNUSP} project to begin serious work on defining a completely
independent new language derived from traditional \textsc{Path}.


\subsection{Goals}

The \textsc{SNUSP} language, with its roots in \textsc{Path}, is intended to
be an \ae sthetically pleasing, modular language with an orthogonal
instruction set and a bright future.  This specification defines three
increasingly sophisticated levels of the \textsc{SNUSP} language:

\begin{description}

\item[\textsc{Core SNUSP}] is---like traditional \textsc{Path}---essentially a
modification of \textsc{Brainfuck} to use a two-di\-men\-sion\-al code space;

\item[\textsc{Modular SNUSP}] is an extension of \textsc{Core SNUSP}, adding a
subroutine mechanism; finally,

\item[\textsc{Bloated SNUSP}] is an extension of \textsc{Modular SNUSP},
adding support for indeterminism, concurrency, and a second data memory
dimension.

\end{description}

The first and second levels are theoretically complete; it is unlikely that
future versions of this specification will alter them.  The third level, on
the other hand, is specifically designated for new features---particularly
ones that add bloat.

There are plans to develop a standard library in \textsc{Modular SNUSP}, with
the goal of increasing the viability of \textsc{SNUSP} as a development
platform for mission-critical applications.  It will factor out certain basic
building blocks and provide subroutines for mathematical functions, string
manipulation, etc.


%=============================================================================


\section{Memory}

There are three kinds of run-time memory in \textsc{SNUSP}:

\begin{description}

\item[code space] contains run-time representations of program source;

\item[data memory] (or simply ``memory'') contains integers that are
accessed and modified by \textsc{SNUSP} programs when carrying out their task;
finally,

\item[the call stack] (used in \textsc{Modular SNUSP}) is a last-in,
first-out stack storing the return addresses of subroutine calls, i.e., of
\textbf{enter} instructions.

\end{description}


\subsection{Memory Units}

Code space and data memory are both two-dimensional and made up of units
called, respectively, \emph{code cells} and \emph{data cells}.  The call stack
is one-dimensional and made up of \emph{stack frames}.

\note{The second data memory dimension can be exploited only by programs
written in \textsc{Bloated SNUSP}.  In lower levels of \textsc{SNUSP}, data
memory is effectively one-dimensional, since the data pointer can only move in
two opposite directions---\textbf{left} and \textbf{right}.}

\note{The term ``stack frame'' normally refers to both the return address and
the local data of a subroutine.  However, in \textsc{SNUSP} there is no such
thing as ``local data,'' and return addresses are completely separated from
data memory.  As a practical convention, most subroutines guarantee the
invariance of previous memory; but since the language does not actually define
subroutines, there is nothing to enforce this.}


\subsection{Accessibility}

Unlike in \textsc{Befunge}, code space is completely inaccessible for
inspection or change by \textsc{SNUSP} programs; it is only used internally by
the interpreter.  Thus, once the interpreter has loaded a program, code space
does not change until another program is loaded.

Data memory, on the other hand, is completely accessible to \textsc{SNUSP}
programs as mutable working storage---just like in \textsc{Brainfuck}.

The call stack is accessible to \textsc{SNUSP} programs as a side-effect of
the \textbf{enter} and \textbf{leave} instructions.  However, it cannot be
randomly accessed.


\subsection{Limitations}

The following limitations apply to the three memory sections:

\begin{itemize}

\item Code space is bounded in all directions, and it is impossible for the
instruction pointer to point outside it.

\item Data memory can grow as large as physical memory restrictions allow it
to.  However, it is bounded in both dimensions:  If at any point the number of
times the data pointer has been moved to the left exceeds the number of times
it has been moved to the right, the resulting behavior is undefined.  The
equivalent is true for the orthogonal dimension: The number of moves upwards
must not exceed the number of moves downwards.

\rationale{This does not practically impose a limit on normal \textsc{SNUSP}
programs, but simplifies the implementation of interpreters.}

\issue{This is the most obvious irregularity that I know about in the \textsc{SNUSP}
language.  Should we define what happens if the data pointer falls off?  We
have three choices:

\begin{itemize}

\item Leave it undefined.  This leaves a hole in the language, but maybe this
is the way it should be.

\item Define the behavior.  Terminating the process seems to be the only
reasonable choice here, but it is not elegant.

\item Remove the boundaries altogether, eliminating the issue.  This seems
to be the most elegant solution.  Can you live with this, interpreter writers?

\end{itemize}}

\item The call stack is unbounded and can grow as high as physical memory
limitations allow it to.

\end{itemize}
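
The data memory bounds above can be illustrated with a small Python sketch.
It is not part of the specification: the name \verb"move_data_pointer", the
\verb"(x, y)" coordinate pair, and the choice to raise an error are all
assumptions made for this example.  The specification leaves the out-of-bounds
case undefined, and raising is only one of the options discussed in the issue
above.

\begin{verbatim}
# Offsets applied to an (x, y) data pointer; x grows to the right,
# y grows downwards.  Both start at zero.
MOVES = {"right": (1, 0), "left": (-1, 0), "down": (0, 1), "up": (0, -1)}

def move_data_pointer(pointer, instruction):
    """Return the moved (x, y) pointer, or raise if it leaves data memory."""
    x = pointer[0] + MOVES[instruction][0]
    y = pointer[1] + MOVES[instruction][1]
    if x < 0 or y < 0:
        # More moves left than right, or more moves up than down.
        # Undefined by the specification; this sketch chooses to raise.
        raise IndexError("data pointer moved outside data memory")
    return x, y
\end{verbatim}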


%=============================================================================


\section{Syntax}

\textsc{SNUSP} source files are read and transplanted into code space one line
at a time.  A conforming \textsc{SNUSP} interpreter is required to recognize
all of the following character sequences as end-of-line indicators:

\begin{itemize}

\item carriage return (13), line feed (10)

\item carriage return (13)

\item line feed (10)

\end{itemize}

Further, when loading a source file, conforming interpreters must behave as if
all lines were padded to the right with spaces (32), so as to make all lines
equally long.
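
As a rough illustration of these loading rules, the following Python sketch
splits a source text on the three end-of-line sequences and pads every line
with spaces.  The function name \verb"load_source" and the list-of-strings
representation of code space are assumptions made for this example only.

\begin{verbatim}
import re

def load_source(text):
    """Split text into lines and pad them with spaces to equal length."""
    # CR LF is tried before a lone CR so that it counts as one line end.
    lines = re.split(r"\r\n|\r|\n", text)
    width = max(len(line) for line in lines)
    return [line.ljust(width) for line in lines]
\end{verbatim}

Python's own \verb"str.splitlines" is avoided here because it also breaks
lines on characters that this specification does not list.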


\subsection{Instruction Characters}

When each line is read into code space from the source file, the source
characters are translated to instructions according to the following table:

\begin{center}\begin{tabular}{|ccc|}

\hline

\textsc{ASCII} & Glyph & Instruction \\

\hline \hline

\multicolumn{3}{|c|}{\textsc{Bloated SNUSP}} \\

37 & \verb"%" & \textbf{rand} \\

38 & \verb"&" & \textbf{split} \\

59 & \verb";" & \textbf{down} \\

58 & \verb":" & \textbf{up} \\

\hline

\multicolumn{3}{|c|}{\textsc{Modular SNUSP}} \\

64 & \verb"@" & \textbf{enter} \\

35 & \verb"#" & \textbf{leave} \\

\hline

\multicolumn{3}{|c|}{\textsc{Core SNUSP}} \\

62 & \verb">" & \textbf{right} \\

60 & \verb"<" & \textbf{left} \\

43 & \verb"+" & \textbf{incr} \\

45 & \verb"-" & \textbf{decr} \\

44 & \verb"," & \textbf{read} \\

46 & \verb"." & \textbf{write} \\

47 & \verb"/" & \textbf{ruld} \\

92 & \verb"\" & \textbf{lurd} \\

33 & \verb"!" & \textbf{skip} \\

63 & \verb"?" & \textbf{skipz} \\

\hline

32 & \verb" " & \textbf{noop} \\

61 & \verb"=" & \textbf{noop} \\

124 & \verb"|" & \textbf{noop} \\

\hline

\end{tabular}\end{center}

All other characters translate to \textbf{noop} instructions.
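
The table can be written down as a lookup, as in the Python sketch below.
The names \verb"INSTRUCTIONS" and \verb"translate" are hypothetical; the only
content taken from the specification is the mapping itself and the rule that
every unlisted character is a \textbf{noop}.

\begin{verbatim}
INSTRUCTIONS = {
    "%": "rand", "&": "split", ";": "down", ":": "up",    # Bloated SNUSP
    "@": "enter", "#": "leave",                           # Modular SNUSP
    ">": "right", "<": "left", "+": "incr", "-": "decr",  # Core SNUSP
    ",": "read", ".": "write", "/": "ruld", "\\": "lurd",
    "!": "skip", "?": "skipz",
}

def translate(char):
    """Return the instruction name for one source character."""
    # Space, "=", "|" and every other unlisted character are noops.
    return INSTRUCTIONS.get(char, "noop")
\end{verbatim}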


\subsection{The Starting Indicator}

The \emph{starting indicator} tells the interpreter where to begin execution.
If the source file contains any dollar signs (36), the first one to appear is
the starting indicator; otherwise, the first character---whatever it may
be---is the starting indicator.
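
A minimal Python sketch of this rule follows.  It assumes that ``first to
appear'' means first in reading order (line by line, left to right) and that
code space is held as a list of strings; the name \verb"find_start" is
hypothetical.

\begin{verbatim}
def find_start(lines):
    """Return (row, column) of the starting indicator."""
    for row, line in enumerate(lines):
        column = line.find("$")
        if column != -1:
            return row, column
    # No dollar sign anywhere: the first character is the indicator.
    return 0, 0
\end{verbatim}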


%=============================================================================


\section{Execution}

A \textsc{SNUSP} program may be executed indirectly through an interpreter, or
directly as a stand-alone process with a built-in interpreter.  In any case,
when a \textsc{SNUSP} program is invoked, there is no way to pass arguments to
it; the only way to give it input is through the standard input stream.
The program, however, can give output---apart from through the standard output
stream---via the process exit code.


\subsection{Variables}

During execution three variables are used to keep track of the program state,
apart from the various kinds of memory:

\begin{description}

\item[the instruction pointer] that points to an instruction in code space
called the \emph{current instruction},

\item[the data pointer] that points to a cell in data memory called the
\emph{current data cell}, and

\item[the current direction] that indicates the direction in which the instruction
pointer is moving.

\end{description}


\todo{Maybe add a section about threads here.}


\subsection{Ticks and Turns}

At the start of execution, a thread is created, its instruction pointer is set
to point to the cell that contains the starting indicator, and its current
direction is set to \textbf{right}.  Its call stack starts out empty and the
data memory originally contains nothing but zeroes.

Execution of a \textsc{SNUSP} program is then carried out in small steps
called \emph{ticks}.  Each thread gets one \emph{turn} per tick, but the order
in which the turns are taken is undefined.  The thread that is currently
taking its turn is called the \emph{active thread}.  A turn proceeds as
follows:

\begin{enumerate}

\item The current instruction is carried out.

\item The instruction pointer is moved one step in the current direction
unless this would cause the instruction pointer to point outside code space,
in which case the active thread is \emph{stopped}.

\end{enumerate}

When a thread is stopped, all its resources are released and it ceases taking
turns.  When all threads are stopped, the process terminates with the exit
code set to the value of the current data cell of the last thread to take a
turn.
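
The second step of a turn can be sketched in Python as shown below.  The
sketch assumes that positions are \verb"(row, column)" pairs with rows
numbered from the top, and it signals a stopped thread by returning
\verb"None"; both conventions, like the name \verb"advance", are choices made
for this example and not part of the specification.

\begin{verbatim}
# Change in (row, column) for each direction; row 0 is the top line.
STEPS = {"right": (0, 1), "left": (0, -1), "up": (-1, 0), "down": (1, 0)}

def advance(position, direction, height, width):
    """Return the next position, or None if the thread must be stopped."""
    row = position[0] + STEPS[direction][0]
    column = position[1] + STEPS[direction][1]
    if 0 <= row < height and 0 <= column < width:
        return row, column
    return None  # the step would leave code space
\end{verbatim}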


%=============================================================================


\section{Instructions}

All instructions in \textsc{SNUSP} are atomic, in the sense that there are no
real syntactic or semantic restrictions on how they are to be combined.  Some
instructions access and/or mutate the current data cell, but no other parts
of data memory are ever touched.

The \textbf{noop} instruction is special, as it actually denotes \emph{lack}
of any instruction at all:

\begin{description}

\item[noop] (\verb" ", \verb"|", \verb"=") Do nothing.

\end{description}


\subsection{\textsc{Core SNUSP}}

The first six instructions in this set---\textbf{left}, \textbf{right},
\textbf{incr}, \textbf{decr}, \textbf{read}, and \textbf{write}---are
identical to their \textsc{Brainfuck} counterparts.  The remaining
four---\textbf{ruld}, \textbf{lurd}, \textbf{skip}, and
\textbf{skipz}---replace the pair of looping instructions found in
\textsc{Brainfuck}---\verb"[" and \verb"]"---as general-purpose flow control
instructions that can be combined to create loops and similar code structures.

\begin{description}

\item[left] (\verb"<") Move the data pointer one cell to the left.

\item[right] (\verb">") Move the data pointer one cell to the right.

\item[incr] (\verb"+") If the value of the current data cell is less than the
maximum allowed value, increment it; otherwise, set it to zero.

\item[decr] (\verb"-") If the value of the current data cell is greater than
zero, decrement it; otherwise, set it to the maximum allowed value.

\item[read] (\verb",") Read a byte from standard input and put it in the
current data cell.  If the input stream is exhausted, block until more data
becomes available.

\item[write] (\verb".") If the value of the current data cell is
representable by a single byte, write this byte to standard output.
Otherwise, the behavior is implementation-defined.

\issue{Ruling run-time errors out, there are a number of different methods for
squeezing a 32-bit value into a byte: \begin{itemize} \item doing it modulo
the maximum value, \item outputting zero, and \item outputting the maximum
value. \end{itemize}  Should we choose one of these?}

\item[ruld] (\verb"/") If the current direction is \begin{itemize} \item
\textbf{right}, change it to \textbf{up} \item \textbf{left}, change it to
\textbf{down}, \end{itemize} and vice versa.  (Mnemonic:
right$\Longleftrightarrow$up, left$\Longleftrightarrow$down)

\item[lurd] (\verb"\") If the current direction is \begin{itemize} \item
\textbf{left}, change it to \textbf{up} \item \textbf{right}, change it to
\textbf{down}, \end{itemize} and vice versa.  (Mnemonic:
left$\Longleftrightarrow$up, right$\Longleftrightarrow$down)

\item[skip] (\verb"!") Move the instruction pointer one step in the current
direction.

\item[skipz] (\verb"?") If the value of the current data cell is zero, move the
instruction pointer one step in the current direction; otherwise, do nothing.

\end{description}
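
The two mirror instructions and the wrapping arithmetic can be summarized in
a short Python sketch.  The direction tables follow the mnemonics given above.
The value of \verb"MAX_VALUE" is purely an assumption, since this
specification does not fix the maximum allowed value of a data cell.

\begin{verbatim}
# ruld: right <-> up, left <-> down.  lurd: left <-> up, right <-> down.
RULD = {"right": "up", "up": "right", "left": "down", "down": "left"}
LURD = {"left": "up", "up": "left", "right": "down", "down": "right"}

MAX_VALUE = 255  # assumption: not fixed by the specification

def incr(value):
    """Increment a cell value, wrapping from the maximum to zero."""
    return 0 if value >= MAX_VALUE else value + 1

def decr(value):
    """Decrement a cell value, wrapping from zero to the maximum."""
    return MAX_VALUE if value <= 0 else value - 1
\end{verbatim}

Each table is its own inverse, which is what ``and vice versa'' expresses:
applying the same mirror twice restores the original direction.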


\subsection{\textsc{Modular SNUSP}}

This level adds two additional instructions, which provide the means for
implementing subroutines in \textsc{SNUSP}.

\begin{description}

\item[enter] (\verb"@") Push the current direction and instruction pointer to
the call stack.

\item[leave] (\verb"#") If the call stack is empty, stop the active thread;
otherwise, pop the topmost stack frame, set the current direction and
instruction pointer to the values received from the stack, and move the
instruction pointer one step in the current direction.

\end{description}

The following example demonstrates how to implement a subroutine called
\verb"ECHO", using the \textbf{enter} and \textbf{leave} instructions, and how
to call it twice from the main program execution path:

\begin{verbatim}

       /==!/======ECHO==,==.==#
       |   |
$==>==@/==@/==<==#

\end{verbatim}
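
To make the control flow of this example easier to follow, here is a
single-threaded Python sketch of \textsc{Core} and \textsc{Modular SNUSP}
that can run it.  It is an illustration under several assumptions rather than
a reference implementation: cell values wrap at 255, input comes from a
string instead of standard input (so an exhausted input raises
\verb"IndexError" instead of blocking), data memory is a dictionary, and the
names \verb"run" and \verb"ECHO" are invented here.

\begin{verbatim}
STEPS = {"right": (0, 1), "left": (0, -1), "up": (-1, 0), "down": (1, 0)}
RULD = {"right": "up", "up": "right", "left": "down", "down": "left"}
LURD = {"left": "up", "up": "left", "right": "down", "down": "right"}

def run(lines, data_in):
    """Run one thread of Modular SNUSP; return what it wrote as a string."""
    width = max(len(line) for line in lines)
    grid = [line.ljust(width) for line in lines]      # pad with spaces
    # Start at the first "$" in reading order, else at the first character.
    row, col = next(((r, line.index("$")) for r, line in enumerate(grid)
                     if "$" in line), (0, 0))
    direction, pointer, memory = "right", 0, {}       # cells default to zero
    stack, output, pending = [], [], list(data_in)

    def ahead():
        """Position one step from (row, col) in the current direction."""
        return row + STEPS[direction][0], col + STEPS[direction][1]

    while 0 <= row < len(grid) and 0 <= col < width:
        ch = grid[row][col]
        if ch == ">":
            pointer += 1
        elif ch == "<":
            pointer -= 1
        elif ch == "+":
            memory[pointer] = (memory.get(pointer, 0) + 1) % 256
        elif ch == "-":
            memory[pointer] = (memory.get(pointer, 0) - 1) % 256
        elif ch == ",":
            memory[pointer] = ord(pending.pop(0))
        elif ch == ".":
            output.append(chr(memory.get(pointer, 0)))
        elif ch == "/":
            direction = RULD[direction]
        elif ch == "\\":
            direction = LURD[direction]
        elif ch == "!" or (ch == "?" and memory.get(pointer, 0) == 0):
            row, col = ahead()
        elif ch == "@":
            stack.append((direction, row, col))
        elif ch == "#":
            if not stack:
                break            # leave on an empty stack stops the thread
            direction, row, col = stack.pop()
            row, col = ahead()   # step past the enter instruction
        row, col = ahead()       # second half of every turn
    return "".join(output)

ECHO = [
    "       /==!/======ECHO==,==.==#",
    "       |   |",
    "$==>==@/==@/==<==#",
]
print(run(ECHO, "hi"))  # prints "hi"
\end{verbatim}

Note how \textbf{leave} lands on the cell after \textbf{enter} and the
ordinary end-of-turn step then moves past it; this is why the \verb"/"
following each \verb"@" is executed on the way into the subroutine but
skipped on return.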


\subsection{\textsc{Bloated SNUSP}}

This level adds four new instructions, for a grand total of sixteen
\textsc{SNUSP} instructions.  The first two simply add ways of moving through
the second data memory dimension; this is particularly useful in the context
of concurrency, which is provided by another instruction for starting new
threads.  The last instruction provides a way to obtain random numbers in
arbitrary ranges.

\begin{description}

\item[up] (\verb":") Move the data pointer one cell upwards.

\item[down] (\verb";") Move the data pointer one cell downwards.

\item[split] (\verb"&") Create a new thread, and move the instruction pointer
of the old thread one step in the current direction.

\item[rand] (\verb"%") Set the value of the current data cell to a random
number between zero and the current value of the cell, inclusive.

\end{description}
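
The inclusive range of \textbf{rand} can be illustrated with Python's
standard \verb"random.randint", which also includes both end points.  The
function name \verb"rand" is used here only for illustration, and the choice
of a uniform distribution is an assumption, as the specification does not
name one.

\begin{verbatim}
import random

def rand(value):
    """Return a random integer between 0 and value, both inclusive."""
    return random.randint(0, value)
\end{verbatim}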

All threads share a single code space and a single data memory; however, each
thread has its own instruction pointer, direction, data pointer, and call
stack.  Upon thread creation, the instruction pointer, direction, and data
pointer are copied from the creating thread; the call stack, on the other hand,
is created empty.
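
The per-thread state and the copy made on thread creation might be modelled
as in the following Python sketch.  The class \verb"Thread", its field names,
and the function \verb"split" are hypothetical.  The sketch covers only the
creation of the new thread, not the extra step taken by the old one.

\begin{verbatim}
from dataclasses import dataclass, field

@dataclass
class Thread:
    position: tuple        # instruction pointer as (row, column)
    direction: str         # "left", "right", "up" or "down"
    data_pointer: tuple    # position in two-dimensional data memory
    call_stack: list = field(default_factory=list)

def split(parent):
    """Return the new thread created by a split instruction."""
    # Pointers and direction are copied; the call stack starts out empty.
    return Thread(parent.position, parent.direction, parent.data_pointer)
\end{verbatim}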

\todo{Some or all of the above should be moved.}

The following example demonstrates how to print ``\verb"!"'' until a key is
pressed, using two concurrent threads:

\begin{verbatim}

                    /==.==<==\       
                    |        |       
     /+++++++++++==&\==>===?!/==<<==#
     \+++++++++++\  |                
$==>==+++++++++++/  \==>==,==#       

\end{verbatim}

\end{document}