
HIGHLY WORK IN PROGRESS

Comun Shell

version TODO, by drummyfish, released under CC0 1.0, public domain

This is a specification of comun shell, a simple universal abstract text-based computer interface for both humans and programs, made alongside the comun programming language but being a separate project, not part of the comun language itself.

Definitions

The following is a list of definitions to be used throughout this document:

  • CS: comun shell
  • text: sequence of ASCII characters
  • byte: value consisting of exactly 8 bits
  • shell program: program loaded under CS
  • numeral: non-empty string of decimal digits representing a number written in base 10, not starting with zero if it's more than one character long
  • <ARG_START>: : (colon, ASCII value 58)
  • <CMD_MAX_LEN>: 64
  • <CMD_START>: / (slash, ASCII value 47)
  • <CMD_START_SILENT>: \ (backslash, ASCII value 92)
  • <CMD_END>: <NEWLINE> (ASCII value 10)
  • <FILENAME_MAX_LEN>: 32
  • <EXTENSION_START_CHAR>: - (dash)
  • <SHELL_START_CHAR>: . (dot)
  • by default, initially: after CS startup
  • range from A to B: set of integers between A and B, including both A and B

Basics

CS mainly aims to:

  • Allow users an interactive, unified way to operate computers of various kinds (with focus on simpler ones), for example to inspect the computer hardware and capabilities, load and run different programs, inspect files on disk etc.
  • Serve as an I/O library/framework/platform for programs (both interpreted scripts and native, compiled programs), i.e. to offer programs API for interacting with peripherals of general types (displays, printers, ...) without having to know the exact details of the specific hardware model, to be able to make interactive programs, beyond mere number processors.
  • Be a basis of truly free, non-commercial, public domain, selfless technology.

For freedom simplicity is key, CS needs to be simple enough in order to make itself easily available on as many computers as possible, even for example on very weak ones that aren't capable of running a traditional operating system. While nowadays typical computing environments consist of very complex, separate parts such as an operating system (further subdivided into kernel, user space, drivers, modules etc.), shell, terminal emulator and I/O libraries (such as SDL, X11, OpenGL, SFML, ...), CS is all in one, greatly reducing the number of interfaces and simplifying computing. CS may be seen as a complete virtual computer which together with a programming language (such as comun) will provide a complete computing environment. CS communicates with humans and computers in practically the same way and makes very little distinction between the two.

CS does not include any scripting language and is not meant to be Turing complete; its main purpose is to be an abstraction layer. In this it differs from e.g. traditional Unix shells, but note that CS is designed so that it can very easily be extended and makes it relatively simple to create a more "user friendly" shell on top of it, for example as a simple shell program that will by default only hand its input to the underlying CS, but will in addition be able to detect and execute additional, more complex convenience commands.

Besides simplicity CS is supposed to be flexible, highly portable, extensible, platform independent and non-demanding on hardware. It is supposed to be manually usable even in its simplest form but its priority is NOT a comfortable use -- it will offer bare minimum functionality in form of basic built-in commands and leave comfortable features to be implemented as additional programs (in the "Unix" spirit). Most CS features, commands and devices are only optional so that a very minimal version of the shell can be made, even one that for example has no visual output (not even text display). Supported features can always be queried so that any program knows which features are available. CS only defines an interface and model of operation, implementation details are left to be chosen conveniently. It tries to make as few assumptions as possible so as to not discriminate against any platform -- it does for example NOT assume multitasking or existence of a file system. In environment with multitasking and interprocess communication CS can be a standalone program that will communicate with a program it is running e.g. through operating system pipes. In systems with network capabilities this can happen over a network interface. In environments without these capabilities, e.g. on bare metal platforms that will only run one program at a time, CS can behave like a library, i.e. it can be compiled into the program itself -- here, of course, only the capabilities that the program will need may be chosen to be compiled in, i.e. one may choose the specific features of the shell to compile into the program's internal shell (e.g. a program that will never utilize graphic screen may choose to compile in only a very bare version of CS that doesn't have graphics screen capability). Another possibility is for the shell to emulate or interpret programs that run inside it. 
An important thing is also to not burden programs that will run under CS, a programmer must not be penalized for using CS (he must not pay for what he doesn't use), i.e. non-CS programs don't have to be changed in any way in order for them to run under CS. Programs will only use standard input/output to communicate with CS, no extra communicating channels are imposed.

CS will however not aim for maximum generality; it will assume a model of what would nowadays be considered a rather simple computer -- for example it will mostly assume the CPU has just one core, that there is only one network interface and so on. This doesn't mean it's incompatible with more complex computers, just that it may not be able to utilize them to their fullest potential.

CS will also not aim for safety beyond basic protection of hardware from physical damage, things such as fool proof design or encryption will not be considered.

Reasonable effort will be made to at least allow emulating basic concepts of Unix philosophy, for example letting program input and output be redirected to files and devices, as much communication as possible will be done through text protocols etc.

CS furthermore ensures high backwards compatibility for plain comun programs, i.e. a normal comun program written without CS in mind will in most cases continue to work under CS.

CS is not tied to the comun language, just developed in close relation to it; it is possible to use CS with programs written in other languages than comun, the only requirement is that the program has standard input and output of 8 bit bytes.

Whenever anything is not specified or allows different interpretations, a good CS implementation should try to make the decision that's best aligned with the philosophy described above and try to stay consistent with other things defined in this document.

Shell Operation

The following is a diagram showing the basic CS components and their connections:

                ______________________________________
               |                                      |
       user ---->    .                 .        .    ----> console
 filesystem ---->    .   comun shell   .        .    ----> display
    network ---->    :                 :        :    ----> filesystem
    pointer ---->    |             OET |    OEB |    ----> audio
   keyboard ---->    |                 |        |    ----> network
        ... ---->    |    _____     ___|__   ___|__  ----> ...
               |     |   |answ.|   | text | | bin. |  |
 input devices |     |<--|buff.|<--| proc.| | proc.|  | output devices
               |  IE |   |_____|   |______| |______|  |
               |     |                ^         ^     |
               |     | OS             |         |     |
               |_____|_______         '--\    --'     |
            _________|_____  |            \           |
           |         IP    | | IS          o DS       |
           |  shell      OP------------>---'          |
           |  program      | |________________________|
           |_______________|

CS has a number of slots for input devices (on the left) and output devices (on the right). Not all of these devices have to be present, only the ones that will be mentioned as mandatory are required. The devices are abstract so as to make them independent of any specifics of physical devices -- interfaces and behavior of these abstract devices will follow later on.

There are three important data channels: IE (external input), to which answer buffer is connected and to which input devices may additionally connect as well; then there are OET (external text output) and OEB (external binary output) which get connected to various output devices. Connection of these channels to external devices is handled by CS. All IE, OET and OEB channels are each connected to exactly one external device at any time. OET and OEB may be connected to the same device. By default IE is connected to user device, OET to console device.

Reconnecting a channel from one device to another (which may also be the same device) always comes with CS ensuring the safety of doing so, i.e. performing appropriate actions needed for proper disconnection and connection of a channel to a device. If a command defines that a channel is disconnected from a device and connected to another device, the disconnect and connect is always performed, even if the devices in question are one and the same device.

The default behavior of CS is to read bytes from the user (from the user device, typically a keyboard); these go through the IE channel and the OS interface to the standard input of the shell program (if present), whose standard output then goes through the IS interface to the command processor (the text processor unit in the diagram), which by default resends the data to the OET channel, which by default is connected to the console device. If no shell program is loaded, the OS interface is simply connected through to IS. So in the most basic scenario the user inputs text lines which go through the data path and get printed to the console.

In this data stream CS furthermore detects and processes commands of a format specified further on, with the help of the text processor unit -- on detecting a command a special action may be performed (such as writing to a file, initializing display, reading pointer state, connecting a channel to a device etc.) and, if the command is defined to do so, a return string is generated and pushed into the answer buffer, from which it may later be read. In its normal state the command processor resends every byte it reads to OET; only while it is reading a silent command does it not resend them. Further details are given in the commands section.

The answer buffer unit holds answers (i.e. string return values) generated by commands. The size of this buffer must be such that it can hold any possible answer that may be generated. If this buffer holds any data, IE will take data from the buffer first, i.e. reading from the answer buffer takes precedence over reading data from the connected input device. Every answer must always be <CMD_END> terminated; the answer must contain no <CMD_END> characters except for the last one, which has to be <CMD_END>. This behavior ensures that a shell program is always able to tell whether it is reading from the answer buffer or not: issuing a command that by definition produces an answer causes the answer to appear on the shell program's input immediately, and after reading a <CMD_END> the program knows the answer has ended.
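The precedence rule for the answer buffer can be illustrated with a small Python model. This is only a sketch under assumptions: the class and method names are hypothetical, and returning None stands in for a reader that would actually be paused until data arrive.

```python
from collections import deque


class ExternalInput:
    """Toy model of the IE channel: answer buffer first, then the input device."""

    def __init__(self, device_bytes):
        self.device = deque(device_bytes)  # bytes waiting on the connected input device
        self.answer = deque()              # answer buffer, holds at most one answer

    def push_answer(self, text):
        # An answer may contain no <CMD_END> except the terminating one.
        if "\n" in text:
            raise ValueError("answer must not contain <CMD_END>")
        self.answer.clear()                # a new answer replaces an unread one
        self.answer.extend(text.encode("ascii") + b"\n")

    def read_byte(self):
        # The answer buffer takes precedence over the input device.
        if self.answer:
            return self.answer.popleft()
        if self.device:
            return self.device.popleft()
        return None                        # nothing available (a real reader would wait)
```

A program that has just issued an answer-producing command can therefore read bytes up to and including the first <CMD_END> and be sure they all belong to the answer.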

There is also binary processor through which data from IS will go if DS (output switch) is flipped (by default it is connected to text processor), which may happen in some situations. Binary processor serves to handle transfer of binary data directly to devices.

If a shell program is loaded, it serves as a filter; in this role the program may decide to take any amount of control, it may read or ignore the input, it may resend input to the output or discard it and generate its own output, insert its own commands etc.

A unit (either shell program or command processor) reading from OS when there are no bytes available for reading is paused and left waiting until the data become available.

Immediately after startup and initialization CS may print a message in the console device. This message may be specific to each implementation, it may for example inform the user that the shell has started, it may print its version, show a short guide on how to get help etc.

Immediately after startup CS may or may not have some shell program loaded.

Inside CS signals may be raised. A signal is a named global event that causes certain actions. Upon detecting a signal, CS must immediately stop transferring any data over IE, OET and OEB and will try to realize the action associated with the signal as early as possible while ensuring a reasonable level of safety of doing so (i.e. preventing corruption of disk etc.). The following signals and their numeric codes are defined:

  • 0: computer shutdown: Turns off the computer.
  • 1: computer restart: Restarts the computer (with physical shutdown).
  • 2: program unload: Unloads the shell program, connects OS to IS.
  • 3: I/O reset: Connects IE, OET and OEB to the devices to which they are connected by default (after CS startup).
  • 4: program restart: Performs the same action as program unload signal, then the same action as I/O reset signal, then loads and runs the unloaded program again.
  • 5: end of input: If any shell program is loaded, OS is disconnected from IP (i.e. the shell program will no longer receive any input and its query for end of input will be answered positively). Then IE will always be connected to user device.
  • 6: end of output: Connects OEB to the zero device, then flips DS to text processor.

CS maintains an internal status number which indicates errors that may occur, or the lack of them. The number always keeps the status of the latest performed command, or 0 if no command has been performed yet. In general value 0 signifies no error, other values mean an error; the type of the error is further specified by the value. Status number 1 signifies a general, further unspecified error (that fits in no other category); number 2 signifies an unknown or unsupported command name; number 3 signifies a bad command argument (e.g. no argument given to a command that expects one, wrong argument format, wrong or unsupported value passed etc.); number 4 signifies that the command (and its arguments) were correct but the action couldn't be performed for some reason (e.g. couldn't write to a file because there isn't enough space left etc.); the remaining values up to and including 9 are reserved and shouldn't be used. For more specific errors values 10 and above must be used and their construction should follow these rules: the lowest decimal digit shall signify the general category, as defined above, and digits towards the left will give more specific information. For example a status number of format XY3 will signify an error related to the arguments passed, where Y has a further meaning hinting at what exactly was wrong with the argument and X may give yet more specific information. This makes it possible to easily extract at least the general meaning of an error even if the significance of the specific code is not known, using only the modulo operator (the remainder after division by 10).
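As an illustration of the modulo rule, the following Python sketch extracts the general category from any status number. The function names and the description strings are made up for this example; only the numeric meanings come from the text above.

```python
# Descriptions paraphrase the categories defined above.
GENERAL_CATEGORIES = {
    0: "no error",
    1: "general error",
    2: "unknown or unsupported command",
    3: "bad argument",
    4: "action could not be performed",
}


def status_category(status):
    """Return the general category, i.e. the lowest decimal digit."""
    if status < 0:
        raise ValueError("status numbers are non-negative")
    return status % 10


def describe_status(status):
    category = status_category(status)
    if status != 0 and category == 0:
        # The text assigns no meaning to nonzero codes ending in 0 (10, 20, ...).
        return "unspecified"
    return GENERAL_CATEGORIES.get(category, "reserved")
```

For instance a hypothetical code 213 is reported as a bad argument even if the meaning of the digits 2 and 1 is unknown to the caller.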

Devices

This section describes external devices CS can interact with, in alphabetical order. It describes a model of how a device must behave in order to be compatible with CS. Unless mentioned otherwise, presence of any device is optional. Each device has in brackets assigned a letter that will associate it with commands.

Audio (a) is a device capable of producing sound. It is initially turned off. When it is turned on, it starts playing audio continuously using a circular buffer of values (whose format depends on specified audio mode) that has total capacity of N samples, this size may only be set when audio mode is initialized and then mustn't change until another such initialization.

Console (c) is an output device capable of displaying the tail of text stream which may contain line breaks. This device is mandatory but it may be an empty device, i.e. one that behaves like the zero device. It is always considered to be turned on, it may just be disconnected from OET/OEB or obscured (e.g. if it shares screen with the display device). Console displays fixed width characters, at any given time each printed row is capable of holding the same maximum number of characters. Console may be for example the computer screen, virtual window or physical printer. As an input it takes a stream of bytes but it will only display printable text characters, it will ignore any other bytes sent to it, also ignoring characters that serve for deleting or modifying already printed text (as printers cannot do this), tab character must be handled in the same way as space character. The console may limit the displayed output only to latest N characters or lines, but if possible the console should be capable of displaying at least a whole answer (the string returned via answer buffer) that any command can possibly produce.

Display (d) is a graphical device capable of displaying a grid of pixels. Display supports one or more modes plus a turned off state which we will consider to be a special kind of mode. Resolution of the pixel grid and possible pixel colors depend on the display mode set (described further on). Display has an internal memory holding the state of pixels it is displaying -- state of this memory is immediately reflected on the screen. The format of the data in this memory is determined by the display mode. When a display is initialized into some mode, the state of its pixel memory is undefined until any data is transferred to it. Display is initially turned off. Physically the display may (or may not) be the same display that is used for the console device -- if this is the case, then if display is activated, it may obscure the console output; then if the display is turned off again, console will be displayed again and its old content may or may not have been discarded by the activation of the display.

Filesystem (f) is an input/output device which stores files whose content usually (but not necessarily always) persists between device restarts. Physically it may be for example magnetic disk, optical disk, flash drive etc. Because file systems used in practice differ in many ways (file name and size constraints, permissions, allowed directory structure etc.), it is impossible to ensure absolute compatibility with every single such system; CS must either ensure mapping of the underlying file system to the file system model defined here (e.g. translating non-ASCII file names to pure ASCII names, filtering out incompatible files etc.) or reject the underlying system as incompatible and not support this device. The CS implementation may also choose to offer access to some hardware through files (in the "Unix" way), though this should always be carefully considered as hardware should rather be accessed through CS commands if possible. The CS file system model is as follows. There are two types of files: regular files and directories. Each regular file holds a finite number of bytes. Each directory holds finitely many named links to files; each link within a directory has a name which is a text not longer than <FILENAME_MAX_LEN> characters, consisting only of ASCII characters in range from 30 to 126 minus double quotes (ASCII value 34) and which is unique among all link names within the same directory. Each link also has a sequential number (starting with zero) within the directory that's determined by alphabetical ordering of the link names. At any given time CS is located in one directory called current directory; exactly one directory is the default directory which will be the current directory by default. If practically possible, each directory except the default directory should contain a link named .. that points to its parent directory, a directory that directly links to this directory and is one step closer to the default directory (but none of this is mandatory as we don't impose any concrete file system structure). From the point of view of CS the number, types and names of links in the current directory may only change if CS either renames a link in it, deletes it, creates a new one or issues a directory refresh command; i.e. even if other processes running in parallel alongside CS are changing the links, CS must keep a snapshot of the directory state taken at the time it first inspected the directory (on single process systems this may of course not be necessary).
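The constraints on link names and their numbering can be sketched in Python as follows. The function names are hypothetical, and two points are assumptions rather than part of the model: empty names are rejected, and "alphabetical ordering" is taken to mean ordering by ASCII value.

```python
FILENAME_MAX_LEN = 32  # <FILENAME_MAX_LEN>


def is_valid_link_name(name):
    """Check a link name against the length and character constraints."""
    if not 0 < len(name) <= FILENAME_MAX_LEN:  # assumption: empty names are invalid
        return False
    # Allowed: ASCII values 30 to 126, except the double quote (34).
    return all(30 <= ord(ch) <= 126 and ch != '"' for ch in name)


def number_links(names):
    """Map each link name of one directory to its sequential number."""
    if len(set(names)) != len(names):
        raise ValueError("link names must be unique within a directory")
    # Assumption: alphabetical ordering means ordering by ASCII value.
    return {name: index for index, name in enumerate(sorted(names))}
```

Because the numbering is derived purely from the names, the snapshot kept by CS is enough to keep sequential numbers stable until the directory is refreshed.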

GPIO (g, general purpose input/output) is an input/output device that can be used to communicate with external devices over a bus of certain bit width. The device consists of N bytes whose values are all 0 by default. Values of the bytes may be both changed and read either by CS commands or by a further unspecified external device connected to them.

Keyboard (k) is an input device that consists of buttons, each of which can be either pressed (value 1) or released (value 0) at any given time. Each key is identified by the ASCII character that's usually associated with that key, i.e. numeric key zero is associated with ASCII character 0, space key with the ASCII space character, escape key with the ASCII escape code etc. ASCII value 0 mustn't be associated with any key. Letter keys are by default associated with lowercase letters; uppercase letters may be used e.g. to indicate that shift key is held during the press. For up, down, right and left arrow keys codes with decimal values 17, 18, 19 and 20 (in respective order) shall be used.

Light (l) is an output device that consists of N physical lights (e.g. LEDs or light bulbs). Each light that is present has to be capable of being at least turned on or off, but may additionally allow for changing its color and intensity. Assigning numbers to physical lights should adhere to the following rule: if light A is vertically above light B, light A will have lower number; for any pair of lights that are vertically on the same level the one more on the left should always have lower number.

Microphone (m) is an input device capable of capturing audio in real time.

Network (n) is an input/output device allowing communication with other computers and/or programs (which may or may not be a CS). For our purposes the model of a computer network is as follows. A network is a set of independently running programs -- nodes -- each one with a unique address (in case of IPv4 for example this address may consist of IP address and port). Each node can send a message -- a sequence of bytes -- to any other node. Unreliable delivery is assumed in general, i.e. any change may happen to the message during transfer and delivery is not confirmed -- reliability is left to be implemented by the communicating programs. Every node has a receiving network buffer of certain size (which may differ between nodes) into which a received message is pushed and where it waits until being read by the node. If a message arrives that wouldn't fit into the currently remaining free space in the buffer, then as much of the message as will fit is pushed into the buffer and the rest is discarded.
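The overflow behavior of the receiving buffer can be sketched in Python like this. The class is a hypothetical illustration of the rule only; it does not model addresses, message boundaries or unreliable delivery.

```python
class ReceiveBuffer:
    """Toy model of a node's receiving network buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = bytearray()

    def push_message(self, message):
        """Store as much of the message as fits; return the number of bytes kept."""
        free = self.capacity - len(self.data)
        kept = message[:free]  # the part that does not fit is discarded
        self.data += kept
        return len(kept)

    def read(self, count):
        """Remove and return up to `count` bytes from the front of the buffer."""
        chunk = bytes(self.data[:count])
        del self.data[:count]
        return chunk
```

Since a truncated message looks like any other damaged message, programs that need reliability have to detect this themselves, e.g. with their own length fields or checksums.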

Pointer (p) is a pointing input device typically used to control visual cursor, but may also be utilized in other ways. Physically it may be for example a mouse or trackball. The device controls a position of 2D point (cursor) whose position CS maintains internally (the cursor is not seen on the screen, it's up to the shell program to display it if needed). At any time cursor points to one pixel of the graphical screen, i.e. it has coordinates x and y which are the horizontal and vertical offsets from the top left screen corner expressed in pixels, as integers starting at zero. Initial cursor position after any display mode change is unspecified. The cursor must always be in the bounds of the screen, it will stop at the edge of screen when it should go beyond it. Pointer may also have a scrolling wheel whose rotation changes internal wheel position that's a number in range from 0 to 255 and which wraps around both ways. If the wheel isn't physically present, the value will simply never change. Furthermore pointer has a number of buttons, each of which may be pressed or released at any time -- here we will only assume left, right and middle buttons.
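A minimal Python sketch of the cursor and wheel rules follows. The class is hypothetical; since the initial cursor position is unspecified, the starting point (0, 0) is an arbitrary choice made only for this example.

```python
class Pointer:
    """Toy model of the cursor position and wheel state kept by CS."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.x, self.y = 0, 0  # arbitrary: the initial position is unspecified
        self.wheel = 0

    def move(self, dx, dy):
        # The cursor stops at the screen edge instead of leaving the screen.
        self.x = min(max(self.x + dx, 0), self.width - 1)
        self.y = min(max(self.y + dy, 0), self.height - 1)

    def scroll(self, steps):
        # The wheel position stays in range 0 to 255 and wraps both ways.
        self.wheel = (self.wheel + steps) % 256
```

For example scrolling one step back from wheel position 0 gives 255, while moving far past the bottom right corner of a 320 by 200 screen leaves the cursor at (319, 199).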

Signaler (s) device is an input device that for the user provides the option to raise signals, e.g. in case of losing control over the computer (e.g. getting the shell program stuck in an infinite loop etc.). Physically the device may be implemented e.g. as a special combination of keys pressed and held for certain time or a dedicated button (of course it is possible to generate signal automatically as well, e.g. with a hardware watchdog). The device may allow sending any defined signal.

User (u) is a device whose purpose is to read mostly textual commands from the user, but it may generally send any byte values and may also be utilized for feeding other data than that input by human user (e.g. data produced by another program, a kind of "automated user"). This device is mandatory but it may be present only as an empty device, i.e. behaving like the zero device. In general the device reads a stream of bytes from the external world. The device doesn't have to have data available for reading at all times (e.g. if the user is still typing). The interface for loading data from a human user is not specified precisely, but it is recommended to be implemented as follows: Make a user interface for inputting a line of text, i.e. a text field in which the user can write the text, possibly edit it (e.g. delete characters) and use other features (e.g. command history, tab completion, ...), and which he will confirm by entering newline character. Once confirmed, the line will be placed into a queue from which lines will be handed over to CS when it asks for them.

Video (v) is an input device that can capture pictures. It may capture both static images or video, the latter simply being seen as the capability to capture images in very quick succession.

Zero (z) device is an input/output device which does nothing. Sending data to it just discards the data. Reading from it always produces reading zero bytes, without ever blocking the reader.

Commands

Unless mentioned otherwise, case-sensitivity is always assumed.

Commands are detected by the text processor unit mentioned above. The unit processes the input byte stream and normally (but not always) only forwards the bytes to OET unchanged. The following pseudocode describes the behavior of the command processor in more detail:

state := default
command_len := 0

loop:
  c := input()

  if state != reading_silent:
    output(c)

  if state = default:
    if c = <CMD_START>:
      state := reading_command
      command_len := 0
    else if c = <CMD_START_SILENT>:
      state := reading_silent
      command_len := 0
  else:
    if command_len = 0 and (c = <CMD_START> or c = <CMD_START_SILENT>): # escaping
      output(c)
      state := default
    else if c = <CMD_END>:
      process_loaded_command()
      state := default
    else if command_len < <CMD_MAX_LEN>:
      command_buffer_push(c)
      command_len := command_len + 1

From the algorithm it follows that a command string adheres to the following regular expression:

[<CMD_START><CMD_START_SILENT>]([^<CMD_END>]*)<CMD_END>

Only the first <CMD_MAX_LEN> characters of the bracketed substring are considered (presence of more characters is not an error, they are just ignored).
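For experimenting with the algorithm, the pseudocode can be transliterated to Python roughly as follows. This is a literal sketch, not a reference implementation: the function name and the returned tuple are made up, the escaping branch is assumed to return to the default state, and no attempt is made to settle details the pseudocode leaves open (note that, read literally, the start character of a silent command is still forwarded).

```python
CMD_START, CMD_START_SILENT, CMD_END = "/", "\\", "\n"
CMD_MAX_LEN = 64


def run_text_processor(stream):
    """Return (forwarded_text, commands) for an input string."""
    state = "default"
    buffer = []      # characters of the command being read
    forwarded = []   # everything passed on by output(c)
    commands = []    # stands in for process_loaded_command()
    for c in stream:
        if state != "reading_silent":
            forwarded.append(c)
        if state == "default":
            if c == CMD_START:
                state, buffer = "reading_command", []
            elif c == CMD_START_SILENT:
                state, buffer = "reading_silent", []
        else:
            if not buffer and c in (CMD_START, CMD_START_SILENT):  # escaping
                forwarded.append(c)
                state = "default"  # assumption: back to the default state
            elif c == CMD_END:
                commands.append("".join(buffer))
                state = "default"
            elif len(buffer) < CMD_MAX_LEN:
                buffer.append(c)   # characters beyond the limit are ignored
    return "".join(forwarded), commands
```

Feeding it the text of a normal command shows the command both forwarded and collected, while the body of a silent command is collected without being forwarded.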

We define the command name as the string starting with the second character of the command string and ending with the last character before the first <ARG_START> or, if there is no <ARG_START>, before the final <CMD_END>. If the command contains <ARG_START>, then the string starting with the first character after the first <ARG_START> character up until the second to last character of the whole command is the argument string. If no argument string is present, the command is considered to have an empty string as its argument, otherwise the argument string is the command's argument.
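Splitting a command string into name and argument may be sketched in Python as below. The helper name is hypothetical, the name is assumed to end just before the first <ARG_START>, and truncation to <CMD_MAX_LEN> characters is not handled here.

```python
ARG_START, CMD_END = ":", "\n"


def split_command(command):
    """Split a whole command string into (name, argument)."""
    body = command[1:]        # drop <CMD_START> or <CMD_START_SILENT>
    if body.endswith(CMD_END):
        body = body[:-1]      # drop the terminating <CMD_END>
    # Only the first <ARG_START> separates the name from the argument,
    # so the argument itself may contain further <ARG_START> characters.
    name, _, argument = body.partition(ARG_START)
    return name, argument
```

A command without <ARG_START> simply yields an empty argument, which matches the rule that a missing argument string is treated as an empty string.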

If in context of program arguments and return values we talk about numbers, they are implicitly supposed to be represented as numerals. If we talk about a list of numbers, we implicitly mean a space-separated list of numerals.

Command naming conventions: each command starts with a character c1 that puts it in a general group. If the command is an extension command not defined in this document (added as an extra feature by a specific CS implementation), c1 is <EXTENSION_START_CHAR>. Otherwise if the command is related to some device, c1 is the letter assigned to the device, otherwise c1 is <SHELL_START_CHAR>. After c1 the following strings may follow to signify the function explained here: d: dump, sending data from the device over IE; pd/pt: push direct/text, sending data into the device over OEB in direct/text binary format; w: write, sending data to the device over OET. TODO

Any command is only executed and otherwise handled once it has been read as a whole (when process_loaded_command() is called). For every command name it holds that it either always generates an answer or never generates one. If a command generates an answer, it first clears the answer buffer of all its content, i.e. if there still exist unread bytes in the answer buffer and a new answer is generated, it overwrites the old one.

For passing binary data to devices (e.g. pixels to a display) it is possible to use one of these formats:

  1. text-encoded binary data: Binary data in a format that can be encoded only with printable, non-blank text characters. In this format each byte is encoded by two consecutive bytes A and B; the encoded byte has its highest 4 bits identical to the lowest 4 bits of A and its lowest 4 bits identical to the lowest 4 bits of B. For example the symbols 1; represent the binary value 00011011. This mode exists so that it is possible to input binary data even with purely textual input devices, but it may be less efficient to work with.
  2. direct binary data: A stream of direct, unencoded byte values. This mode may be more efficient as it avoids encoding and decoding bytes, so it should be preferred by those who can make use of it (typically programs).
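The text-encoded format can be tried out with the following Python sketch. It follows the nibble order shown by the 1; example (first character gives the higher 4 bits, second character the lower 4 bits). The function names are made up, and the choice of the characters 0 to ? (ASCII 48 to 63) for encoding is only one possible assumption, as any printable character with the right lowest 4 bits would decode the same.

```python
def encode_text_binary(data):
    """Encode bytes as pairs of printable characters in the range '0'..'?'."""
    out = bytearray()
    for byte in data:
        out.append(0x30 | (byte >> 4))    # first character: higher 4 bits
        out.append(0x30 | (byte & 0x0F))  # second character: lower 4 bits
    return out.decode("ascii")


def decode_text_binary(text):
    """Decode character pairs; only the lowest 4 bits of each one are used."""
    if len(text) % 2:
        raise ValueError("text-encoded data must have an even length")
    raw = text.encode("ascii")
    return bytes(
        ((raw[i] & 0x0F) << 4) | (raw[i + 1] & 0x0F)
        for i in range(0, len(raw), 2)
    )
```

The encoded form is always exactly twice as long as the data, which is the price paid for being typeable on a purely textual input device.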

Input data escaping means that every character passing through OS which is either <CMD_START> or <CMD_START_SILENT> will be preceded by an added <CMD_START_SILENT> character. These extra added escaping characters are generally NOT counted in the size of the input data (e.g. a file size will be considered the same whether escaping is used or not). Escaping exists to provide safety of dumping any data directly to output without the data triggering any commands.
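Input data escaping amounts to a very small transformation, sketched here in Python. The function name is hypothetical; the characters are the <CMD_START> and <CMD_START_SILENT> values from the definitions.

```python
CMD_START = ord("/")
CMD_START_SILENT = ord("\\")


def escape_input(data):
    """Precede every command start character with <CMD_START_SILENT>."""
    out = bytearray()
    for byte in data:
        if byte in (CMD_START, CMD_START_SILENT):
            out.append(CMD_START_SILENT)  # added escape, not counted in data size
        out.append(byte)
    return bytes(out)
```

Data without any slash or backslash pass through unchanged, so escaping costs nothing for most plain text.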

Mandatory commands: generally inclusion of any command in CS implementation is optional. If a command is explicitly mentioned to be mandatory, it has to be included. If any command that contains "push direct" substring in its name is implemented, then also its version where this substring is replaced with "push text" has to be implemented too, and vice versa (i.e. any implemented "push" command must always offer both methods of data transfer). For any command description below that contains the N numeric parameter it holds that if a command for some N is implemented, then also corresponding commands for all possible lower values of N have to be implemented as well.

Passing binary data will typically not be done through command parameters but rather by issuing a command that switches DS from the text processor to the binary processor and connects OEB to the device in question; data will then flow directly to the device. The binary processor handles this transfer: it will firstly handle translation of the binary format (if necessary) and secondly count the amount of data transferred (if this count is set, i.e. not infinite) -- when the expected number of bytes has been sent, the binary processor raises the end of output signal.
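The counting part of the binary processor might look roughly like the following Python sketch. Everything here is a simplification under assumptions: the class is hypothetical, format translation is left out, the device is modeled by a plain byte array and raising the signal is reduced to setting a flag.

```python
class BinaryProcessor:
    """Toy model: forwards bytes to a device and counts them."""

    def __init__(self, expected_bytes=None):
        self.remaining = expected_bytes  # None models an unset (infinite) count
        self.delivered = bytearray()     # stands in for the connected device
        self.end_of_output_raised = expected_bytes == 0

    def feed(self, data):
        """Forward bytes to the device; return how many were consumed."""
        consumed = 0
        for byte in data:
            if self.end_of_output_raised:
                break                    # DS would be back on the text processor
            self.delivered.append(byte)
            consumed += 1
            if self.remaining is not None:
                self.remaining -= 1
                if self.remaining == 0:
                    self.end_of_output_raised = True  # signal 6, end of output
        return consumed
```

Bytes that are not consumed here would, after the signal, again be handled as text, which is why the sender has to know exactly how many bytes the command expects.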

Generally if data of known size is being transferred directly to or from a device (e.g. frame data to display, binary data to a file, ...) and the transfer is interrupted, e.g. by an error or device disconnect, the remaining (yet unsent) data is still read from the source and then discarded. This is to ensure that a sender that's expected to send N bytes can just assume he can send N bytes without having to check for errors (in some data transferring modes it is even impossible to check for errors during the transfer). The error can be checked after the transfer is finished.
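
The behavior described above can be sketched in Python as follows. The names `transfer` and `device_write` are hypothetical, and representing a device failure as an `OSError` is an assumption made only for this illustration.

```python
import io


def transfer(source, device_write, count, chunk_size=256):
    """Move count bytes from source to a device; return the first error or None."""
    error = None
    remaining = count
    while remaining > 0:
        chunk = source.read(min(chunk_size, remaining))
        if not chunk:
            break  # the source ended early
        remaining -= len(chunk)
        if error is None:
            try:
                device_write(chunk)
            except OSError as exc:
                error = exc  # from now on data is still read but discarded
    return error


# Demonstration with a device that fails after accepting 256 bytes.
received = bytearray()


def flaky_write(chunk):
    if len(received) + len(chunk) > 300:
        raise OSError("device disconnected")
    received.extend(chunk)


source = io.BytesIO(b"x" * 1000 + b"rest")
error = transfer(source, flaky_write, 1000)
leftover = source.read()  # only what follows the 1000 announced bytes
```

The sender side never has to react to the failure: all 1000 announced bytes are consumed and the error is reported only once the transfer is over.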

The following is a list of CS command names that may be supported:

  • .? help: Returns all supported command names separated by spaces. This command is mandatory. This command allows any program to query the computer's capabilities.
  • .a arguments: Returns a string that was passed to CS as an argument by whoever ran it.
  • .c command: Allows executing commands of the underlying system (if such system exists). This command passes its argument as a command to the underlying system (for example a Unix shell, DOS prompt etc.) and returns the result. As details about underlying system aren't further specified, the CS implementation is just supposed to make reasonable translations, e.g. of the return value etc. Status number is set according to the outcome of the command in the underlying system -- again, way of translating error codes is left to CS implementation.
  • .cc command chain: Takes argument and splits it by semicolon character into substring parts s1, s2, s3, ... sn. Then executes command <CMD_START>s1<CMD_END>. If this command succeeded (i.e. status number indicates success), returns string .cc:s2;s3;...;sn, otherwise returns an empty string and sets the status number to that which was set by the failed command. This command is meant to serve for executing several commands in a row, typically e.g. for setting program's input to be a file -- here we may issue a command to set a file as an input stream (upon which user would normally lose the ability to issue further commands) and subsequently load the given program.
  • .i system info: Returns general static information about current system as a sequence of text strings separated by spaces, each in format attribute:value where attribute and value are non-empty text strings that mustn't contain colon and may only be composed of printable, non-white characters. Each attribute may be present at most once (none is mandatory). Some of attributes that may appear are: hw (hardware platform, e.g. laptop, thinkpad_x200, ti_calculator, ...), os (operating system, e.g. none, gnu/linux, windows, ...), name (name specific to this individual computer), isa (instruction set, e.g. x86, arm, ...), ram (integer, approximate amount of physical RAM in bytes), freq (CPU frequency), cores (number of CPU cores), bits (CPU bit width), year (year of manufacturing), gpu (graphics processing unit model), uptime (number of seconds for which the system has been continuously running), shell (name and version of the shell).
  • .m system measures: Similar to .i but returns dynamic measures related to current moment. Like with system info a space separated list of attribute:value pairs is returned. Some attributes that may appear are: cpu (average utilization of all CPU cores in percents), ram (amount of free RAM in bytes), temp (temperature, most relevant one if multiple sensors are present), light (amount of light, in percents).
  • .pff program flash file: Takes a file index numeral. The corresponding file must contain a program that can be flashed and directly run by the computer CS is running on. This program may or may not be using CS. This command flashes this program and restarts the computer so that the program starts running.
  • .plf program load file: Takes a file index numeral. The corresponding file must contain a program that can be loaded. This program is then loaded in the same way as with .pln command. Arguments to the loaded program may be passed in the same way as with .pln.
  • .pln program load name: If empty string is given as argument, returns a space-separated list of names of programs that can be loaded with this command (they may typically be e.g. programs built into CS itself). Program names adhere to the same rules as file names. If argument is given, then its part up to the first space character or end of the string (not including the space) is considered the name of the program to load -- then program unload signal is raised and the new shell program (whose name was passed) is loaded and run. The part of argument from the first character after the space until the end is considered an argument that will be passed to the loaded shell program.
  • .pu program unload: If shell program is loaded, its execution is stopped as early as possible while ensuring general safety of doing so (deallocating memory, closing files, ...) and it is unloaded.
  • .r raise: Raises signal whose number has been passed as a numeral.
  • .r32 system random 32: Returns a numeral representing a high quality random unsigned 32 bit number. The priority is high randomness rather than speed (if a program needs fast, lower quality pseudorandom numbers, it can always implement its own generator), i.e. if the computer is capable of generating truly random numbers, it should return such a number; if it's not capable of generating such numbers, at least a very high quality pseudorandom generator should be used, seeded e.g. with current time.
  • .s status: Returns a numeral representing the shell's internal status number immediately before this command was issued. This command is mandatory.
  • .ss status set: Takes one numeral, does nothing, sets status number to the passed number. This command is mandatory.
  • .t time: Returns a list of two numbers, first one is the standard Unix time (number of seconds since January 1, 1970), the second one is the number of milliseconds elapsed since the current shell program started running. The time of elapsed milliseconds is allowed to have an upper limit of 2^32 or higher, at which it overflows to zero. Any of these values may be replaced with - (dash) if the value is unknown.
  • .vNg variable N get: Returns the shell's *(N + 1)*th global variable value. If this command is implemented, then the corresponding .vNs and .vNge commands have to be implemented too.
  • .vNge variable N get escaped: Same as .vNg but escapes all returned characters. If this command is implemented, then also the corresponding .vNg command must be implemented.
  • .vNs variable N set: N is a numeral, the command sets the shell's *(N + 1)*th internal global variable to the exact string that's passed as argument (maximum size of command argument therefore determines the size of this variable). Initial value of this variable is an empty string. If this command is implemented, then also corresponding .vNg command must be implemented. Commands working with global variables can be used e.g. for communication between different programs or to implement aliases/macros.
  • .w wait: Pauses execution of shell program for number of milliseconds passed as a numeral.
  • ai audio info: Returns a list of two numbers, first one says how many samples there are left to be played in the audio buffer (for stereo one couple of left/right values counts as a single sample), the other number says the number of free samples that can be uploaded to the buffer. These numbers will always add to the total number of audio buffer samples.
  • am0 audio mode: off: Turns off the audio device, no audio that would be produced by this device will be played since issuing this command until turning it on again.
  • amS audio mode: Turns on the audio device, sets it in mode S (see audio mode definition), clears and initializes audio buffer and starts playing it.

  • av audio volume: Returns numeral saying the current audio volume immediately before performing this command, in percents. If empty string is passed as argument, nothing else happens, otherwise the argument must be a numeral that says the value in percents to which the volume will be set.

  • apd audio push direct: Takes a numeral N as an input, disconnects out_e from all devices and connects it to the audio buffer, then lets N bytes be sent to the audio buffer and finally disconnects out_e from audio buffer and connects it back to the console. N mustn't be greater than the number of bytes that audio buffer can currently take.

  • apt audio push text: Same as audio push direct but will read the data in text format.

  • apbPN audio play beep: Plays a simple beep, several different beeps may be offered under different values of N. P can either be p, in which case the program is paused until the audio has been played, or n, in which case the command doesn't pause the program, or empty string -- a general version of the command -- in which case this behavior is not specified (the implementation may choose what it deems best or what's available). If any version of this command is implemented, a corresponding general version must be implemented too (e.g. if apbp0 exists, apb0 must exist as well).

  • ci console info: Returns two numerals, first saying the current number of console character columns, the second the current number of visible console rows. The second number may be replaced with - if there is no meaningful value (e.g. if the console is being printed on paper). Different values may be returned at different times (for example if the console is in a virtual resizeable window).

  • cpd console push direct: Behaves the same as fpd but writes to console device instead of filesystem, also the first argument (file index) is not given to this command (i.e. this command takes at most one argument).

  • cpt console push text: Same as cpd but reads the binary data in text encoded format.

  • cw console write: Connects OET to console device.

  • di display info: If display is off, returns an empty string, otherwise returns picture format string corresponding to the currently set display mode, then space, then pixel order string. For details see the definition of screen mode strings.

  • dm0 display mode: off: Turns off the display device.

  • dmS display mode: Turns the display on and initializes it to given mode that corresponds to picture format string S.

  • dpd display push direct: For this command to work display must not be turned off. The command disconnects out_e from all input devices, then connects it to the display device's pixel data feed and initializes transfer of a new image frame. Now display will read direct binary data from out_e, as many bytes as is needed for drawing one whole frame, redrawing the displayed picture as the bytes are read. After this out_e is disconnected from the display and is connected to console device.

  • dpt display push text: Same as dpd but will read the binary data in text-encoded format instead.

  • dr display read: Connects IE to display device and sends its current pixel data over the channel. The display must be turned on. The format and size of data depends on currently set mode, the data will be sent in the same format as is used with dpd.

  • fs filesystem space: Returns a list of two numbers, the first one is (at least approximate) free space left in the filesystem, the second number is (at least approximate) total capacity of the disk, both in bytes. In case of a more complicated situation, e.g. the filesystem being split into partitions of different sizes, the numbers returned shall be related to current directory (i.e. for example the sizes related to the partition CS currently resides in) if possible. Any of these numbers may be replaced with - if it cannot be reported.

  • fl filesystem list: If argument is empty string, returns number of file links in current directory as a numeral. If numeral argument is given, it is considered a file link index and a string of following items separated by spaces is returned: file name between double quotes, type of the file as a character (r for regular file, d for directory), numeral saying the byte count of the file content, Unix timestamp of file creation, Unix timestamp of last file modification. Numeric values may be just approximate if it would be impossible or impractical to get exact values. Any numeric value that cannot be determined even approximately will be replaced by dash.

  • fd filesystem dump: Takes a file link index as numeral, then connects IE to filesystem device, opening the corresponding file, and sends the file content by the IE channel, then end of input signal is raised. Typical use case for this command is to provide input for shell program from a file instead from a user, in a way that's transparent to the shell program.

  • fde filesystem dump escaped: Same as fd but escapes all characters in the file content.

  • fds filesystem dump size: Same as fd but additionally returns a numeral saying the number of bytes that are transferred, and end of input signal is NOT raised. This is an alternative to fd command, typically used by shell programs that are aware of CS which additionally also need to read from other devices or files.

  • fdes filesystem dump escaped size: Same as file dump escaped but additionally returns a numeral saying the number of bytes that are transferred, without counting the extra escaping characters, and end of input signal is NOT raised.

  • fw filesystem write: Takes file index numeral as argument, then opens the corresponding file for writing (deleting any content of that file if there was any), then OET is connected to filesystem device and all bytes coming through this channel will be written to the opened file. Writing will be terminated when OET is disconnected from filesystem, typically when end of output signal occurs (typically issued with a command). If any error happens during writing (for example file failed to be opened or if the filesystem ran out of space during writing), the file is closed, status number is set and OET is connected to zero device. This command serves for writing data to files without knowing the size in advance and for being able to issue other commands while writing this data, i.e. this may typically be used to e.g. redirect a shell program's log output from console to a file while still enabling the program to issue commands and work normally.

  • fpd filesystem push direct: Behaves the same as fw command with the following differences. An additional number may be given as argument, saying the number of bytes to transfer -- if this argument wasn't given, zero is assumed. The command switches DS to binary processor and sets its byte count to that received as argument, with zero signifying infinity. This number of bytes is then transferred in direct binary format. Then end of output signal is raised. Note that here it's not possible to issue commands during the data transfer, i.e. this command may be used e.g. for debugging (dumping raw program output into a file). This command may also be used by programs to write to files while still keeping OET directed to a desired channel.

  • fpt filesystem push text: Same as fpd but reads the input binary data in text encoded format.

  • fwa filesystem write append: Same as fw command but doesn't delete the content of the file that's opened for writing; instead the new data is appended at the end of the file.

  • fpad filesystem push append direct: Same as fpd but appends the data in the same way as the fwa command.

  • fpat filesystem push append text: Same as fpt but appends the data in the same way as the fwa command.

  • fn filesystem new: Takes as an input a file name that will be valid within the current directory and creates a new empty file with that name.

  • fr filesystem remove:

  • fh filesystem highlight: Takes a file link index as numeral and marks the file for some special operations that may follow in the future.

  • fc filesystem copy: Takes a file link index as numeral and copies into it the content of a file previously marked with file highlight command.

  • fm filesystem move: Same as file copy but also deletes the highlighted file afterwards. (This command exists because it can typically be implemented efficiently just as renaming the original file.)

  • fg filesystem go: Takes a file link index as numeral, which must point to a directory, and changes current directory to the pointed directory.

  • fu filesystem update: Updates the current directory information.

  • kg keyboard get: Returns current state of keyboard as a text string formed by concatenating characters associated with all pressed keys. Each character may be present at most once and the characters will be ordered in ascending order by their ASCII values.

  • pg pointer get: Returns a string in format X Y W B where X and Y are numerals representing the current cursor coordinates, W is the pointer device wheel position as a numeral and B is a string saying which pointer buttons are pressed by containing a specific letter for each button that is pressed (being an empty string if no buttons are pressed) -- l, r, m signify the left, right and middle button respectively.

  • ps pointer set: Sets coordinates of the pointer cursor to those passed as a list of two numbers (representing x and y coordinates). This results in the cursor being moved to the specified coordinates.

  • lc light color: Takes either three numerals R, G and B, or just one numeral I which will be considered the same as passing three numerals I, I and I. Each of the numerals will be in range from 0 to 255, 0 signifying minimum intensity, 255 maximum intensity. R, G and B specify the intensity of red, green and blue color components. All lights will be set to the color and overall intensity that's closest to the specified color. Values 0, 0, 0 will turn off the LEDs completely.

  • lNc light N color: N is a numeral, the command behaves the same as lc but applies the color change only to *(N + 1)*th light. If this command is implemented, then also lc must be implemented.

  • nat network address to: Sets the network address of destination node (node to which network messages will be sent) to that specified by argument. Format of the address depends on type of network that is used; with IPv4/IPv6 networks the address will consist of two strings separated by space: first the IP address (traditional text format of IPv4/IPv6) or a URL (if DNS support is available), then port as numeral.

  • naf network address from: Sets the network address of the node on which this CS is running to that specified by argument. Format of the address is the same as for network address to; with IPv4/IPv6 there is additional possibility of using dash (-) for the first part of address string to signify that the IP address isn't to be changed (as it is often already assigned, e.g. by DHCP).

  • nd network dump: Receives a message from network device, the semantics is similar to the file dump command, but this command doesn't generate end of file event. This command connects IE to network device and then receives bytes from it indefinitely, blocking the reader if there are no data to be read.

  • nde network dump escaped: Same as nd but the data read will be escaped.

  • nds network dump size: Connects IE to network device and transfers all bytes from the network buffer by this channel, returns the number of bytes to transfer. Does NOT raise end of input signal.

  • ndes network dump escaped size: Same as nds but will additionally escape bytes like the nde command. The returned number of transferred bytes does not include the extra escaping characters.

  • nw network write: Connects OET to network device, then all bytes coming through this channel will be sent over the network to the destination node. Writing to network ends when OET is disconnected from the network device, typically when end of output signal is raised. CS may hold the bytes to be sent in an output buffer and choose a strategy of when to send them (e.g. when the buffer is full or in certain time periods), but once writing has finished (OET was disconnected from network) all the data must be sent as soon as possible. Reliable delivery is never guaranteed so failing to deliver the message is not considered an error.

  • npd network push direct: Behaves the same as fpd but writes to network device instead of filesystem device. The strategy of when to physically send the data is chosen by CS, but if possible, any data written to network this way should be sent within 1 second.

  • nh network hint: Asks CS to handle network transfers in certain ways. The command takes a text argument, CS reads it character by character and on reading any of the following characters performs the action specified for the character: r (reliable) gives priority to reliable delivery (i.e. try as much as possible to deliver characters exactly as sent, preserving exact values, order etc.), u (unreliable) gives low priority for reliable delivery (allowing to prioritize other things such as latency), f (fast) gives priority to low latency (send any data written to network as soon as possible), s (slow) gives low priority to latency, d (default) nullifies any hints given up to this point (allows CS to set priorities as it deems best). Other characters than defined here are ignored.

  • npt network push text: Same as npd but takes the binary data in text encoded format.

  • gg GPIO get: If empty string is given as argument, returns the number of bytes of the GPIO device. If numeral X is given as argument, the current value of *X + 1*th byte is returned as a numeral.

  • gs GPIO set: Takes two numerals, X and V, sets the value of *X + 1*th byte to V.

  • vrS video read: Connects IE to video device, captures a picture and transfers it in picture format S, then connects IE to user device.

  • vd video dump: TODO (FPS and so on)

  • zc zero connect: Connects OET to zero device.

  • zd zero dump: Connects IE to zero device.

If any error occurs while executing a command that generates an answer, an empty string is returned.
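
A program reading the attribute:value replies of the system info and system measures commands could handle them along the lines of the following Python sketch. The function name is hypothetical; the sketch treats an empty reply (an error, or simply no attributes reported) as an empty result.

```python
def parse_attributes(reply: str) -> dict:
    """Parse a space-separated list of attribute:value items into a dict."""
    result = {}
    for item in reply.split():
        attribute, separator, value = item.partition(":")
        # Both parts must be non-empty and must not contain another colon.
        if not separator or not attribute or not value or ":" in value:
            raise ValueError(f"malformed item: {item!r}")
        if attribute in result:
            raise ValueError(f"attribute present more than once: {attribute!r}")
        result[attribute] = value
    return result
```

Values are kept as strings here, since only some attributes (such as ram or cores) are numeric.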

Picture format string specifies resolution, possible colors and data encoding of a picture, for example a display mode. Its format is WxHxB or WxHxBxI. W and H are numerals saying horizontal and vertical resolution in pixels, respectively. B is numeral saying number of bits per pixel. If the xI part is not present, direct color mode is implied, otherwise indexed mode is implied with I bits for each palette entry. Direct mode means each pixel holds a direct color value. Indexed mode means each pixel holds an index to a color palette which precedes the data of the picture pixels. A color value (a pixel in direct mode, a palette entry in indexed mode) is represented by the smallest number of bytes that can hold the number of bits used for color (B in direct mode, I in indexed); let this number of bits be called n. The bytes of a color value start with the least significant one (i.e. lowest byte will appear first). Let the number these bytes form be called P. Lowest bits of P will be used for the blue component, the next lowest bits for green and the next lowest bits for red. Let r, g and b be the number of bits that will be used for red, green and blue color component respectively. Let x be n / 3 (rounded down to integer) and let y be *n - 2 * x*. r will be equal to x. If y is greater than x, then g will be equal to y and b to x, otherwise g will be equal to x and b to y. For a picture in direct mode *W * H * ceil(B / 8)* bytes will be needed, for indexed mode this amount plus *ceil(I / 8) * 2^B* will be needed.
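
The arithmetic above can be sketched in Python as follows. The function name and the returned keys are hypothetical. The sketch assumes that in indexed mode the color components are split according to I (the bits of a palette entry) and that each of the 2^B palette entries takes ceil(I / 8) bytes.

```python
def parse_picture_format(fmt: str) -> dict:
    """Parse WxHxB or WxHxBxI and derive component bits and data size."""
    parts = [int(part) for part in fmt.split("x")]
    if len(parts) not in (3, 4):
        raise ValueError(f"not a picture format string: {fmt!r}")
    width, height, bits = parts[:3]
    palette_bits = parts[3] if len(parts) == 4 else None

    # Bits used for color: B in direct mode, I in indexed mode (ASSUMPTION).
    color_bits = bits if palette_bits is None else palette_bits
    x = color_bits // 3
    y = color_bits - 2 * x
    red = x
    green, blue = (y, x) if y > x else (x, y)

    size = width * height * ((bits + 7) // 8)  # pixel data
    if palette_bits is not None:
        size += ((palette_bits + 7) // 8) * 2 ** bits  # palette (ASSUMPTION)

    return {
        "width": width,
        "height": height,
        "indexed": palette_bits is not None,
        "red_bits": red,
        "green_bits": green,
        "blue_bits": blue,
        "size_bytes": size,
    }
```

For instance, `320x200x16` gives 5, 6 and 5 bits for red, green and blue and 128000 bytes of data, while `320x200x8x24` gives 8 bits per component and 64000 bytes of pixels plus 768 bytes of palette.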

Pixel order string says in what order pixels of a picture appear in the picture data, possible values are rb (left to right, then top to bottom), lb (right to left, then top to bottom), rt (left to right, then bottom to top), lt (...), br (top to bottom, then left to right), bl (...), tr (...), tl (...).

Audio format string specifies how audio data is encoded. Its format is WxF or WxFxC where W, F and C are numerals. If the xC part is not present, it is assumed to be x1. W says the number of bits of a single sample for single channel, F says the audio frequency and C number of channels (i.e. 1 is mono, 2 is stereo etc.). Audio data in given format will be a sequence of values v, each of which holds C samples (one for each channel), i.e. v will communicate *W * C* bits. Value v is composed of the minimum number of bytes that are necessary for this; if v has more than 1 byte, lower bytes will appear first in the data stream. Lowest W bits of v encode the sample for zeroth channel, next lowest W bits encode the sample for first channel etc. Channel numbers will correspond to physical speakers by this rule: more important speakers will have lower channel number and if multiple speakers have the same importance, then for any two of these speakers the one more on the left will have lower channel number.
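
The layout of audio data can be illustrated by the following Python sketch. The function names are hypothetical, and samples are treated as unsigned integers, which is an assumption, since signedness is not stated here.

```python
def parse_audio_format(fmt: str) -> tuple:
    """Parse WxF or WxFxC into (bits per sample, frequency, channels)."""
    parts = [int(part) for part in fmt.split("x")]
    if len(parts) == 2:
        parts.append(1)  # a missing xC part means x1 (mono)
    if len(parts) != 3:
        raise ValueError(f"not an audio format string: {fmt!r}")
    return tuple(parts)


def pack_audio_value(samples, bits_per_sample: int) -> bytes:
    """Pack one sample per channel into the bytes of a single value v."""
    value = 0
    for channel, sample in enumerate(samples):
        # ASSUMPTION: samples are unsigned and fit into bits_per_sample bits.
        if not 0 <= sample < (1 << bits_per_sample):
            raise ValueError("sample does not fit into the sample width")
        # Channel 0 occupies the lowest bits, channel 1 the next ones, etc.
        value |= sample << (channel * bits_per_sample)
    total_bits = bits_per_sample * len(samples)
    byte_count = (total_bits + 7) // 8  # minimum number of whole bytes
    return value.to_bytes(byte_count, "little")  # lower bytes come first
```

With 4-bit stereo samples 1 (zeroth channel) and 2 (first channel), the value v fits into a single byte with value 0x21.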

TODO:

  • joystick
  • function for setting a single pixel (slow but simple and can be useful)
  • (graphical) printer, ...? webcam/microphone will just transfer the data in the same way as data is transferred to display/audio? microphone dump could behave the same as network dump
  • maybe add some special character to dangerous commands (like file remove) to prevent writing them by mistake
  • filesystem structure:
    • mounted disks as special directories?
  • EEPROM device ("longterm"?)
  • order of signals, codes of signals
  • network status command?
  • more details on error codes

Examples