- @c -*-texinfo-*-
- @c This is part of the GNU Guile Reference Manual.
- @c Copyright (C) 2008-2011, 2013, 2015, 2018, 2019, 2020, 2022
- @c Free Software Foundation, Inc.
- @c See the file guile.texi for copying conditions.
- @node A Virtual Machine for Guile
- @section A Virtual Machine for Guile
- Enough about data---how does Guile run code?
- Code is a grammatical production of a language. Sometimes these
- languages are implemented using interpreters: programs that run
- alongside the program being interpreted, dynamically translating the
- high-level code to low-level code. Sometimes these languages are
- implemented using compilers: programs that translate high-level
- programs to equivalent low-level code, and pass on that low-level code
- to some other language implementation. Each of these languages can be
- thought of as a virtual machine: each offers programs an abstract
- machine on which to run.
- Guile implements a number of interpreters and compilers on different
- language levels. For example, there is an interpreter for the Scheme
- language that is itself implemented as a Scheme program compiled to a
- bytecode for a low-level virtual machine shipped with Guile. That
- virtual machine is implemented by both an interpreter---a C program that
- interprets the bytecodes---and a compiler---a C program that dynamically
- translates bytecode programs to native machine code@footnote{Even the
- lowest-level machine code can be thought of as being interpreted by the
- CPU, and indeed is often implemented by compiling machine instructions
- to ``micro-operations''.}.
- This section describes the language implemented by Guile's bytecode
- virtual machine, as well as some examples of translations of Scheme
- programs to Guile's VM.
- @menu
- * Why a VM?::
- * VM Concepts::
- * Stack Layout::
- * Variables and the VM::
- * VM Programs::
- * Object File Format::
- * Instruction Set::
- * Just-In-Time Native Code::
- @end menu
- @node Why a VM?
- @subsection Why a VM?
- @cindex interpreter
- For a long time, Guile only had a Scheme interpreter, implemented in C.
- Guile's interpreter operated directly on the S-expression representation
- of Scheme source code.
- But while the interpreter was highly optimized and hand-tuned, it still
- performed many needless computations during the course of evaluating a
- Scheme expression. For example, application of a function to arguments
- needlessly consed up the arguments in a list. Evaluation of an
- expression like @code{(f x y)} always had to figure out whether @var{f}
- was a procedure, or a special form like @code{if}, or something else.
- The interpreter represented the lexical environment as a heap data
- structure, so every evaluation caused allocation, which was of course
- slow. Et cetera.
- The solution to the slow-interpreter problem was to compile the
- higher-level language, Scheme, into a lower-level language for which all
- of the checks and dispatching have already been done---the code is
- instead stripped to the bare minimum needed to ``do the job''.
- The question becomes then, what low-level language to choose? There are
- many options. We could compile to native code directly, but that poses
- portability problems for Guile, as it is a highly cross-platform
- project.
- So we want the performance gains that compilation provides, but we
- also want to maintain the portability benefits of a single code path.
- The obvious solution is to compile to a virtual machine that is
- present on all Guile installations.
- The easiest (and most fun) way to depend on a virtual machine is to
- implement the virtual machine within Guile itself. Guile contains a
- bytecode interpreter (written in C) and a Scheme to bytecode compiler
- (written in Scheme). This way the virtual machine provides what Scheme
- needs (tail calls, multiple values, @code{call/cc}) and can provide
- optimized inline instructions for Guile as well (GC-managed allocations,
- type checks, etc.).
- Guile also includes a just-in-time (JIT) compiler to translate bytecode
- to native code. Because Guile embeds a portable code generation library
- (@url{https://gitlab.com/wingo/lightening}), we keep the benefits of
- portability while also benefiting from fast native code. To avoid too
- much time spent in the JIT compiler itself, Guile is tuned to only emit
- machine code for bytecode that is called often.
- The rest of this section describes the VM that Guile implements, and
- the compiled procedures that run on it.
- Before moving on, though, we should note that though we spoke of the
- interpreter in the past tense, Guile still has an interpreter. The
- difference is that before, it was Guile's main Scheme implementation,
- and so was implemented in highly optimized C; now, it is actually
- implemented in Scheme, and compiled down to VM bytecode, just like any
- other program. (There is still a C interpreter around, used to
- bootstrap the compiler, but it is not normally used at runtime.)
- The upside of implementing the interpreter in Scheme is that we preserve
- tail calls and multiple-value handling between interpreted and compiled
- code, and with the advent of the JIT compiler in Guile 3.0 we reach
- the speed of the old hand-tuned C implementation; it's the best of both
- worlds.
- Also note that this decision to implement a bytecode compiler does not
- preclude ahead-of-time native compilation. More possibilities are
- discussed in @ref{Extending the Compiler}.
- @node VM Concepts
- @subsection VM Concepts
- The bytecode in a Scheme procedure is interpreted by a virtual machine
- (VM). Each thread has its own instantiation of the VM. The virtual
- machine executes the sequence of instructions in a procedure.
- Each VM instruction starts by indicating which operation it is, and then
- follows by encoding its source and destination operands. Each procedure
- declares that it has some number of local variables, including the
- function arguments. These local variables form the available operands
- of the procedure, and are accessed by index.
- The local variables for a procedure are stored on a stack. Calling a
- procedure typically enlarges the stack, and returning from a procedure
- shrinks it. Stack memory is exclusive to the virtual machine that owns
- it.
- In addition to their stacks, virtual machines also have access to the
- global memory (modules, global bindings, etc) that is shared among other
- parts of Guile, including other VMs.
- The registers that a VM has are as follows:
- @itemize
- @item ip - Instruction pointer
- @item sp - Stack pointer
- @item fp - Frame pointer
- @end itemize
- In other architectures, the instruction pointer is sometimes called the
- ``program counter'' (pc). This set of registers is pretty typical for
- virtual machines; their exact meanings in the context of Guile's VM are
- described in the next section.
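
To make these ideas concrete, here is a minimal sketch of a
register-style machine written in Scheme. It is a toy model and not
Guile's actual VM: the instruction names (@code{load}, @code{add},
@code{ret}), the list encoding of instructions, and the procedure
@code{run-toy-vm} are all invented for illustration. As described
above, each instruction names its operation first, followed by its
destination and source operands, which are indices into the procedure's
locals.

@example
;; A toy register machine (hypothetical; not Guile's instruction set).
;; CODE is a vector of instructions, NLOCALS the number of local slots.
(define (run-toy-vm code nlocals)
  (let ((locals (make-vector nlocals 0)))
    (let loop ((ip 0))              ; ip: index of the next instruction
      (let ((instr (vector-ref code ip)))
        (case (car instr)
          ;; (load DST CONSTANT)
          ((load)
           (vector-set! locals (cadr instr) (caddr instr))
           (loop (+ ip 1)))
          ;; (add DST A B)
          ((add)
           (vector-set! locals (cadr instr)
                        (+ (vector-ref locals (caddr instr))
                           (vector-ref locals (cadddr instr))))
           (loop (+ ip 1)))
          ;; (ret SRC)
          ((ret)
           (vector-ref locals (cadr instr)))
          (else
           (error "unknown instruction" instr)))))))

(run-toy-vm (vector '(load 0 40)
                    '(load 1 2)
                    '(add 2 0 1)
                    '(ret 2))
            3)
;; => 42
@end example

The model has no stack pointer or frame pointer because it runs a
single procedure; those registers only become interesting once calls
create frames.
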
- @node Stack Layout
- @subsection Stack Layout
- The stack of Guile's virtual machine is composed of @dfn{frames}. Each
- frame corresponds to the application of one compiled procedure, and
- contains storage space for arguments, local variables, and some
- bookkeeping information (such as what to do after the frame is
- finished).
- While the compiler is free to do whatever it wants to, as long as the
- semantics of a computation are preserved, in practice every time you
- call a function, a new frame is created. (The notable exception of
- course is the tail call case, @pxref{Tail Calls}.)
- The structure of the top stack frame is as follows:
- @example
- | ...previous frame locals...  |
- +==============================+ <- fp + 3
- | Dynamic link                 |
- +------------------------------+
- | Virtual return address (vRA) |
- +------------------------------+
- | Machine return address (mRA) |
- +==============================+ <- fp
- | Local 0                      |
- +------------------------------+
- | Local 1                      |
- +------------------------------+
- | ...                          |
- +------------------------------+
- | Local N-1                    |
- \------------------------------/ <- sp
- @end example
- In the above drawing, the stack grows downward. At the beginning of a
- function call, the procedure being applied is in local 0, followed by
- the arguments from local 1. After the procedure checks that it is being
- passed a compatible set of arguments, the procedure allocates some
- additional space in the frame to hold variables local to the function.
- Note that once a value in a local variable slot is no longer needed,
- Guile is free to re-use that slot. This applies to the slots that were
- initially used for the callee and arguments, too. For this reason,
- backtraces in Guile aren't always able to show all of the arguments: it
- could be that the slot corresponding to that argument was re-used by
- some other variable.
- The @dfn{virtual return address} is the @code{ip} that was in effect
- before this program was applied. When we return from this activation
- frame, we will jump back to this @code{ip}. Likewise, the @dfn{dynamic
- link} is the offset of the @code{fp} that was in effect before this
- program was applied, relative to the current @code{fp}.
- There are two return addresses: the virtual return address (vRA), and
- the machine return address (mRA). The vRA is always present and
- indicates a bytecode address. The mRA is only present when a call is
- made from a function with machine code (e.g. a function that has been
- JIT-compiled).
- To prepare for a non-tail application, Guile's VM will emit code that
- shuffles the function to apply and its arguments into appropriate stack
- slots, with three free slots below them. The call then initializes
- those free slots to hold the machine return address (or NULL), the
- virtual return address, and the offset to the previous frame pointer
- (@code{fp}). It then gets the @code{ip} for the function being called
- and adjusts @code{fp} to point to the new call frame.
- In this way, the dynamic link links the current frame to the previous
- frame. Computing a stack trace involves traversing these frames.
- Each stack local in Guile is 64 bits wide, even on 32-bit architectures.
- This allows Guile to preserve its uniform treatment of stack locals
- while allowing for unboxed arithmetic on 64-bit integers and
- floating-point numbers. @xref{Instruction Set}, for more on unboxed
- arithmetic.
- As an implementation detail, we actually store the dynamic link as an
- offset and not an absolute value because the stack can move at runtime
- as it expands or during partial continuation calls. If it were an
- absolute value, we would have to walk the frames, relocating frame
- pointers.
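
The following sketch models frame traversal with relative dynamic
links. It is a simplified illustration, not Guile's implementation: the
stack is an ordinary vector that grows toward lower indices, and the
helper names (@code{frame-dynamic-link}, @code{frame-local},
@code{walk-frames}, @code{relocate}) are made up for this example. The
exact slot positions (dynamic link at @code{fp} + 2, local 0 at
@code{fp} - 1) are an assumed reading of the diagram above, and a link
of 0 marking the oldest frame is a convention of the model only.

@example
;; Toy model of frame walking (hypothetical; not Guile's code).
(define (frame-dynamic-link stack fp) (vector-ref stack (+ fp 2)))
(define (frame-local stack fp n) (vector-ref stack (- fp 1 n)))

;; Collect local 0 of every frame, newest first.
(define (walk-frames stack fp)
  (let ((link (frame-dynamic-link stack fp)))
    (cons (frame-local stack fp 0)
          (if (zero? link)
              '()
              (walk-frames stack (+ fp link))))))

;; Three frames: main called f, which called g.
(define (make-demo-stack)
  (let ((stack (make-vector 16 #f)))
    (vector-set! stack 14 0) (vector-set! stack 11 'main) ; fp = 12
    (vector-set! stack 9 5)  (vector-set! stack 6 'f)     ; fp = 7
    (vector-set! stack 5 4)  (vector-set! stack 2 'g)     ; fp = 3
    stack))

;; Copy the stack into a larger vector, shifted by DELTA slots.
(define (relocate stack delta)
  (let ((new (make-vector (+ (vector-length stack) delta) #f)))
    (do ((i 0 (+ i 1)))
        ((= i (vector-length stack)) new)
      (vector-set! new (+ i delta) (vector-ref stack i)))))

(walk-frames (make-demo-stack) 3)
;; => (g f main)
(walk-frames (relocate (make-demo-stack) 8) (+ 3 8))
;; => (g f main)
@end example

Because each link is an offset, moving the whole stack leaves the
stored links untouched; only the starting @code{fp} has to be adjusted.
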
- @node Variables and the VM
- @subsection Variables and the VM
- Consider the following Scheme code as an example:
- @example
- (define (foo a)
- (lambda (b) (vector foo a b)))
- @end example
- Within the lambda expression, @code{foo} is a top-level variable,
- @code{a} is a lexically captured variable, and @code{b} is a local
- variable.
- Another way to refer to @code{a} and @code{b} is to say that @code{a} is
- a ``free'' variable, since it is not defined within the lambda, and
- @code{b} is a ``bound'' variable. These are the terms used in the
- @dfn{lambda calculus}, a mathematical notation for describing functions.
- The lambda calculus is useful because it is a language in which to
- reason precisely about functions and variables. It is especially good
- at describing scope relations, and it is for that reason that we mention
- it here.
- Guile allocates all variables on the stack. When a lexically enclosed
- procedure with free variables---a @dfn{closure}---is created, it copies
- those variables into its free variable vector. References to free
- variables are then redirected through the free variable vector.
- If a variable is ever @code{set!}, however, it will need to be
- heap-allocated instead of stack-allocated, so that different closures
- that capture the same variable can see the same value. Also, this
- allows continuations to capture a reference to the variable, instead
- of to its value at one point in time. For these reasons, @code{set!}
- variables are allocated in ``boxes''---actually, in variable cells.
- @xref{Variables}, for more information. References to @code{set!}
- variables are indirected through the boxes.
- Thus, perhaps counterintuitively, what would seem ``closer to the
- metal'', namely @code{set!}, actually forces an extra memory allocation
- and indirection. Sometimes Guile's optimizer can remove this
- allocation, but not always.
- Going back to our example, @code{b} may be allocated on the stack, as
- it is never mutated.
- @code{a} may also be allocated on the stack, as it too is never
- mutated. Within the enclosed lambda, its value will be copied into
- (and referenced from) the free variables vector.
- @code{foo} is a top-level variable, because @code{foo} is not
- lexically bound in this example.
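
The need for boxes can be seen in a small example in which two closures
share one mutated variable. The second definition is a hand-written
approximation of what boxing amounts to, using Guile's variable objects
(@pxref{Variables}); it is a sketch for illustration, not the code that
the compiler generates, and the names @code{make-counter} and
@code{make-boxed-counter} are invented here.

@example
;; Two closures that share one mutated variable.  Because N is the
;; target of set!, both closures must see the same storage location.
(define (make-counter)
  (let ((n 0))
    (cons (lambda () (set! n (+ n 1)) n)    ; increment
          (lambda () n))))                  ; read

;; Roughly the same thing with an explicit box: each closure captures
;; the box itself, not the value it held at capture time.
(define (make-boxed-counter)
  (let ((box (make-variable 0)))
    (cons (lambda ()
            (variable-set! box (+ (variable-ref box) 1))
            (variable-ref box))
          (lambda () (variable-ref box)))))

(define c (make-counter))
((car c))
((car c))
((cdr c))
;; => 2, the reader sees both increments
@end example

Had each closure simply copied the value of @code{n} when it was
created, the reader would still report 0.
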
- @node VM Programs
- @subsection Compiled Procedures are VM Programs
- By default, when you enter expressions at Guile's REPL, they are
- first compiled to bytecode. Then that bytecode is executed to produce a
- value. If the expression evaluates to a procedure, the result of this
- process is a compiled procedure.
- A compiled procedure is a compound object consisting of its bytecode and
- a reference to any captured lexical variables. In addition, when a
- procedure is compiled, it has associated metadata written to side
- tables, for instance a line number mapping, or its docstring. You can
- pick apart these pieces with the accessors in @code{(system vm
- program)}. @xref{Compiled Procedures}, for a full API reference.
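
As a brief sketch, the following uses a few of those accessors on the
@code{foo} example from the previous section. It assumes the
expressions are entered one by one at the REPL; the results are left
unstated because they depend on how the code was compiled, and an
optimizer is free to capture fewer variables than the source suggests.

@example
(use-modules (system vm program))

(define (foo a)
  (lambda (b) (vector foo a b)))

(define closure (foo 42))

(program? closure)                     ; is it a compiled procedure?
(program-num-free-variables closure)   ; number of captured variables
(program-code closure)                 ; address of its bytecode

;; Inspect the first captured variable, if there is one.
(if (> (program-num-free-variables closure) 0)
    (program-free-variable-ref closure 0)
    'no-free-variables)
@end example
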
- A procedure may reference data that was statically allocated when the
- procedure was compiled. For example, a pair of immediate objects
- (@pxref{Immediate Objects}) can be allocated directly in the memory
- segment that contains the compiled bytecode, and accessed directly by
- the bytecode.
- Another use for statically allocated data is to serve as a cache for a
- bytecode. Top-level variable lookups are handled in this way; the first
- time a top-level binding is referenced, the resolved variable will be
- stored in a cache. Thereafter all access to the variable goes through
- the cache cell. The variable's value may change in the future, but the
- variable itself will not.
- We can see how these concepts tie together by disassembling the
- @code{foo} function we defined earlier to see what is going on:
- @smallexample
- scheme@@(guile-user)> (define (foo a) (lambda (b) (vector foo a b)))
- scheme@@(guile-user)> ,x foo
- Disassembly of #<procedure foo (a)> at #xf1da30:
- 0 (instrument-entry 164) at (unknown file):5:0
- 2 (assert-nargs-ee/locals 2 1) ;; 3 slots (1 arg)
- 3 (allocate-words/immediate 2 3) at (unknown file):5:16
- 4 (load-u64 0 0 65605)
- 7 (word-set!/immediate 2 0 0)
- 8 (load-label 0 7) ;; anonymous procedure at #xf1da6c
- 10 (word-set!/immediate 2 1 0)
- 11 (scm-set!/immediate 2 2 1)
- 12 (reset-frame 1) ;; 1 slot
- 13 (handle-interrupts)
- 14 (return-values)
- ----------------------------------------
- Disassembly of anonymous procedure at #xf1da6c:
- 0 (instrument-entry 183) at (unknown file):5:16
- 2 (assert-nargs-ee/locals 2 3) ;; 5 slots (1 arg)
- 3 (static-ref 2 152) ;; #<variable 112e530 value: #<procedure foo (a)>>
- 5 (immediate-tag=? 2 7 0) ;; heap-object?
- 7 (je 19) ;; -> L2
- 8 (static-ref 2 119) ;; #<directory (guile-user) ca9750>
- 10 (static-ref 1 127) ;; foo
- 12 (call-scm<-scm-scm 2 2 1 40)
- 14 (immediate-tag=? 2 7 0) ;; heap-object?
- 16 (jne 8) ;; -> L1
- 17 (scm-ref/immediate 0 2 1)
- 18 (immediate-tag=? 0 4095 2308) ;; undefined?
- 20 (je 4) ;; -> L1
- 21 (static-set! 2 134) ;; #<variable 112e530 value: #<procedure foo (a)>>
- 23 (j 3) ;; -> L2
- L1:
- 24 (throw/value 1 151) ;; #(unbound-variable #f "Unbound variable: ~S")
- L2:
- 26 (scm-ref/immediate 2 2 1)
- 27 (allocate-words/immediate 1 4) at (unknown file):5:28
- 28 (load-u64 0 0 781)
- 31 (word-set!/immediate 1 0 0)
- 32 (scm-set!/immediate 1 1 2)
- 33 (scm-ref/immediate 4 4 2)
- 34 (scm-set!/immediate 1 2 4)
- 35 (scm-set!/immediate 1 3 3)
- 36 (mov 4 1)
- 37 (reset-frame 1) ;; 1 slot
- 38 (handle-interrupts)
- 39 (return-values)
- @end smallexample
- The first thing to notice is that the bytecode is at a fairly low level.
- When a program is compiled from Scheme to bytecode, it is expressed in
- terms of more primitive operations. As such, there can be more
- instructions than you might expect.
- The first chunk of instructions is the outer @code{foo} procedure. It
- is followed by the code for the contained closure. The code can look
- daunting at first glance, but with practice it quickly becomes
- comprehensible, and indeed being able to read bytecode is an important
- step to understanding the low-level performance of Guile programs.
- The @code{foo} function begins with a prelude. The
- @code{instrument-entry} bytecode increments a counter associated with
- the function. If the counter reaches a certain threshold, Guile will
- emit machine code (``JIT-compile'') for @code{foo}. Emitting machine
- code is fairly cheap but it does take time, so it's not something you
- want to do for every function. Using a per-function counter and a
- global threshold allows Guile to spend time JIT-compiling only the
- ``hot'' functions.
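
The counting scheme can be sketched in a few lines of Scheme. This is a
toy model only: the threshold value, the hash tables, and the
procedures @code{instrument-entry!} and @code{hot?} are hypothetical
and do not reflect how Guile stores its counters or emits code.

@example
;; Toy model of counter-based JIT triggering (hypothetical names and
;; threshold; not Guile's implementation).
(define jit-threshold 3)                ; assumed global threshold
(define counters (make-hash-table))     ; per-function call counters
(define compiled (make-hash-table))     ; functions marked as hot

;; Called on entry to NAME, like a much-simplified instrument-entry.
(define (instrument-entry! name)
  (let ((count (+ 1 (hashq-ref counters name 0))))
    (hashq-set! counters name count)
    (when (and (>= count jit-threshold)
               (not (hashq-ref compiled name)))
      ;; A real VM would emit machine code at this point.
      (hashq-set! compiled name #t))))

(define (hot? name) (hashq-ref compiled name #f))

(instrument-entry! 'foo)
(instrument-entry! 'foo)
(hot? 'foo)    ; => #f, only two calls so far
(instrument-entry! 'foo)
(hot? 'foo)    ; => #t, threshold reached
@end example
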
- Next in the prelude is an argument-checking instruction, which checks
- that it was called with only 1 argument (plus the callee function itself
- makes 2) and then reserves stack space for an additional 1 local.
- Then from @code{ip} 3 to 11, we allocate a new closure by allocating a
- three-word object, initializing its first word to store a type tag,
- setting its second word to its code pointer, and finally at @code{ip}
- 11, storing local value 1 (the @code{a} argument) into the third word
- (the first free variable).
- Before returning, @code{foo} ``resets the frame'' to hold only one local
- (the return value), runs any pending interrupts (@pxref{Asyncs}) and
- then returns.
- Note that local variables in Guile's virtual machine are usually
- addressed relative to the stack pointer, which leads to a pleasantly
- efficient @code{sp[@var{n}]} access. However it can make the
- disassembly hard to read, because the @code{sp} can change during the
- function, and because incoming arguments are relative to the @code{fp},
- not the @code{sp}.
- To know what @code{fp}-relative slot corresponds to an
- @code{sp}-relative reference, scan up in the disassembly until you get
- to a ``@var{n} slots'' annotation; in our case, 3, indicating that the
- frame has space for 3 slots. Thus a zero-indexed @code{sp}-relative
- slot of 2 corresponds to the @code{fp}-relative slot of 0, which
- initially held the value of the closure being called. This means that
- Guile doesn't need the value of the closure to compute its result, and
- so slot 0 was free for re-use, in this case for the result of making a
- new closure.
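
The conversion described above is simple arithmetic, sketched here as a
one-line helper. The name @code{sp->fp-slot} is made up for this
example; the frame size is the number from the nearest preceding
``@var{n} slots'' annotation.

@example
;; Convert an sp-relative slot index to an fp-relative one.
(define (sp->fp-slot nslots sp-index)
  (- nslots 1 sp-index))

(sp->fp-slot 3 2)   ; => 0, the slot that initially held the closure
(sp->fp-slot 3 0)   ; => 2, the most recently allocated local
@end example
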
- A closure is code with data. As you can see, making the closure
- involved making an object (@code{ip} 3), putting a code pointer in it
- (@code{ip} 8 and 10), and putting in the closure's free variable
- (@code{ip} 11).
- The second stanza disassembles the code for the closure. After the
- prelude, all of the code between @code{ip} 5 and 24 is related to
- loading the toplevel variable @code{foo} into slot 2. This lookup
- happens only once, and is associated with a cache; after the first run,
- the value in the cache will be a bound variable, and the code will jump
- from @code{ip} 7 to 26. On the first run, Guile gets the module
- associated with the function, calls out to a run-time routine to look up
- the variable, and checks that the variable is bound before initializing
- the cache. Either way, @code{ip} 26 dereferences the variable into
- local 2.
- What follows is the allocation and initialization of the vector return
- value. The instruction at @code{ip} 27 does the allocation, and the
- following two instructions initialize the type-and-length tag for the
- object's first word. The instruction at @code{ip} 32 sets word 1 of the
- object (the first vector slot) to the value of @code{foo}; @code{ip} 33
- fetches the closure variable for @code{a}, which @code{ip} 34 then
- stores in the second vector slot; and finally, at @code{ip} 35, local
- @code{b} is stored to the third vector slot. This is followed by the
- return sequence.
- @node Object File Format
- @subsection Object File Format
- To compile a file to disk, we need a format in which to write the
- compiled code to disk, and later load it into Guile. A good @dfn{object
- file format} has a number of characteristics:
- @itemize
- @item Above all else, it should be very cheap to load a compiled file.
- @item It should be possible to statically allocate constants in the
- file. For example, a bytevector literal in source code can be emitted
- directly into the object file.
- @item The compiled file should enable maximum code and data sharing
- between different processes.
- @item The compiled file should contain debugging information, such as
- line numbers, but that information should be separated from the code
- itself. It should be possible to strip debugging information if space
- is tight.
- @end itemize
- These characteristics are not specific to Scheme. Indeed, mainstream
- languages like C and C++ have solved this issue many times in the past.
- Guile builds on their work by adopting ELF, the object file format of
- GNU and other Unix-like systems, as its object file format. Although
- Guile uses ELF on all platforms, we do not use platform support for ELF.
- Guile implements its own linker and loader. The advantage of using ELF
- is not sharing code, but sharing ideas. ELF is simply a well-designed
- object file format.
- An ELF file has two meta-tables describing its contents. The first
- meta-table is for the loader, and is called the @dfn{program table} or
- sometimes the @dfn{segment table}. The program table divides the file
- into big chunks that should be treated differently by the loader.
- Mostly the difference between these @dfn{segments} is their
- permissions.
- Typically all segments of an ELF file are marked as read-only, except
- the part that represents modifiable static data or static data that
- needs load-time initialization. Loading an ELF file is as simple as
- mmapping the thing into memory with read-only permissions, then using
- the segment table to mark a small sub-region of the file as writable.
- This writable section is typically added to the root set of the garbage
- collector as well.
- One ELF segment is marked as ``dynamic'', meaning that it has data of
- interest to the loader. Guile uses this segment to record the Guile
- version corresponding to this file. There is also an entry in the
- dynamic segment that points to the address of an initialization thunk
- that is run to perform any needed link-time initialization. (This is
- like dynamic relocations for normal ELF shared objects, except that we
- compile the relocations as a procedure instead of having the loader
- interpret a table of relocations.) Finally, the dynamic segment marks
- the location of the ``entry thunk'' of the object file. This thunk is
- returned to the caller of @code{load-thunk-from-memory} or
- @code{load-thunk-from-file}. When called, it will execute the ``body''
- of the compiled expression.
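
As a small sketch of this round trip, the following compiles an
expression to an in-memory object file image and loads it back. It
assumes a Guile version in which compiling with @code{#:to 'bytecode}
yields a bytevector (2.2 or later); the variable names @code{image} and
@code{thunk} are arbitrary.

@example
(use-modules (system base compile)   ; compile
             (system vm loader)      ; load-thunk-from-memory
             (rnrs bytevectors))     ; bytevector-u8-ref

;; Compile an expression to an in-memory object file image.
(define image (compile '(+ 1 2) #:to 'bytecode))

;; The image begins with the ELF magic number: 127, then "ELF".
(map (lambda (i) (bytevector-u8-ref image i)) '(0 1 2 3))
;; => (127 69 76 70)

;; Loading it yields the entry thunk; calling the thunk runs the body.
(define thunk (load-thunk-from-memory image))
(thunk)
;; => 3
@end example
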
- The other meta-table in an ELF file is the @dfn{section table}. Whereas
- the program table divides an ELF file into big chunks for the loader,
- the section table specifies small sections for use by introspective
- tools such as debuggers. One segment (program table entry)
- typically contains many sections. There may be sections outside of any
- segment, as well.
- Typical sections in a Guile @code{.go} file include:
- @table @code
- @item .rtl-text
- Bytecode.
- @item .data
- Data that needs initialization, or which may be modified at runtime.
- @item .rodata
- Statically allocated data that needs no run-time initialization, and
- which therefore can be shared between processes.
- @item .dynamic
- The dynamic section, discussed above.
- @item .symtab
- @itemx .strtab
- A table mapping addresses in the @code{.rtl-text} to procedure names.
- @code{.strtab} is used by @code{.symtab}.
- @item .guile.procprops
- @itemx .guile.arities
- @itemx .guile.arities.strtab
- @itemx .guile.docstrs
- @itemx .guile.docstrs.strtab
- Side tables of procedure properties, arities, and docstrings.
- @item .guile.frame-maps
- Side table of frame maps, describing the set of live slots for every
- return point in the program text, and whether those slots are pointers
- or not. Used by the garbage collector.
- @item .debug_info
- @itemx .debug_abbrev
- @itemx .debug_str
- @itemx .debug_loc
- @itemx .debug_line
- Debugging information, in DWARF format. See the DWARF specification,
- for more information.
- @item .shstrtab
- Section name string table.
- @end table
- For more information, see @uref{http://linux.die.net/man/5/elf,,the
- elf(5) man page}. See @uref{http://dwarfstd.org/,the DWARF
- specification} for more on the DWARF debugging format. Or if you are an
- adventurous explorer, try running @code{readelf} or @code{objdump} on
- compiled @code{.go} files. It's good times!
- @node Instruction Set
- @subsection Instruction Set
- There are currently about 150 instructions in Guile's virtual machine.
- These instructions represent atomic units of a program's execution.
- Ideally, they perform one task without conditional branches, then
- dispatch to the next instruction in the stream.
- Instructions themselves are composed of 1 or more 32-bit units. The low
- 8 bits of the first word indicate the opcode, and the rest of the
- instruction describes the operands. There are a number of different ways
- operands can be encoded.
- @table @code
- @item s@var{n}
- An unsigned @var{n}-bit integer, indicating the @code{sp}-relative index
- of a local variable.
- @item f@var{n}
- An unsigned @var{n}-bit integer, indicating the @code{fp}-relative index
- of a local variable. Used when a continuation accepts a variable number
- of values, to shuffle received values into known locations in the
- frame.
- @item c@var{n}
- An unsigned @var{n}-bit integer, indicating a constant value.
- @item l24
- An offset from the current @code{ip}, in 32-bit units, as a signed
- 24-bit value. Indicates a bytecode address, for a relative jump.
- @item zi16
- @itemx i16
- @itemx i32
- An immediate Scheme value (@pxref{Immediate Objects}), encoded directly
- in 16 or 32 bits. @code{zi16} is sign-extended; the others are
- zero-extended.
- @item a32
- @itemx b32
- An immediate Scheme value, encoded as a pair of 32-bit words.
- @code{a32} and @code{b32} values always go together on the same opcode,
- and indicate the high and low bits, respectively. Normally only used on
- 64-bit systems.
- @item n32
- A statically allocated non-immediate. The address of the non-immediate
- is encoded as a signed 32-bit integer, and indicates a relative offset
- in 32-bit units. Think of it as @code{SCM x = ip + offset}.
- @item r32
- Indirect Scheme value, like @code{n32} but indirected. Think of it as
- @code{SCM *x = ip + offset}.
- @item l32
- @itemx lo32
- An ip-relative address, as a signed 32-bit integer. Could indicate a
- bytecode address, as in @code{make-closure}, or a non-immediate address,
- as with @code{static-patch!}.
- @code{l32} and @code{lo32} are the same from the perspective of the
- virtual machine. The difference is that an assembler might want to
- allow an @code{lo32} address to be specified as a label and then some
- number of words offset from that label, for example when patching a
- field of a statically allocated object.
- @item v32:x8-l24
- Almost all VM instructions have a fixed size. The @code{jtable}
- instruction used to perform optimized @code{case} branches is an
- exception, which uses a @code{v32} trailing word to indicate the number
- of additional words in the instruction, which themselves are encoded as
- @code{x8-l24} values.
- @item b1
- A boolean value: 1 for true, otherwise 0.
- @item x@var{n}
- An ignored sequence of @var{n} bits.
- @end table
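As a rough illustration of the operand kinds listed above, the following C
sketch extracts an opcode, an unsigned 24-bit operand, and a sign-extended
@code{l24} offset from a 32-bit instruction word. The helper names are
hypothetical and are not taken from Guile's sources. The sketch assumes only
what the table states: the opcode sits in the low 8 bits, and @code{l24}
offsets are signed and count 32-bit units.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not Guile's actual macros. */

/* The opcode lives in the low 8 bits of the first word. */
static inline uint8_t example_opcode(uint32_t word) {
    return (uint8_t)(word & 0xffu);
}

/* An unsigned 24-bit operand (s24, f24, c24) in the high 24 bits. */
static inline uint32_t example_operand_u24(uint32_t word) {
    return word >> 8;
}

/* An l24 operand: a signed 24-bit offset in the high 24 bits. */
static inline int32_t example_operand_l24(uint32_t word) {
    uint32_t raw = word >> 8;
    /* Portable sign extension of a 24-bit field. */
    return (int32_t)(raw ^ 0x800000u) - 0x800000;
}

/* Relative jump target: ip points at 32-bit units, so plain pointer
   arithmetic applies the offset in the right unit. */
static inline const uint32_t *example_jump_target(const uint32_t *ip,
                                                  uint32_t word) {
    return ip + example_operand_l24(word);
}
```

Because @code{ip} is modelled as a pointer to 32-bit units, adding the decoded
offset moves by whole instruction words rather than by bytes.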
- An instruction is specified by giving its name, then describing its
- operands. The operands are packed by 32-bit words, with earlier
- operands occupying the lower bits.
- For example, consider the following instruction specification:
- @deftypefn Instruction {} call f24:@var{proc} x8:@var{_} c24:@var{nlocals}
- @end deftypefn
- The first word in the instruction will start with the 8-bit value
- corresponding to the @code{call} opcode in the low bits, followed by
- @var{proc} as a 24-bit value. The second word starts with 8 dead bits,
- followed by @var{nlocals} as a 24-bit immediate value.
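The two-word layout of @code{call} can be sketched in C as follows. This is a
minimal illustration under stated assumptions: the opcode number
@code{EXAMPLE_OP_CALL} is invented for the example, and the function names are
hypothetical rather than part of Guile's assembler.

```c
#include <stdint.h>

/* Hypothetical opcode number, chosen only for this example. */
enum { EXAMPLE_OP_CALL = 1 };

/* Encode `call f24:proc x8:_ c24:nlocals` into two 32-bit words.
   Earlier operands occupy the lower bits of each word. */
static void example_encode_call(uint32_t out[2],
                                uint32_t proc, uint32_t nlocals) {
    /* Word 0: 8-bit opcode, then the 24-bit proc operand. */
    out[0] = (uint32_t)EXAMPLE_OP_CALL | ((proc & 0xffffffu) << 8);
    /* Word 1: 8 dead bits (x8), then the 24-bit nlocals operand. */
    out[1] = (nlocals & 0xffffffu) << 8;
}

/* Decode the two operands back out of the encoded words. */
static uint32_t example_call_proc(const uint32_t insn[2]) {
    return insn[0] >> 8;
}

static uint32_t example_call_nlocals(const uint32_t insn[2]) {
    return insn[1] >> 8;
}
```

With these definitions, encoding @var{proc} = 5 and @var{nlocals} = 3 yields
the words @code{0x00000501} and @code{0x00000300}.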
- For instructions with operands that encode references to the stack, the
- interpretation of those stack values is up to the instruction itself.
- Most instructions expect their operands to be tagged SCM values
- (@code{scm} representation), but some instructions expect unboxed
- integers (@code{u64} and @code{s64} representations) or floating-point
- numbers (@code{f64} representation). It is assumed that the bits for a
- @code{u64} value are the same as those for an @code{s64} value, and that
- @code{s64} values are stored in two's complement.
- Instructions have static types: they must receive their operands in the
- format they expect. It's up to the compiler to ensure this is the case.
- Unless otherwise mentioned, all operands and results are in the
- @code{scm} representation.
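The assumption that @code{u64} and @code{s64} share one bit pattern can be
made concrete with a small C sketch. The function names are hypothetical; the
code only shows that reinterpreting the same 64 bits gives the two's
complement reading described above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret the same 64 bits as s64 or u64. */

/* Read a u64 bit pattern as an s64 without changing any bits. */
static int64_t example_u64_bits_as_s64(uint64_t bits) {
    int64_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Read an s64 as a u64; conversion to unsigned is defined modulo 2^64,
   which matches the two's complement bit pattern. */
static uint64_t example_s64_bits_as_u64(int64_t value) {
    return (uint64_t)value;
}
```

An instruction that expects @code{s64} operands can therefore consume a slot
written by a @code{u64}-producing instruction, provided the compiler has
established that this reinterpretation is what the program means.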
- @menu
- * Call and Return Instructions::
- * Function Prologue Instructions::
- * Shuffling Instructions::
- * Trampoline Instructions::
- * Non-Local Control Flow Instructions::
- * Instrumentation Instructions::
- * Intrinsic Call Instructions::
- * Constant Instructions::
- * Memory Access Instructions::
- * Atomic Memory Access Instructions::
- * Tagging and Untagging Instructions::
- * Integer Arithmetic Instructions::
- * Floating-Point Arithmetic Instructions::
- * Comparison Instructions::
- * Branch Instructions::
- * Raw Memory Access Instructions::
- @end menu
- @node Call and Return Instructions
- @subsubsection Call and Return Instructions
- As described earlier (@pxref{Stack Layout}), Guile's calling convention
- is that arguments are passed and values returned on the stack.
- For calls, both in tail position and in non-tail position, we require
- that the procedure and the arguments already be shuffled into place
- before the call instruction. ``Into place'' for a tail call means that
- the procedure should be in slot 0, relative to the @code{fp}, and the
- arguments should follow. For a non-tail call, if the procedure is in
- @code{fp}-relative slot @var{n}, the arguments should follow from slot
- @var{n}+1, and there should be three free slots between @var{n}-1 and
- @var{n}-3 in which to save the mRA, vRA, and @code{fp}.
- Returning values is similar. Multiple-value returns should have values
- already shuffled down to start from @code{fp}-relative slot 0 before
- emitting @code{return-values}.
- In both calls and returns, the @code{sp} is used to indicate to the
- callee or caller the number of arguments or return values, respectively.
- After receiving return values, it is the caller's responsibility to
- @dfn{restore the frame} by resetting the @code{sp} to its former value.
- @deftypefn Instruction {} call f24:@var{proc} x8:@var{_} c24:@var{nlocals}
- Call a procedure. @var{proc} is the local corresponding to a procedure.
- The three values below @var{proc} will be overwritten by the saved call
- frame data. The new frame will have space for @var{nlocals} locals: one
- for the procedure, and the rest for the arguments which should already
- have been pushed on.
- When the call returns, execution proceeds with the next instruction.
- There may be any number of values on the return stack; the precise
- number can be had by subtracting the address of @var{proc}-1 from the
- post-call @code{sp}.
- @end deftypefn
- @deftypefn Instruction {} call-label f24:@var{proc} x8:@var{_} c24:@var{nlocals} l32:@var{label}
- Call a procedure in the same compilation unit.
- This instruction is just like @code{call}, except that instead of
- dereferencing @var{proc} to find the call target, the call target is
- known to be at @var{label}, a signed 32-bit offset in 32-bit units from
- the current @code{ip}. Since @var{proc} is not dereferenced, it may be
- some other representation of the closure.
- @end deftypefn
- @deftypefn Instruction {} tail-call x24:@var{_}
- Tail-call a procedure. Requires that the procedure and all of the
- arguments have already been shuffled into position, and that the frame
- has already been reset to the number of arguments to the call.
- @end deftypefn
- @deftypefn Instruction {} tail-call-label x24:@var{_} l32:@var{label}
- Tail-call a known procedure. As @code{call} is to @code{call-label},
- @code{tail-call} is to @code{tail-call-label}.
- @end deftypefn
- @deftypefn Instruction {} return-values x24:@var{_}
- Return a number of values from a call frame. The return values should
- have already been shuffled down to a contiguous array starting at slot
- 0, and the frame already reset.
- @end deftypefn
- @deftypefn Instruction {} receive f12:@var{dst} f12:@var{proc} x8:@var{_} c24:@var{nlocals}
- Receive a single return value from a call whose procedure was in
- @var{proc}, asserting that the call actually returned at least one
- value. Afterwards, resets the frame to @var{nlocals} locals.
- @end deftypefn
- @deftypefn Instruction {} receive-values f24:@var{proc} b1:@var{allow-extra?} x7:@var{_} c24:@var{nvalues}
- Receive a return of multiple values from a call whose procedure was in
- @var{proc}. If fewer than @var{nvalues} values were returned, signal an
- error. Unless @var{allow-extra?} is true, require that the number of
- return values equals @var{nvalues} exactly. After @code{receive-values}
- has run, the values can be copied down via @code{mov}, or used in place.
- @end deftypefn
- @node Function Prologue Instructions
- @subsubsection Function Prologue Instructions
- A function call in Guile is very cheap: the VM simply hands control to
- the procedure. The procedure itself is responsible for asserting that it
- has been passed an appropriate number of arguments. This strategy allows
- arbitrarily complex argument parsing idioms to be developed, without
- harming the common case.
- For example, only calls to keyword-argument procedures ``pay'' for the
- cost of parsing keyword arguments. (At the time of this writing, calling
- procedures with keyword arguments is typically two to four times as
- costly as calling procedures with a fixed set of arguments.)
- @deftypefn Instruction {} assert-nargs-ee c24:@var{expected}
- @deftypefnx Instruction {} assert-nargs-ge c24:@var{expected}
- @deftypefnx Instruction {} assert-nargs-le c24:@var{expected}
- If the number of actual arguments is not @code{==}, @code{>=}, or
- @code{<=} @var{expected}, respectively, signal an error.
- The number of arguments is determined by subtracting the stack pointer
- from the frame pointer (@code{fp - sp}). @xref{Stack Layout}, for more
- details on stack frames. Note that @var{expected} includes the
- procedure itself.
- @end deftypefn
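The argument-count checks can be sketched in C as below. This is an
illustrative model, not Guile's implementation: the @code{example_slot} type
and the function names are hypothetical, and the sketch assumes a stack that
grows towards lower addresses so that @code{fp - sp} is the number of slots in
the frame.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one stack slot. */
typedef uint64_t example_slot;

/* Number of values in the frame, including the procedure itself. */
static ptrdiff_t example_frame_nargs(const example_slot *fp,
                                     const example_slot *sp) {
    return fp - sp;
}

/* Each predicate returns nonzero when the check passes; the real
   instructions signal an error otherwise. */
static int example_nargs_ee(const example_slot *fp, const example_slot *sp,
                            ptrdiff_t expected) {
    return example_frame_nargs(fp, sp) == expected;
}

static int example_nargs_ge(const example_slot *fp, const example_slot *sp,
                            ptrdiff_t expected) {
    return example_frame_nargs(fp, sp) >= expected;
}

static int example_nargs_le(const example_slot *fp, const example_slot *sp,
                            ptrdiff_t expected) {
    return example_frame_nargs(fp, sp) <= expected;
}
```

For a procedure of two arguments, @var{expected} would be 3 in this model,
since the procedure itself occupies a slot.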
- @deftypefn Instruction {} arguments<=? c24:@var{expected}
- Set the @code{LESS_THAN}, @code{EQUAL}, or @code{NONE} comparison result
- values if the number of arguments is respectively less than, equal to,
- or greater than @var{expected}.
- @end deftypefn
- @deftypefn Instruction {} positional-arguments<=? c24:@var{nreq} x8:@var{_} c24:@var{expected}
- Set the @code{LESS_THAN}, @code{EQUAL}, or @code{NONE} comparison result
- values if the number of positional arguments is respectively less than,
- equal to, or greater than @var{expected}. The first @var{nreq}
- arguments are positional arguments, as are the subsequent arguments that
- are not keywords.
- @end deftypefn
- The @code{arguments<=?} and @code{positional-arguments<=?} instructions
- are used to implement multiple arities, as in @code{case-lambda}.
- @xref{Case-lambda}, for more information. @xref{Branch Instructions},
- for more on comparison results.
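A minimal C sketch of how @code{arguments<=?} maps an argument count onto a
comparison result follows. The enumeration and function names are hypothetical
stand-ins for the VM's @code{LESS_THAN}, @code{EQUAL}, and @code{NONE} states;
a later branch instruction would inspect the stored result to select a
@code{case-lambda} clause.

```c
/* Hypothetical names mirroring the three comparison results. */
typedef enum {
    EXAMPLE_NONE,       /* more arguments than expected */
    EXAMPLE_EQUAL,      /* exactly the expected number */
    EXAMPLE_LESS_THAN   /* fewer arguments than expected */
} example_compare_result;

/* Classify the actual argument count against the expected count. */
static example_compare_result example_arguments_le(unsigned nargs,
                                                   unsigned expected) {
    if (nargs < expected)
        return EXAMPLE_LESS_THAN;
    if (nargs == expected)
        return EXAMPLE_EQUAL;
    return EXAMPLE_NONE;
}
```

A clause that accepts at most @var{expected} arguments matches when the result
is either @code{EXAMPLE_LESS_THAN} or @code{EXAMPLE_EQUAL}.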
- @deftypefn Instruction {} bind-kwargs c24:@var{nreq} c8:@var{flags} c24:@var{nreq-and-opt} x8:@var{_} c24:@var{ntotal} n32:@var{kw-offset}
- @var{flags} is a bitfield, whose lowest bit is @var{allow-other-keys},
- second bit is @var{has-rest}, and whose following six bits are unused.
- Find the last positional argument, and shuffle all the rest above
- @var{ntotal}. Initialize the intervening locals to
- @code{SCM_UNDEFINED}. Then load the constant at @var{kw-offset} words
- from the current @var{ip}, and use it and the @var{allow-other-keys}
- flag to bind keyword arguments. If @var{has-rest}, collect all shuffled
- arguments into a list, and store it in @var{nreq-and-opt}. Finally,
- clear the arguments that we shuffled up.
- The parsing is driven by a keyword arguments association list, looked up
- using @var{kw-offset}. The alist is a list of pairs of the form
- @code{(@var{kw} . @var{index})}, mapping keyword arguments to their
- local slot indices. Unless @code{allow-other-keys} is set, the parser
- will signal an error if an unknown key is found.
- A macro-mega-instruction.
- @end deftypefn
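The @var{flags} bitfield of @code{bind-kwargs} can be decoded as in the
following C sketch. The constant and function names are hypothetical; only the
bit positions (lowest bit for @var{allow-other-keys}, second bit for
@var{has-rest}) come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the two documented flag bits. */
enum {
    EXAMPLE_KW_ALLOW_OTHER_KEYS = 1u << 0,  /* lowest bit */
    EXAMPLE_KW_HAS_REST         = 1u << 1   /* second bit */
};

/* True when unknown keywords should be accepted silently. */
static bool example_allow_other_keys(uint8_t flags) {
    return (flags & EXAMPLE_KW_ALLOW_OTHER_KEYS) != 0;
}

/* True when leftover arguments should be collected into a rest list. */
static bool example_has_rest(uint8_t flags) {
    return (flags & EXAMPLE_KW_HAS_REST) != 0;
}
```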
- @deftypefn Instruction {} bind-optionals f24:@var{nlocals}
- Expand the current frame to have at least @var{nlocals} locals, filling
- in any fresh values with @code{SCM_UNDEFINED}. If the frame has more
- than @var{nlocals} locals, it is left as it is.
- @end deftypefn
- @deftypefn Instruction {} bind-rest f24:@var{dst}
- Collect any arguments at or above @var{dst} into a list, and store that
- list at @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} alloc-frame c24:@var{nlocals}
- Ensure that there is space on the stack for @var{nlocals} local
- variables. The value of any new local is undefined.
- @end deftypefn
- @deftypefn Instruction {} reset-frame c24:@var{nlocals}
- Like @code{alloc-frame}, but doesn't check that the stack is big enough,
- and doesn't initialize values to @code{SCM_UNDEFINED}. Used to reset
- the frame size to something less than the size that was previously set
- via @code{alloc-frame}.
- @end deftypefn
- @deftypefn Instruction {} assert-nargs-ee/locals c12:@var{expected} c12:@var{nlocals}
- Equivalent to a sequence of @code{assert-nargs-ee} and
- @code{alloc-frame}. The number of locals reserved is @var{expected}
- + @var{nlocals}.
- @end deftypefn
- @node Shuffling Instructions
- @subsubsection Shuffling Instructions
- These instructions are used to move around values on the stack.
- @deftypefn Instruction {} mov s12:@var{dst} s12:@var{src}
- @deftypefnx Instruction {} long-mov s24:@var{dst} x8:@var{_} s24:@var{src}
- Copy a value from one local slot to another.
- As discussed previously, procedure arguments and local variables are
- allocated to local slots. Guile's compiler tries to avoid shuffling
- variables around to different slots, which often makes @code{mov}
- instructions redundant. However, there are some cases in which shuffling
- is necessary, and in those cases, @code{mov} is the thing to use.
- @end deftypefn
- @deftypefn Instruction {} long-fmov f24:@var{dst} x8:@var{_} f24:@var{src}
- Copy a value from one local slot to another, but addressing slots
- relative to the @code{fp} instead of the @code{sp}. This is used when
- shuffling values into place after multiple-value returns.
- @end deftypefn
- @deftypefn Instruction {} push s24:@var{src}
- Bump the stack pointer by one word, and fill it with the value from slot
- @var{src}. The offset to @var{src} is calculated before the stack
- pointer is adjusted.
- @end deftypefn
- The @code{push} instruction is used when another instruction is unable
- to address an operand because the operand is encoded with fewer than 24
- bits. In that case, Guile's assembler will transparently emit code that
- temporarily pushes any needed operands onto the stack, emits the
- original instruction to address those now-near variables, then shuffles
- the result (if any) back into place.
- @deftypefn Instruction {} pop s24:@var{dst}
- Pop the stack pointer, storing the value that was there in slot
- @var{dst}. The offset to @var{dst} is calculated after the stack
- pointer is adjusted.
- @end deftypefn
- @deftypefn Instruction {} drop c24:@var{count}
- Pop the stack pointer by @var{count} words, discarding any values that
- were stored there.
- @end deftypefn
- @deftypefn Instruction {} shuffle-down f12:@var{from} f12:@var{to}
- Shuffle down values from @var{from} to @var{to}, reducing the frame size
- by @var{from}-@var{to} slots. Part of the internal implementation of
- @code{call-with-values}, @code{values}, and @code{apply}.
- @end deftypefn
- @deftypefn Instruction {} expand-apply-argument x24:@var{_}
- Take the last local in a frame and expand it out onto the stack, as for
- the last argument to @code{apply}.
- @end deftypefn
- @node Trampoline Instructions
- @subsubsection Trampoline Instructions
- Though most applicable objects in Guile are procedures implemented in
- bytecode, not all are. There are primitives, continuations, and other
- procedure-like objects that have their own calling convention. Instead
- of adding special cases to the @code{call} instruction, Guile wraps
- these other applicable objects in VM trampoline procedures, then
- provides special support for these objects in bytecode.
- Trampoline procedures are typically generated by Guile at runtime, for
- example in response to a call to @code{scm_c_make_gsubr}. As such, a
- compiler probably shouldn't emit code with these instructions. However,
- it's still interesting to know how these things work, so we document
- these trampoline instructions here.
- @deftypefn Instruction {} subr-call c24:@var{idx}
- Call a subr, passing all locals in this frame as arguments, and storing
- the results on the stack, ready to be returned.
- @end deftypefn
- @deftypefn Instruction {} foreign-call c12:@var{cif-idx} c12:@var{ptr-idx}
- Call a foreign function. Fetch the @var{cif} and foreign pointer from
- @var{cif-idx} and @var{ptr-idx} closure slots of the callee. Arguments
- are taken from the stack, and results placed on the stack, ready to be
- returned.
- @end deftypefn
- @deftypefn Instruction {} builtin-ref s12:@var{dst} c12:@var{idx}
- Load a builtin stub by index into @var{dst}.
- @end deftypefn
- @node Non-Local Control Flow Instructions
- @subsubsection Non-Local Control Flow Instructions
- @deftypefn Instruction {} capture-continuation s24:@var{dst}
- Capture the current continuation, and write it to @var{dst}. Part of
- the implementation of @code{call/cc}.
- @end deftypefn
- @deftypefn Instruction {} continuation-call c24:@var{contregs}
- Return to a continuation, nonlocally. The arguments to the continuation
- are taken from the stack. @var{contregs} is a free variable containing
- the reified continuation.
- @end deftypefn
- @deftypefn Instruction {} abort x24:@var{_}
- Abort to a prompt handler. The tag is expected in slot 1, and the rest
- of the values in the frame are returned to the prompt handler. This
- corresponds to a tail application of @code{abort-to-prompt}.
- If no prompt can be found in the dynamic environment with the given tag,
- an error is signalled. Otherwise all arguments are passed to the
- prompt's handler, along with the captured continuation, if necessary.
- If the prompt's handler can be proven to not reference the captured
- continuation, no continuation is allocated. This decision happens
- dynamically, at run-time; the general case is that the continuation may
- be captured, and thus resumed. A reinstated continuation will have its
- arguments pushed on the stack from slot 0, as if from a multiple-value
- return, and control resumes in the caller. Thus to the calling
- function, a call to @code{abort-to-prompt} looks like any other function
- call.
- @end deftypefn
- @deftypefn Instruction {} compose-continuation c24:@var{cont}
- Compose a partial continuation with the current continuation. The
- arguments to the continuation are taken from the stack. @var{cont} is a
- free variable containing the reified continuation.
- @end deftypefn
- @deftypefn Instruction {} prompt s24:@var{tag} b1:@var{escape-only?} x7:@var{_} f24:@var{proc-slot} x8:@var{_} l24:@var{handler-offset}
- Push a new prompt on the dynamic stack, with a tag from @var{tag} and a
- handler at @var{handler-offset} words from the current @var{ip}.
- If an abort is made to this prompt, control will jump to the handler.
- The handler will expect a multiple-value return as if from a call with
- the procedure at @var{proc-slot}, with the reified partial continuation
- as the first argument, followed by the values returned to the handler.
- If control returns to the handler, the prompt is already popped off by
- the abort mechanism. (Guile's @code{prompt} implements Felleisen's
- @dfn{--F--} operator.)
- If @var{escape-only?} is nonzero, the prompt will be marked as
- escape-only, which allows an abort to this prompt to avoid reifying the
- continuation.
- @xref{Prompts}, for more information on prompts.
- @end deftypefn
- @deftypefn Instruction {} throw s12:@var{key} s12:@var{args}
- Raise an error by throwing to @var{key} and @var{args}. @var{args}
- should be a list.
- @end deftypefn
- @deftypefn Instruction {} throw/value s24:@var{value} n32:@var{key-subr-and-message}
- @deftypefnx Instruction {} throw/value+data s24:@var{value} n32:@var{key-subr-and-message}
- Raise an error, indicating @var{value} as the bad value.
- @var{key-subr-and-message} should be a vector, where the first element
- is the symbol to which to throw, the second is the procedure in which to
- signal the error (a string) or @code{#f}, and the third is a format
- string for the message, with one template. These instructions do not
- fall through.
- Both of these instructions throw to a key with four arguments: the
- procedure that indicates the error (or @code{#f}), the format string, a
- list with @var{value}, and either @code{#f} or the list with @var{value}
- as the last argument respectively.
- @end deftypefn
- @node Instrumentation Instructions
- @subsubsection Instrumentation Instructions
- @deftypefn Instruction {} instrument-entry x24:@var{_} n32:@var{data}
- @deftypefnx Instruction {} instrument-loop x24:@var{_} n32:@var{data}
- Increase execution counter for this function and potentially tier up to
- the next JIT level. @var{data} is an offset to a structure recording
- execution counts and the next-level JIT code corresponding to this
- function. The increment values are currently 30 for
- @code{instrument-entry} and 2 for @code{instrument-loop}.
- @code{instrument-entry} will also run the apply hook, if VM hooks are
- enabled.
- @end deftypefn
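The counter-based tiering described above can be modelled with a short C
sketch. The increments of 30 and 2 are the values given in the text; the
structure layout, the function names, and the threshold of 1000 are
assumptions made for illustration and do not reflect Guile's actual JIT data
structure or tuning.

```c
#include <stdbool.h>
#include <stdint.h>

/* Increments stated in the text. */
enum { EXAMPLE_ENTRY_INCREMENT = 30, EXAMPLE_LOOP_INCREMENT = 2 };

/* Hypothetical tier-up threshold, chosen only for this example. */
enum { EXAMPLE_JIT_THRESHOLD = 1000 };

/* Hypothetical per-function record holding the execution counter. */
struct example_jit_data {
    uint32_t counter;
};

/* Add to the counter; report whether the function is now hot enough
   to be considered for the next JIT level. */
static bool example_bump(struct example_jit_data *data, uint32_t increment) {
    data->counter += increment;
    return data->counter >= EXAMPLE_JIT_THRESHOLD;
}

/* Model of instrument-entry: a function entry counts for 30. */
static bool example_instrument_entry(struct example_jit_data *data) {
    return example_bump(data, EXAMPLE_ENTRY_INCREMENT);
}

/* Model of instrument-loop: a loop iteration counts for 2. */
static bool example_instrument_loop(struct example_jit_data *data) {
    return example_bump(data, EXAMPLE_LOOP_INCREMENT);
}
```

Weighting entries more heavily than loop iterations means that, in this model,
a frequently called small function and a long-running loop both eventually
cross the threshold.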
- @deftypefn Instruction {} handle-interrupts x24:@var{_}
- Handle pending asynchronous interrupts (asyncs). @xref{Asyncs}. The
- compiler inserts @code{handle-interrupts} instructions before any call,
- return, or loop back-edge.
- @end deftypefn
- @deftypefn Instruction {} return-from-interrupt x24:@var{_}
- A special instruction to return from a call and also pop off the stack
- frame from the call. Used when returning from asynchronous interrupts.
- @end deftypefn
- @node Intrinsic Call Instructions
- @subsubsection Intrinsic Call Instructions
- Guile's instruction set is low-level. This is good because the separate
- components of, say, a @code{vector-ref} operation might be able to be
- optimized out, leaving only the operations that need to be performed at
- run-time.
- However, some macro-operations may need to perform large amounts of
- computation at run-time to handle all the edge cases, and their
- micro-operation components aren't amenable to optimization.
- Residualizing code for the entire macro-operation would lead to code
- bloat with no benefit.
- In this kind of a case, Guile's VM calls out to @dfn{intrinsics}:
- run-time routines written in the host language (currently C, possibly
- more in the future if Guile gains more run-time targets like
- WebAssembly). There is one instruction for each intrinsic prototype;
- the intrinsic is specified by index in the instruction.
- @deftypefn Instruction {} call-thread x24:@var{_} c32:@var{idx}
- Call the @code{void}-returning intrinsic with index @var{idx}, passing
- the current @code{scm_thread*} as the argument.
- @end deftypefn
- @deftypefn Instruction {} call-thread-scm s24:@var{a} c32:@var{idx}
- Call the @code{void}-returning intrinsic with index @var{idx}, passing
- the current @code{scm_thread*} and the @code{scm} local @var{a} as
- arguments.
- @end deftypefn
- @deftypefn Instruction {} call-thread-scm-scm s12:@var{a} s12:@var{b} c32:@var{idx}
- Call the @code{void}-returning intrinsic with index @var{idx}, passing
- the current @code{scm_thread*} and the @code{scm} locals @var{a} and
- @var{b} as arguments.
- @end deftypefn
- @deftypefn Instruction {} call-scm-sz-u32 s8:@var{a} s8:@var{b} s8:@var{c} c32:@var{idx}
- Call the @code{void}-returning intrinsic with index @var{idx}, passing
- the locals @var{a}, @var{b}, and @var{c} as arguments. @var{a} is a
- @code{scm} value, while @var{b} and @var{c} are raw @code{u64} values
- which fit into @code{size_t} and @code{uint32_t} types, respectively.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-thread s24:@var{dst} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- the current @code{scm_thread*} as the argument. Place the result in
- @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-u64 s12:@var{dst} s12:@var{a} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- @code{u64} local @var{a} as the argument. Place the result in
- @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-s64 s12:@var{dst} s12:@var{a} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- @code{s64} local @var{a} as the argument. Place the result in
- @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- @code{scm} local @var{a} as the argument. Place the result in
- @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-u64<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
- Call the @code{uint64_t}-returning intrinsic with index @var{idx},
- passing @code{scm} local @var{a} as the argument. Place the @code{u64}
- result in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-s64<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
- Call the @code{int64_t}-returning intrinsic with index @var{idx},
- passing @code{scm} local @var{a} as the argument. Place the @code{s64}
- result in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-f64<-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
- Call the @code{double}-returning intrinsic with index @var{idx},
- passing @code{scm} local @var{a} as the argument. Place the @code{f64}
- result in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-scm-scm s8:@var{dst} s8:@var{a} s8:@var{b} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- @code{scm} locals @var{a} and @var{b} as arguments. Place the
- @code{scm} result in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-scm-uimm s8:@var{dst} s8:@var{a} c8:@var{b} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- @code{scm} local @var{a} and @code{uint8_t} immediate @var{b} as
- arguments. Place the @code{scm} result in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-thread-scm s12:@var{dst} s12:@var{a} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- the current @code{scm_thread*} and @code{scm} local @var{a} as
- arguments. Place the @code{scm} result in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm<-scm-u64 s8:@var{dst} s8:@var{a} s8:@var{b} c32:@var{idx}
- Call the @code{SCM}-returning intrinsic with index @var{idx}, passing
- @code{scm} local @var{a} and @code{u64} local @var{b} as arguments.
- Place the @code{scm} result in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} call-scm-scm s12:@var{a} s12:@var{b} c32:@var{idx}
- Call the @code{void}-returning intrinsic with index @var{idx}, passing
- @code{scm} locals @var{a} and @var{b} as arguments.
- @end deftypefn
- @deftypefn Instruction {} call-scm-scm-scm s8:@var{a} s8:@var{b} s8:@var{c} c32:@var{idx}
- Call the @code{void}-returning intrinsic with index @var{idx}, passing
- @code{scm} locals @var{a}, @var{b}, and @var{c} as arguments.
- @end deftypefn
- @deftypefn Instruction {} call-scm-uimm-scm s8:@var{a} c8:@var{b} s8:@var{c} c32:@var{idx}
- Call the @code{void}-returning intrinsic with index @var{idx}, passing
- @code{scm} local @var{a}, @code{uint8_t} immediate @var{b}, and
- @code{scm} local @var{c} as arguments.
- @end deftypefn
- There are corresponding macro-instructions for specific intrinsics.
- These are equivalent to @code{call-@var{intrinsic-kind}} instructions
- with the appropriate intrinsic @var{idx} arguments.
- @deffn {Macro Instruction} add dst a b
- @deffnx {Macro Instruction} add/immediate dst a b/imm
- Add @code{SCM} values @var{a} and @var{b} and place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} sub dst a b
- @deffnx {Macro Instruction} sub/immediate dst a b/imm
- Subtract @code{SCM} value @var{b} from @var{a} and place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} mul dst a b
- Multiply @code{SCM} values @var{a} and @var{b} and place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} div dst a b
- Divide @code{SCM} value @var{a} by @var{b} and place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} quo dst a b
- Compute the quotient of @code{SCM} values @var{a} and @var{b} and place
- the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} rem dst a b
- Compute the remainder of @code{SCM} values @var{a} and @var{b} and place
- the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} mod dst a b
- Compute the modulo of @code{SCM} value @var{a} by @var{b} and place the
- result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} logand dst a b
- Compute the bitwise @code{and} of @code{SCM} values @var{a} and @var{b}
- and place the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} logior dst a b
- Compute the bitwise inclusive @code{or} of @code{SCM} values @var{a} and
- @var{b} and place the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} logxor dst a b
- Compute the bitwise exclusive @code{or} of @code{SCM} values @var{a} and
- @var{b} and place the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} logsub dst a b
- Compute the bitwise @code{and} of @code{SCM} value @var{a} and the
- bitwise @code{not} of @var{b} and place the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} lsh dst a b
- @deffnx {Macro Instruction} lsh/immediate dst a b/imm
- Shift @code{SCM} value @var{a} left by @code{u64} value @var{b} bits and
- place the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} rsh dst a b
- @deffnx {Macro Instruction} rsh/immediate dst a b/imm
- Shift @code{SCM} value @var{a} right by @code{u64} value @var{b} bits
- and place the result in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} scm->f64 dst src
- Convert @var{src} to an unboxed @code{f64} and place the result in
- @var{dst}, or raise an error if @var{src} is not a real number.
- @end deffn
- @deffn {Macro Instruction} scm->u64 dst src
- Convert @var{src} to an unboxed @code{u64} and place the result in
- @var{dst}, or raise an error if @var{src} is not an integer within
- range.
- @end deffn
- @deffn {Macro Instruction} scm->u64/truncate dst src
- Convert @var{src} to an unboxed @code{u64} and place the result in
- @var{dst}, truncating to the low 64 bits, or raise an error if
- @var{src} is not an integer.
- @end deffn
- @deffn {Macro Instruction} scm->s64 dst src
- Convert @var{src} to an unboxed @code{s64} and place the result in
- @var{dst}, or raise an error if @var{src} is not an integer within
- range.
- @end deffn
- @deffn {Macro Instruction} u64->scm dst src
- Convert @code{u64} value @var{src} to a Scheme integer in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} s64->scm dst src
- Convert @code{s64} value @var{src} to a Scheme integer in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} string-set! str idx ch
- Set the character at index @var{idx} (a @code{u64}) of string @var{str} to
- @var{ch} (a @code{u64} that is a valid character value).
- @end deffn
- @deffn {Macro Instruction} string->number dst src
- Call @code{string->number} on @var{src} and place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} string->symbol dst src
- Call @code{string->symbol} on @var{src} and place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} symbol->keyword dst src
- Call @code{symbol->keyword} on @var{src} and place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} class-of dst src
- Set @var{dst} to the GOOPS class of @var{src}.
- @end deffn
- @deffn {Macro Instruction} wind winder unwinder
- Push wind and unwind procedures onto the dynamic stack. Note that
- neither is actually called; the compiler should emit calls to
- @var{winder} and @var{unwinder} for the normal dynamic-wind control
- flow. Also note that the compiler should have inserted checks that
- @var{winder} and @var{unwinder} are thunks, if it could not prove that
- to be the case. @xref{Dynamic Wind}.
- @end deffn
- @deffn {Macro Instruction} unwind
- Exit from the dynamic extent of an expression, popping the top entry off
- of the dynamic stack.
- @end deffn
- @deffn {Macro Instruction} push-fluid fluid value
- Dynamically bind @var{value} to @var{fluid} by creating a with-fluids
- object, pushing that object on the dynamic stack. @xref{Fluids and
- Dynamic States}.
- @end deffn
- @deffn {Macro Instruction} pop-fluid
- Leave the dynamic extent of a @code{with-fluid*} expression, restoring
- the fluid to its previous value. @code{push-fluid} should always be
- balanced with @code{pop-fluid}.
- @end deffn
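- The balance requirement between @code{push-fluid} and @code{pop-fluid}
- can be pictured with a small Python model. It is a deliberate
- simplification: it records the previous value directly instead of
- creating a with-fluids object, and the @code{Fluid} class and function
- names are hypothetical.

```python
class Fluid:
    """A toy fluid holding a single current value."""

    def __init__(self, default=None):
        self.value = default


dynamic_stack = []


def push_fluid(fluid, value):
    # Remember the previous binding on the dynamic stack, then install
    # the new one.
    dynamic_stack.append((fluid, fluid.value))
    fluid.value = value


def pop_fluid():
    # Restore the binding saved by the matching push_fluid.
    fluid, previous = dynamic_stack.pop()
    fluid.value = previous
```

- Each pop undoes exactly one push, which is why an unbalanced sequence
- would leave a fluid bound to the wrong value.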
- @deffn {Macro Instruction} fluid-ref dst fluid
- Place the value associated with the fluid @var{fluid} in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} fluid-set! fluid value
- Set the value of the fluid @var{fluid} to @var{value}.
- @end deffn
- @deffn {Macro Instruction} push-dynamic-state state
- Save the current set of fluid bindings on the dynamic stack and instate
- the bindings from @var{state} instead. @xref{Fluids and Dynamic
- States}.
- @end deffn
- @deffn {Macro Instruction} pop-dynamic-state
- Restore a saved set of fluid bindings from the dynamic stack.
- @code{push-dynamic-state} should always be balanced with
- @code{pop-dynamic-state}.
- @end deffn
- @deffn {Macro Instruction} resolve-module dst name public?
- Look up the module named @var{name}, resolve its public interface if the
- immediate operand @var{public?} is true, then place the result in
- @var{dst}.
- @end deffn
- @deffn {Macro Instruction} lookup dst mod sym
- Look up @var{sym} in module @var{mod}, placing the resulting variable
- (or @code{#f} if not found) in @var{dst}.
- @end deffn
- @deffn {Macro Instruction} define! dst mod sym
- Look up @var{sym} in module @var{mod}, placing the resulting variable in
- @var{dst}, creating the variable if needed.
- @end deffn
- @deffn {Macro Instruction} current-module dst
- Set @var{dst} to the current module.
- @end deffn
- @deffn {Macro Instruction} $car dst src
- @deffnx {Macro Instruction} $cdr dst src
- @deffnx {Macro Instruction} $set-car! x val
- @deffnx {Macro Instruction} $set-cdr! x val
- @deffnx {Macro Instruction} $variable-ref dst src
- @deffnx {Macro Instruction} $variable-set! x val
- @deffnx {Macro Instruction} $vector-length dst x
- @deffnx {Macro Instruction} $vector-ref dst x idx
- @deffnx {Macro Instruction} $vector-ref/immediate dst x idx/imm
- @deffnx {Macro Instruction} $vector-set! x idx v
- @deffnx {Macro Instruction} $vector-set!/immediate x idx/imm v
- @deffnx {Macro Instruction} $allocate-struct dst vtable nwords
- @deffnx {Macro Instruction} $struct-vtable dst src
- @deffnx {Macro Instruction} $struct-ref dst src idx
- @deffnx {Macro Instruction} $struct-ref/immediate dst src idx/imm
- @deffnx {Macro Instruction} $struct-set! x idx v
- @deffnx {Macro Instruction} $struct-set!/immediate x idx/imm v
- Intrinsics for use by the baseline compiler. The usual strategy for CPS
- compilation is to expose the component parts of e.g. @code{vector-ref}
- so that the compiler can learn from them and eliminate needless bits.
- However in the non-optimizing baseline compiler, that's just overhead,
- so we have some intrinsics that encapsulate all the usual type checks.
- @end deffn
- @node Constant Instructions
- @subsubsection Constant Instructions
- The following instructions load literal data into a program. There are
- two kinds.
- The first set of instructions loads immediate values. These
- instructions encode the immediate directly into the instruction stream.
- @deftypefn Instruction {} make-immediate s8:@var{dst} zi16:@var{low-bits}
- Make an immediate whose low bits are @var{low-bits}, sign-extended.
- @end deftypefn
- @deftypefn Instruction {} make-short-immediate s8:@var{dst} i16:@var{low-bits}
- Make an immediate whose low bits are @var{low-bits}, and whose top bits are
- 0.
- @end deftypefn
- @deftypefn Instruction {} make-long-immediate s24:@var{dst} i32:@var{low-bits}
- Make an immediate whose low bits are @var{low-bits}, and whose top bits are
- 0.
- @end deftypefn
- @deftypefn Instruction {} make-long-long-immediate s24:@var{dst} a32:@var{high-bits} b32:@var{low-bits}
- Make an immediate with @var{high-bits} and @var{low-bits}.
- @end deftypefn
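- The difference between the sign-extending and zero-extending forms can
- be sketched as follows. This is a model only, assuming a 64-bit machine
- word; the Python function names are hypothetical stand-ins for the
- instructions.

```python
MASK64 = (1 << 64) - 1


def make_immediate(low_bits):
    """16 low bits, sign-extended to the full word."""
    low_bits &= 0xFFFF
    if low_bits & 0x8000:
        low_bits -= 0x10000
    return low_bits & MASK64


def make_short_immediate(low_bits):
    """16 low bits, top bits zero."""
    return low_bits & 0xFFFF


def make_long_immediate(low_bits):
    """32 low bits, top bits zero."""
    return low_bits & 0xFFFFFFFF


def make_long_long_immediate(high_bits, low_bits):
    """Two 32-bit halves joined into one 64-bit word."""
    return ((high_bits & 0xFFFFFFFF) << 32) | (low_bits & 0xFFFFFFFF)
```

- With the same 16-bit operand @code{0xFFFE}, the first form yields a word
- whose upper bits are all ones, while the second leaves them clear.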
- Non-immediate constant literals are referenced either directly or
- indirectly. For example, Guile knows at compile-time what the layout of
- a string will be like, and arranges to embed that object directly in the
- compiled image. A reference to a string will use
- @code{make-non-immediate} to treat a pointer into the compilation unit
- as a @code{scm} value directly.
- @deftypefn Instruction {} make-non-immediate s24:@var{dst} n32:@var{offset}
- Load a pointer to statically allocated memory into @var{dst}. The
- object's memory will be found @var{offset} 32-bit words away from the
- current instruction pointer. Whether the object is mutable or immutable
- depends on where it was allocated by the compiler, and loaded by the
- loader.
- @end deftypefn
- Sometimes you need to load up a code pointer into a register; for this,
- use @code{load-label}.
- @deftypefn Instruction {} load-label s24:@var{dst} l32:@var{offset}
- Load a label @var{offset} words away from the current @code{ip} and
- write it to @var{dst}. @var{offset} is a signed 32-bit integer.
- @end deftypefn
- Finally, Guile supports a number of unboxed data types, with their
- associated constant loaders.
- @deftypefn Instruction {} load-f64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
- Load a double-precision floating-point value formed by joining
- @var{high-bits} and @var{low-bits}, and write it to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} load-u64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
- Load an unsigned 64-bit integer formed by joining @var{high-bits} and
- @var{low-bits}, and write it to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} load-s64 s24:@var{dst} au32:@var{high-bits} au32:@var{low-bits}
- Load a signed 64-bit integer formed by joining @var{high-bits} and
- @var{low-bits}, and write it to @var{dst}.
- @end deftypefn
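- As an illustration of how two 32-bit operands are joined into one
- 64-bit unboxed value, here is a Python sketch. The function names are
- hypothetical, and the model only shows the bit-level interpretation,
- not how the VM stores the result.

```python
import struct


def join_bits(high_bits, low_bits):
    """Concatenate two unsigned 32-bit halves into a 64-bit pattern."""
    return ((high_bits & 0xFFFFFFFF) << 32) | (low_bits & 0xFFFFFFFF)


def load_u64(high_bits, low_bits):
    return join_bits(high_bits, low_bits)


def load_s64(high_bits, low_bits):
    bits = join_bits(high_bits, low_bits)
    # Reinterpret the pattern as two's complement.
    return bits - (1 << 64) if bits & (1 << 63) else bits


def load_f64(high_bits, low_bits):
    bits = join_bits(high_bits, low_bits)
    # Reinterpret the pattern as an IEEE 754 double.  The same byte order
    # is used for packing and unpacking, so the host order is irrelevant.
    return struct.unpack("<d", struct.pack("<Q", bits))[0]
```

- The same pair of operands therefore means different things depending on
- which loader consumes it.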
- Some objects must be unique across the whole system. This is the case
- for symbols and keywords. For these objects, Guile arranges to
- initialize them when the compilation unit is loaded, storing them into a
- slot in the image. References go indirectly through that slot.
- @code{static-ref} is used in this case.
- @deftypefn Instruction {} static-ref s24:@var{dst} r32:@var{offset}
- Load a @code{scm} value into @var{dst}. The @code{scm} value will be fetched from
- memory, @var{offset} 32-bit words away from the current instruction
- pointer. @var{offset} is a signed value.
- @end deftypefn
- Fields of non-immediates may need to be fixed up at load time, because
- we do not know in advance at what address they will be loaded. This is
- the case, for example, for a pair containing a non-immediate in one of
- its fields. @code{static-set!} and @code{static-patch!} are used in
- these situations.
- @deftypefn Instruction {} static-set! s24:@var{src} lo32:@var{offset}
- Store the @code{scm} value @var{src} into memory, @var{offset} 32-bit words away from the
- current instruction pointer. @var{offset} is a signed value.
- @end deftypefn
- @deftypefn Instruction {} static-patch! x24:@var{_} lo32:@var{dst-offset} l32:@var{src-offset}
- Patch a pointer at @var{dst-offset} to point to @var{src-offset}. Both offsets
- are signed 32-bit values, indicating a memory address as a number
- of 32-bit words away from the current instruction pointer.
- @end deftypefn
- @node Memory Access Instructions
- @subsubsection Memory Access Instructions
- In these instructions, the @code{/immediate} variants represent their
- indexes or counts as immediates; otherwise these values are unboxed u64
- locals.
- @deftypefn Instruction {} allocate-words s12:@var{dst} s12:@var{count}
- @deftypefnx Instruction {} allocate-words/immediate s12:@var{dst} c12:@var{count}
- Allocate a fresh GC-traced object consisting of @var{count} words and
- store it into @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} scm-ref s8:@var{dst} s8:@var{obj} s8:@var{idx}
- @deftypefnx Instruction {} scm-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
- Load the @code{SCM} object at word offset @var{idx} from local
- @var{obj}, and store it to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} scm-set! s8:@var{obj} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} scm-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
- Store the @code{scm} local @var{val} into object @var{obj} at word
- offset @var{idx}.
- @end deftypefn
- @deftypefn Instruction {} scm-ref/tag s8:@var{dst} s8:@var{obj} c8:@var{tag}
- Load the first word of @var{obj}, subtract the immediate @var{tag}, and store the
- resulting @code{SCM} to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} scm-set!/tag s8:@var{obj} c8:@var{tag} s8:@var{val}
- Set the first word of @var{obj} to the unpacked bits of the @code{scm}
- value @var{val} plus the immediate value @var{tag}.
- @end deftypefn
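- The relationship between the plain and the @code{/tag} accessors can be
- modelled by treating a heap object as a list of words. This is a sketch
- under that assumption only; the Python names are hypothetical and no GC
- tracing is represented.

```python
def allocate_words(count):
    """A fresh object of COUNT words, modelled as a Python list."""
    return [0] * count


def scm_ref(obj, idx):
    return obj[idx]


def scm_set(obj, idx, val):
    obj[idx] = val


def scm_set_tag(obj, tag, val):
    # The first word holds the bits of VAL plus the immediate TAG.
    obj[0] = val + tag


def scm_ref_tag(obj, tag):
    # Subtracting the same TAG recovers the original bits.
    return obj[0] - tag
```

- A value stored with @code{scm-set!/tag} is recovered unchanged by
- @code{scm-ref/tag} as long as both use the same tag.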
- @deftypefn Instruction {} word-ref s8:@var{dst} s8:@var{obj} s8:@var{idx}
- @deftypefnx Instruction {} word-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
- Load the word at offset @var{idx} from local @var{obj}, and store it to
- the @code{u64} local @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} word-set! s8:@var{obj} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} word-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
- Store the @code{u64} local @var{val} into object @var{obj} at word
- offset @var{idx}.
- @end deftypefn
- @deftypefn Instruction {} pointer-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
- Load the pointer at offset @var{idx} from local @var{obj}, and store it
- to the unboxed pointer local @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} pointer-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
- Store the unboxed pointer local @var{val} into object @var{obj} at word
- offset @var{idx}.
- @end deftypefn
- @deftypefn Instruction {} tail-pointer-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
- Compute the address of word offset @var{idx} from local @var{obj}, and store it
- to @var{dst}.
- @end deftypefn
- @node Atomic Memory Access Instructions
- @subsubsection Atomic Memory Access Instructions
- @deftypefn Instruction {} current-thread s24:@var{dst}
- Write the current thread into @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} atomic-scm-ref/immediate s8:@var{dst} s8:@var{obj} c8:@var{idx}
- Atomically load the @code{SCM} object at word offset @var{idx} from
- local @var{obj}, using the sequential consistency memory model. Store
- the result to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} atomic-scm-set!/immediate s8:@var{obj} c8:@var{idx} s8:@var{val}
- Atomically set the @code{SCM} object at word offset @var{idx} from local
- @var{obj} to @var{val}, using the sequential consistency memory model.
- @end deftypefn
- @deftypefn Instruction {} atomic-scm-swap!/immediate s24:@var{dst} x8:@var{_} s24:@var{obj} c8:@var{idx} s24:@var{val}
- Atomically swap the @code{SCM} value stored in object @var{obj} at word
- offset @var{idx} with @var{val}, using the sequentially consistent
- memory model. Store the previous value to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} atomic-scm-compare-and-swap!/immediate s24:@var{dst} x8:@var{_} s24:@var{obj} c8:@var{idx} s24:@var{expected} x8:@var{_} s24:@var{desired}
- Atomically swap the @code{SCM} value stored in object @var{obj} at word
- offset @var{idx} with @var{desired}, if and only if the value that was
- there was @var{expected}, using the sequentially consistent memory
- model. Store the value that was previously at @var{idx} from @var{obj}
- in @var{dst}.
- @end deftypefn
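- The contract of the swap and compare-and-swap instructions can be
- sketched in Python, with a lock standing in for the hardware's atomic
- operation. This models only the observable behavior; the function names
- are hypothetical, and Python's @code{is} is used as a stand-in for
- comparing two @code{SCM} words.

```python
import threading

_lock = threading.Lock()


def atomic_scm_swap(obj, idx, val):
    """Store VAL at IDX and return the value that was there before."""
    with _lock:
        previous = obj[idx]
        obj[idx] = val
        return previous


def atomic_scm_compare_and_swap(obj, idx, expected, desired):
    """Store DESIRED at IDX only if the current value is EXPECTED.

    The previous value is returned either way, so the caller can tell
    whether the swap happened by comparing it with EXPECTED.
    """
    with _lock:
        previous = obj[idx]
        if previous is expected:
            obj[idx] = desired
        return previous
```

- A caller typically retries in a loop until the returned value is the
- one it expected.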
- @node Tagging and Untagging Instructions
- @subsubsection Tagging and Untagging Instructions
- @deftypefn Instruction {} tag-char s12:@var{dst} s12:@var{src}
- Make a @code{SCM} character whose integer value is the @code{u64} in
- @var{src}, and store it in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} untag-char s12:@var{dst} s12:@var{src}
- Extract the integer value from the @code{SCM} character @var{src}, and
- store the resulting @code{u64} in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} tag-fixnum s12:@var{dst} s12:@var{src}
- Make a @code{SCM} integer whose value is the @code{s64} in @var{src},
- and store it in @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} untag-fixnum s12:@var{dst} s12:@var{src}
- Extract the integer value from the @code{SCM} integer @var{src}, and
- store the resulting @code{s64} in @var{dst}.
- @end deftypefn
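- To make the tagging operations concrete, here is a Python sketch. It
- assumes a 64-bit word, a fixnum tag of 2 in the low two bits, and a
- character tag of @code{0x0C} in the low eight bits; these tag values are
- assumptions of the sketch and should be checked against the type-tag
- definitions (@pxref{The SCM Type in Guile}). The function names are
- hypothetical.

```python
MASK64 = (1 << 64) - 1
FIXNUM_TAG = 2     # assumed tag in the low two bits
CHAR_TAG = 0x0C    # assumed tag in the low eight bits


def tag_fixnum(n):
    """s64 -> SCM word: shift the payload up and add the tag."""
    return ((n << 2) | FIXNUM_TAG) & MASK64


def untag_fixnum(word):
    """SCM word -> s64: arithmetic shift right drops the tag."""
    if word & (1 << 63):
        word -= 1 << 64  # reinterpret as a signed 64-bit value
    return word >> 2


def tag_char(code_point):
    """u64 -> SCM character word."""
    return ((code_point << 8) | CHAR_TAG) & MASK64


def untag_char(word):
    """SCM character word -> u64."""
    return word >> 8
```

- Tagging followed by untagging returns the original value, provided the
- value fits in the payload bits left over after the tag.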
- @node Integer Arithmetic Instructions
- @subsubsection Integer Arithmetic Instructions
- @deftypefn Instruction {} uadd s8:@var{dst} s8:@var{a} s8:@var{b}
- @deftypefnx Instruction {} uadd/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
- Add the @code{u64} values @var{a} and @var{b}, and store the @code{u64}
- result to @var{dst}. Overflow will wrap.
- @end deftypefn
- @deftypefn Instruction {} usub s8:@var{dst} s8:@var{a} s8:@var{b}
- @deftypefnx Instruction {} usub/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
- Subtract the @code{u64} value @var{b} from @var{a}, and store the
- @code{u64} result to @var{dst}. Underflow will wrap.
- @end deftypefn
- @deftypefn Instruction {} umul s8:@var{dst} s8:@var{a} s8:@var{b}
- @deftypefnx Instruction {} umul/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
- Multiply the @code{u64} values @var{a} and @var{b}, and store the
- @code{u64} result to @var{dst}. Overflow will wrap.
- @end deftypefn
- @deftypefn Instruction {} ulogand s8:@var{dst} s8:@var{a} s8:@var{b}
- Place the bitwise @code{and} of the @code{u64} values @var{a} and
- @var{b} into the @code{u64} local @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} ulogior s8:@var{dst} s8:@var{a} s8:@var{b}
- Place the bitwise inclusive @code{or} of the @code{u64} values @var{a}
- and @var{b} into the @code{u64} local @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} ulogxor s8:@var{dst} s8:@var{a} s8:@var{b}
- Place the bitwise exclusive @code{or} of the @code{u64} values @var{a}
- and @var{b} into the @code{u64} local @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} ulogsub s8:@var{dst} s8:@var{a} s8:@var{b}
- Place the bitwise @code{and} of the @code{u64} values @var{a} and the
- bitwise @code{not} of @var{b} into the @code{u64} local @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} ulsh s8:@var{dst} s8:@var{a} s8:@var{b}
- @deftypefnx Instruction {} ulsh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
- Shift the unboxed unsigned 64-bit integer in @var{a} left by @var{b}
- bits, also an unboxed unsigned 64-bit integer. Truncate to 64 bits and
- write to @var{dst} as an unboxed value. Only the lower 6 bits of
- @var{b} are used.
- @end deftypefn
- @deftypefn Instruction {} ursh s8:@var{dst} s8:@var{a} s8:@var{b}
- @deftypefnx Instruction {} ursh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
- Shift the unboxed unsigned 64-bit integer in @var{a} right by @var{b}
- bits, also an unboxed unsigned 64-bit integer. Truncate to 64 bits and
- write to @var{dst} as an unboxed value. Only the lower 6 bits of
- @var{b} are used.
- @end deftypefn
- @deftypefn Instruction {} srsh s8:@var{dst} s8:@var{a} s8:@var{b}
- @deftypefnx Instruction {} srsh/immediate s8:@var{dst} s8:@var{a} c8:@var{b}
- Shift the unboxed signed 64-bit integer in @var{a} right by @var{b}
- bits, also an unboxed signed 64-bit integer. Truncate to 64 bits and
- write to @var{dst} as an unboxed value. Only the lower 6 bits of
- @var{b} are used.
- @end deftypefn
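- The wrapping and shift-count rules above can be summarized with a short
- Python model. Python integers are unbounded, so the wrap is made
- explicit with a mask; the function names are hypothetical stand-ins for
- the instructions.

```python
MASK64 = (1 << 64) - 1


def uadd(a, b):
    return (a + b) & MASK64  # overflow wraps


def usub(a, b):
    return (a - b) & MASK64  # underflow wraps


def umul(a, b):
    return (a * b) & MASK64  # overflow wraps


def ulsh(a, b):
    return (a << (b & 63)) & MASK64  # only the low 6 bits of b count


def ursh(a, b):
    return a >> (b & 63)  # logical shift: a is unsigned


def srsh(a, b):
    return a >> (b & 63)  # arithmetic shift: a is a signed Python int
```

- Note that a shift count of 64 behaves like a shift count of 0, because
- only the low 6 bits of the count are used.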
- @node Floating-Point Arithmetic Instructions
- @subsubsection Floating-Point Arithmetic Instructions
- @deftypefn Instruction {} fadd s8:@var{dst} s8:@var{a} s8:@var{b}
- Add the @code{f64} values @var{a} and @var{b}, and store the @code{f64}
- result to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} fsub s8:@var{dst} s8:@var{a} s8:@var{b}
- Subtract the @code{f64} value @var{b} from @var{a}, and store the
- @code{f64} result to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} fmul s8:@var{dst} s8:@var{a} s8:@var{b}
- Multiply the @code{f64} values @var{a} and @var{b}, and store the
- @code{f64} result to @var{dst}.
- @end deftypefn
- @deftypefn Instruction {} fdiv s8:@var{dst} s8:@var{a} s8:@var{b}
- Divide the @code{f64} value @var{a} by @var{b}, and store the
- @code{f64} result to @var{dst}.
- @end deftypefn
- @node Comparison Instructions
- @subsubsection Comparison Instructions
- @deftypefn Instruction {} u64=? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{EQUAL} if the @code{u64} values
- @var{a} and @var{b} are the same, or @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} u64<? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the @code{u64} value
- @var{a} is less than the @code{u64} value @var{b}, or
- @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} s64<? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the @code{s64} value
- @var{a} is less than the @code{s64} value @var{b}, or
- @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} s64-imm=? s12:@var{a} z12:@var{b}
- Set the comparison result to @code{EQUAL} if the @code{s64} value @var{a}
- is equal to the immediate @code{s64} value @var{b}, or @code{NONE}
- otherwise.
- @end deftypefn
- @deftypefn Instruction {} u64-imm<? s12:@var{a} c12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the @code{u64} value
- @var{a} is less than the immediate @code{u64} value @var{b}, or
- @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} imm-u64<? s12:@var{a} c12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the @code{u64}
- immediate @var{b} is less than the @code{u64} value @var{a}, or
- @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} s64-imm<? s12:@var{a} z12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the @code{s64} value
- @var{a} is less than the immediate @code{s64} value @var{b}, or
- @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} imm-s64<? s12:@var{a} z12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the @code{s64}
- immediate @var{b} is less than the @code{s64} value @var{a}, or
- @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} f64=? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{EQUAL} if the f64 value @var{a} is
- equal to the f64 value @var{b}, or @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} f64<? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the f64 value @var{a}
- is less than the f64 value @var{b}, @code{NONE} if @var{a} is greater
- than or equal to @var{b}, or @code{INVALID} otherwise.
- @end deftypefn
- @deftypefn Instruction {} =? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{EQUAL} if the SCM values @var{a} and
- @var{b} are numerically equal, in the sense of the Scheme @code{=}
- operator. Set to @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} heap-numbers-equal? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{EQUAL} if the SCM values @var{a} and
- @var{b} are numerically equal, in the sense of Scheme @code{=}. Set to
- @code{NONE} otherwise. It is known that both @var{a} and @var{b} are
- heap numbers.
- @end deftypefn
- @deftypefn Instruction {} <? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{LESS_THAN} if the SCM value @var{a}
- is less than the SCM value @var{b}, @code{NONE} if @var{a} is greater
- than or equal to @var{b}, or @code{INVALID} otherwise.
- @end deftypefn
- @deftypefn Instruction {} immediate-tag=? s24:@var{obj} c16:@var{mask} c16:@var{tag}
- Set the comparison result to @code{EQUAL} if the result of a bitwise
- @code{and} between the bits of @code{scm} value @var{obj} and the
- immediate @var{mask} is @var{tag}, or @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} heap-tag=? s24:@var{obj} c16:@var{mask} c16:@var{tag}
- Set the comparison result to @code{EQUAL} if the result of a bitwise
- @code{and} between the first word of @code{scm} value @var{obj} and the
- immediate @var{mask} is @var{tag}, or @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} eq? s12:@var{a} s12:@var{b}
- Set the comparison result to @code{EQUAL} if the SCM values @var{a} and
- @var{b} are @code{eq?}, or @code{NONE} otherwise.
- @end deftypefn
- @deftypefn Instruction {} eq-immediate? s8:@var{a} zi16:@var{b}
- Set the comparison result to @code{EQUAL} if the SCM value @var{a} is
- equal to the immediate SCM value @var{b} (sign-extended), or @code{NONE}
- otherwise.
- @end deftypefn
- There is also a set of macro-instructions for @code{immediate-tag=?} and
- @code{heap-tag=?} that abstract away the precise type tag
- values. @xref{The SCM Type in Guile}.
- @deffn {Macro Instruction} fixnum? x
- @deffnx {Macro Instruction} heap-object? x
- @deffnx {Macro Instruction} char? x
- @deffnx {Macro Instruction} eq-false? x
- @deffnx {Macro Instruction} eq-nil? x
- @deffnx {Macro Instruction} eq-null? x
- @deffnx {Macro Instruction} eq-true? x
- @deffnx {Macro Instruction} unspecified? x
- @deffnx {Macro Instruction} undefined? x
- @deffnx {Macro Instruction} eof-object? x
- @deffnx {Macro Instruction} null? x
- @deffnx {Macro Instruction} false? x
- @deffnx {Macro Instruction} nil? x
- Emit an @code{immediate-tag=?} instruction that will set the comparison
- result to @code{EQUAL} if @var{x} would pass the corresponding predicate
- (e.g. @code{null?}), or @code{NONE} otherwise.
- @end deffn
- @deffn {Macro Instruction} pair? x
- @deffnx {Macro Instruction} struct? x
- @deffnx {Macro Instruction} symbol? x
- @deffnx {Macro Instruction} variable? x
- @deffnx {Macro Instruction} vector? x
- @deffnx {Macro Instruction} immutable-vector? x
- @deffnx {Macro Instruction} mutable-vector? x
- @deffnx {Macro Instruction} weak-vector? x
- @deffnx {Macro Instruction} string? x
- @deffnx {Macro Instruction} heap-number? x
- @deffnx {Macro Instruction} hash-table? x
- @deffnx {Macro Instruction} pointer? x
- @deffnx {Macro Instruction} fluid? x
- @deffnx {Macro Instruction} stringbuf? x
- @deffnx {Macro Instruction} dynamic-state? x
- @deffnx {Macro Instruction} frame? x
- @deffnx {Macro Instruction} keyword? x
- @deffnx {Macro Instruction} atomic-box? x
- @deffnx {Macro Instruction} syntax? x
- @deffnx {Macro Instruction} program? x
- @deffnx {Macro Instruction} vm-continuation? x
- @deffnx {Macro Instruction} bytevector? x
- @deffnx {Macro Instruction} weak-set? x
- @deffnx {Macro Instruction} weak-table? x
- @deffnx {Macro Instruction} array? x
- @deffnx {Macro Instruction} bitvector? x
- @deffnx {Macro Instruction} smob? x
- @deffnx {Macro Instruction} port? x
- @deffnx {Macro Instruction} bignum? x
- @deffnx {Macro Instruction} flonum? x
- @deffnx {Macro Instruction} compnum? x
- @deffnx {Macro Instruction} fracnum? x
- Emit a @code{heap-tag=?} instruction that will set the comparison result
- to @code{EQUAL} if @var{x} would pass the corresponding predicate
- (e.g. @code{pair?}), or @code{NONE} otherwise.
- @end deffn
- @node Branch Instructions
- @subsubsection Branch Instructions
- All offsets to branch instructions are 24-bit signed numbers, which
- count 32-bit units. This gives Guile effectively a 26-bit address range
- for relative jumps.
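- The arithmetic behind the 26-bit figure can be checked with a small
- Python sketch. It assumes that the opcode occupies the low 8 bits of a
- 32-bit instruction word and the @code{l24} operand the upper 24 bits,
- and it treats the instruction pointer as a byte address; the function
- names are hypothetical.

```python
def decode_l24(word):
    """Extract the signed 24-bit operand from a 32-bit instruction word."""
    offset = (word >> 8) & 0xFFFFFF
    if offset & 0x800000:
        offset -= 0x1000000  # sign-extend
    return offset


def branch_target(ip, word):
    """Offsets count 32-bit units, so scale by 4 to get a byte address."""
    return ip + 4 * decode_l24(word)


# 2^24 distinct offsets, each covering 4 bytes, span 2^26 bytes.
REACHABLE_BYTES = 4 * (1 << 24)
```

- A 24-bit field that counts 4-byte units reaches as far as a 26-bit
- field that counts bytes.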
- @deftypefn Instruction {} j l24:@var{offset}
- Add @var{offset} to the current instruction pointer.
- @end deftypefn
- @deftypefn Instruction {} jl l24:@var{offset}
- If the last comparison result is @code{LESS_THAN}, add @var{offset}, a
- signed 24-bit number, to the current instruction pointer.
- @end deftypefn
- @deftypefn Instruction {} je l24:@var{offset}
- If the last comparison result is @code{EQUAL}, add @var{offset}, a
- signed 24-bit number, to the current instruction pointer.
- @end deftypefn
- @deftypefn Instruction {} jnl l24:@var{offset}
- If the last comparison result is not @code{LESS_THAN}, add @var{offset},
- a signed 24-bit number, to the current instruction pointer.
- @end deftypefn
- @deftypefn Instruction {} jne l24:@var{offset}
- If the last comparison result is not @code{EQUAL}, add @var{offset}, a
- signed 24-bit number, to the current instruction pointer.
- @end deftypefn
- @deftypefn Instruction {} jge l24:@var{offset}
- If the last comparison result is @code{NONE}, add @var{offset}, a
- signed 24-bit number, to the current instruction pointer.
- This is intended for use after a @code{<?} comparison, and is different
- from @code{jnl} in the way it handles not-a-number (NaN) values:
- @code{<?} sets @code{INVALID} instead of @code{NONE} if either value is
- a NaN. For exact numbers, @code{jge} is the same as @code{jnl}.
- @end deftypefn
- @deftypefn Instruction {} jnge l24:@var{offset}
- If the last comparison result is not @code{NONE}, add @var{offset}, a
- signed 24-bit number, to the current instruction pointer.
- This is intended for use after a @code{<?} comparison, and is different
- from @code{jl} in the way it handles not-a-number (NaN) values:
- @code{<?} sets @code{INVALID} instead of @code{NONE} if either value is
- a NaN. For exact numbers, @code{jnge} is the same as @code{jl}.
- @end deftypefn
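- The interaction between comparison results and conditional branches,
- including the NaN case, can be tabulated with a short Python model. The
- result names follow the text above; the function and table names are
- hypothetical.

```python
import math

NONE, EQUAL, LESS_THAN, INVALID = "NONE", "EQUAL", "LESS_THAN", "INVALID"


def f64_less(a, b):
    """Model of f64<?: INVALID when either operand is a NaN."""
    if math.isnan(a) or math.isnan(b):
        return INVALID
    return LESS_THAN if a < b else NONE


# For each branch, a predicate on the last comparison result that says
# whether the branch is taken.
BRANCH_TAKEN = {
    "jl": lambda result: result == LESS_THAN,
    "je": lambda result: result == EQUAL,
    "jnl": lambda result: result != LESS_THAN,
    "jne": lambda result: result != EQUAL,
    "jge": lambda result: result == NONE,
    "jnge": lambda result: result != NONE,
}
```

- After comparing against a NaN, @code{jnl} is taken but @code{jge} is
- not; for ordinary numbers the two agree.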
- @deftypefn Instruction {} jtable s24:@var{idx} v32:@var{length} [x8:_ l24:@var{offset}]...
- Branch to an entry in a table, as in C's @code{switch} statement.
- @var{idx} is a @code{u64} local indicating which entry to branch to.
- The immediate @var{length} indicates the number of entries in the table,
- and should be greater than or equal to 1. The last entry in the table
- is the ``catch-all'' entry. The @var{offset}... values are signed 24-bit
- immediates (@code{l24} encoding), indicating a memory address as a
- number of 32-bit words away from the current instruction pointer.
- @end deftypefn
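- A table branch of this kind can be sketched in Python as follows. The
- sketch reads ``catch-all'' as meaning that any out-of-range index
- selects the last entry, which is an interpretation of the text above
- rather than a statement about the implementation; the function name is
- hypothetical and the instruction pointer is treated as a byte address.

```python
def jtable_target(ip, idx, offsets):
    """Pick a branch target from a table of signed word offsets."""
    length = len(offsets)
    if length < 1:
        raise ValueError("a jump table needs at least one entry")
    # Indexes past the end fall through to the last, catch-all entry.
    entry = idx if idx < length else length - 1
    return ip + 4 * offsets[entry]
```

- With a three-entry table, indexes 0 and 1 select their own entries and
- every index from 2 upward selects the final one.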
- @node Raw Memory Access Instructions
- @subsubsection Raw Memory Access Instructions
- Bytevector operations correspond closely to what the current hardware
- can do, so it makes sense to inline them to VM instructions, providing
- a clear path for eventual native compilation. Without this, Scheme
- programs would need other primitives for accessing raw bytes -- but
- these primitives are as good as any.
- @deftypefn Instruction {} u8-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} s8-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} u16-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} s16-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} u32-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} s32-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} u64-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} s64-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} f32-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- @deftypefnx Instruction {} f64-ref s8:@var{dst} s8:@var{ptr} s8:@var{idx}
- Fetch the item at byte offset @var{idx} from the raw pointer local
- @var{ptr}, and store it in @var{dst}. All accesses use native
- endianness.
- The @var{idx} value should be an unboxed unsigned 64-bit integer.
- The results are all written to the stack as unboxed values, either as
- signed 64-bit integers, unsigned 64-bit integers, or IEEE double
- floating point numbers.
- @end deftypefn
- @deftypefn Instruction {} u8-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} s8-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} u16-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} s16-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} u32-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} s32-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} u64-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} s64-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} f32-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- @deftypefnx Instruction {} f64-set! s8:@var{ptr} s8:@var{idx} s8:@var{val}
- Store @var{val} into memory pointed to by raw pointer local @var{ptr},
- at byte offset @var{idx}. Multibyte values are written using native
- endianness.
- The @var{idx} value should be an unboxed unsigned 64-bit integer.
- The @var{val} values are all unboxed, either as signed 64-bit integers,
- unsigned 64-bit integers, or IEEE double floating point numbers.
- @end deftypefn
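- The typed, byte-addressed accesses above have a close analogue in
- Python's @code{struct} module, which gives a convenient way to picture
- them. In this sketch a @code{bytearray} stands in for the memory behind
- the raw pointer, and the helper names and the format table are
- hypothetical.

```python
import struct

# struct format characters for each access width and signedness.
_FORMATS = {
    "u8": "B", "s8": "b", "u16": "H", "s16": "h",
    "u32": "I", "s32": "i", "u64": "Q", "s64": "q",
    "f32": "f", "f64": "d",
}


def raw_ref(kind, buf, idx):
    """Read one item of KIND at byte offset IDX, in native byte order."""
    return struct.unpack_from("=" + _FORMATS[kind], buf, idx)[0]


def raw_set(kind, buf, idx, val):
    """Write VAL as KIND at byte offset IDX, in native byte order."""
    struct.pack_into("=" + _FORMATS[kind], buf, idx, val)
```

- Reading the same bytes with a different @var{kind} reinterprets them,
- which is how a stored signed value can come back as a large unsigned
- one.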
- @node Just-In-Time Native Code
- @subsection Just-In-Time Native Code
- @cindex just-in-time compiler
- @cindex jit compiler
- @cindex template jit
- @cindex compiler, just-in-time
- The final piece of Guile's virtual machine is a just-in-time (JIT)
- compiler from bytecode instructions to native code. It is faster to run
- a function when its bytecode instructions are compiled to native code,
- compared to having the VM interpret the instructions.
- The JIT compiler runs automatically, triggered by counters associated
- with each function. The counter increments when functions are called
- and during each loop iteration. Once a function's counter passes a
- certain value, the function gets JIT-compiled. @xref{Instrumentation
- Instructions}, for full details.
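- The counter mechanism can be pictured with a toy Python model. The
- threshold value, the class, and the function names here are all
- hypothetical; the sketch only illustrates the idea of compiling once a
- per-function counter passes a limit.

```python
JIT_THRESHOLD = 1000  # hypothetical value


class FunctionData:
    """Per-function bookkeeping: a counter and, later, native code."""

    def __init__(self):
        self.counter = 0
        self.native_code = None


def bump_counter(fn, increment, compile_fn):
    """Called on function entry and on each loop iteration."""
    if fn.native_code is None:
        fn.counter += increment
        if fn.counter > JIT_THRESHOLD:
            fn.native_code = compile_fn(fn)
    return fn.native_code
```

- Once native code exists, later calls simply return it and the counter
- is no longer consulted.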
- Guile's JIT compiler is what is known as a @dfn{template JIT}. This
- kind of JIT is very simple: for each instruction in a function, the JIT
- compiler will emit a generic sequence of machine code corresponding to
- the instruction kind, specializing that generic template to reference
- the specific operands of the instruction being compiled.
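- The following toy Python sketch illustrates the template idea. The
- pseudo-assembly strings, the template table, and the bytecode
- representation are all invented for illustration and bear no relation
- to the machine code Guile actually emits.

```python
# One fixed sequence of pseudo-assembly per instruction kind, with holes
# for the operands.
TEMPLATES = {
    "uadd": [
        "load r0, [sp + {a}]",
        "load r1, [sp + {b}]",
        "add r0, r1",
        "store [sp + {dst}], r0",
    ],
    "j": ["jump {offset}"],
}


def jit_compile(bytecode):
    """Expand each (opcode, operands) pair through its template."""
    native = []
    for opcode, operands in bytecode:
        for line in TEMPLATES[opcode]:
            native.append(line.format(**operands))
    return native
```

- The output depends only on the bytecode, never on run-time values,
- which is what makes this style of compiler fast and predictable.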
- The strength of a template JIT is principally that it is very fast at
- emitting code. It doesn't need to do any time-consuming analysis on the
- bytecode that it is compiling to do its job.
- A template JIT is also very predictable: the native code emitted by a
- template JIT has the same performance characteristics as the
- corresponding bytecode, except that it runs faster. In theory you could
- even generate the template-JIT machine code ahead of time, as it doesn't
- depend on any value seen at run-time.
- This predictability makes it possible to reason about the performance of
- a system in terms of bytecode, knowing that the conclusions apply to
- native code emitted by a template JIT.
- Because the machine code corresponding to an instruction always performs
- the same tasks that the interpreter would do for that instruction,
- bytecode and a template JIT also allow Guile programmers to debug their
- programs in terms of the bytecode model. When a Guile programmer sets a
- breakpoint, Guile will disable the JIT for the thread being debugged,
- falling back to the interpreter (which has the corresponding code to run
- the hooks). @xref{VM Hooks}.
- To emit native code, Guile uses a forked version of GNU Lightning. This
- "Lightening" effort, spun out as a separate project, aims to build on
- the back-end support from GNU Lightning while adapting the API and
- behavior of the library to match Guile's needs. This code is included
- in the Guile source distribution. For more information, see
- @url{https://gitlab.com/wingo/lightening}. As of mid-2019, Lightening
- supports code generation for the x86-64, ia32, ARMv7, and AArch64
- architectures.
- The weaknesses of a template JIT are two-fold. Firstly, as a simple
- back-end that has to run fast, a template JIT doesn't have time to do
- analysis that could help it generate better code, notably global
- register allocation and instruction selection.
- However, this is a minor weakness compared to the inability to perform
- significant, speculative program transformations. For example, Guile
- could see that in an expression @code{(f x)}, @var{f} in practice
- always refers to the same function. An advanced JIT compiler would
- speculatively inline @var{f} into the call-site, along with a dynamic
- check to make sure that the assumption still holds. But as a template JIT
- doesn't pay attention to values only known at run-time, it can't make
- this transformation.
- This limitation is mitigated in part by Guile's robust ahead-of-time
- compiler, which can already perform significant optimizations when it can
- prove they will always be valid, and by its low-level bytecode, which is
- able to represent the effect of those optimizations (e.g. elided
- type-checks). @xref{Compiling to the Virtual Machine}, for more on
- Guile's compiler.
- An ahead-of-time Scheme-to-bytecode strategy, complemented by a template
- JIT, also particularly suits the somewhat static nature of Scheme.
- Scheme programmers often write code in a way that makes the identity of
- free variable references lexically apparent. For example, the @code{(f
- x)} expression could appear within a @code{(let ((f (lambda (x) (1+
- x)))) ...)} expression, or we could see that @code{f} was imported from
- a particular module where we know its binding. Ahead-of-time
- compilation techniques can work well for a language like Scheme where
- there is little polymorphism and much first-order programming. They do
- not work so well for a language like JavaScript, which is highly mutable
- at run-time and difficult to analyze due to method calls (which are
- effectively higher-order calls).
- All that said, a template JIT works well for Guile at this point. It's
- only a few thousand lines of maintainable code, it speeds up Scheme
- programs, and it keeps the bulk of the Guile Scheme implementation
- written in Scheme itself. The next step is probably to add
- ahead-of-time native code emission to the back-end of the compiler
- written in Scheme, to take advantage of the opportunity to do global
- register allocation and instruction selection. Once this is working, it
- can allow Guile to experiment with speculative optimizations in Scheme
- as well. @xref{Extending the Compiler}, for more on future directions.
- Finally, note that there are a few environment variables that can be
- tweaked to make JIT compilation happen sooner, later, or never.
- @xref{Environment Variables}, for more.
|