123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700 |
- @c -*-texinfo-*-
- @c This is part of the GNU Guile Reference Manual.
- @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005
- @c Free Software Foundation, Inc.
- @c See the file guile.texi for copying conditions.
- @node Defining New Types (Smobs)
- @section Defining New Types (Smobs)
- @dfn{Smobs} are Guile's mechanism for adding new primitive types to
- the system. The term ``smob'' was coined by Aubrey Jaffer, who says
- it comes from ``small object'', referring to the fact that they are
- quite limited in size: they can hold just one pointer to a larger
- memory block plus 16 extra bits.
- To define a new smob type, the programmer provides Guile with some
- essential information about the type --- how to print it, how to
- garbage collect it, and so on --- and Guile allocates a fresh type tag
- for it. The programmer can then use @code{scm_c_define_gsubr} to make
- a set of C functions visible to Scheme code that create and operate on
- these objects.
- (You can find a complete version of the example code used in this
- section in the Guile distribution, in @file{doc/example-smob}. That
- directory includes a makefile and a suitable @code{main} function, so
- you can build a complete interactive Guile shell, extended with the
- datatypes described here.)
- @menu
- * Describing a New Type::
- * Creating Instances::
- * Type checking::
- * Garbage Collecting Smobs::
- * Garbage Collecting Simple Smobs::
- * Remembering During Operations::
- * Double Smobs::
- * The Complete Example::
- @end menu
- @node Describing a New Type
- @subsection Describing a New Type
- To define a new type, the programmer must write four functions to
- manage instances of the type:
- @table @code
- @item mark
- Guile will apply this function to each instance of the new type it
- encounters during garbage collection. This function is responsible for
- telling the collector about any other @code{SCM} values that the object
- has stored. The default smob mark function does nothing.
- @xref{Garbage Collecting Smobs}, for more details.
- @item free
- Guile will apply this function to each instance of the new type that is
- to be deallocated. The function should release all resources held by
- the object. This is analogous to the Java finalization method-- it is
- invoked at an unspecified time (when garbage collection occurs) after
- the object is dead. The default free function frees the smob data (if
- the size of the struct passed to @code{scm_make_smob_type} is non-zero)
- using @code{scm_gc_free}. @xref{Garbage Collecting Smobs}, for more
- details.
- This function operates while the heap is in an inconsistent state and
- must therefore be careful. @xref{Smobs}, for details about what this
- function is allowed to do.
- @item print
- Guile will apply this function to each instance of the new type to print
- the value, as for @code{display} or @code{write}. The default print
- function prints @code{#<NAME ADDRESS>} where @code{NAME} is the first
- argument passed to @code{scm_make_smob_type}. For more information on
- printing, see @ref{Port Data}.
- @item equalp
- If Scheme code asks the @code{equal?} function to compare two instances
- of the same smob type, Guile calls this function. It should return
- @code{SCM_BOOL_T} if @var{a} and @var{b} should be considered
- @code{equal?}, or @code{SCM_BOOL_F} otherwise. If @code{equalp} is
- @code{NULL}, @code{equal?} will assume that two instances of this type are
- never @code{equal?} unless they are @code{eq?}.
- @end table
- To actually register the new smob type, call @code{scm_make_smob_type}.
- It returns a value of type @code{scm_t_bits} which identifies the new
- smob type.
- The four special functions described above are registered by calling
- one of @code{scm_set_smob_mark}, @code{scm_set_smob_free},
- @code{scm_set_smob_print}, or @code{scm_set_smob_equalp}, as
- appropriate. Each function is intended to be used at most once per
- type, and the call should be placed immediately following the call to
- @code{scm_make_smob_type}.
- There can only be at most 256 different smob types in the system.
- Instead of registering a huge number of smob types (for example, one
- for each relevant C struct in your application), it is sometimes
- better to register just one and implement a second layer of type
- dispatching on top of it. This second layer might use the 16 extra
- bits to extend its type, for example.
- Here is how one might declare and register a new type representing
- eight-bit gray-scale images:
- @example
- #include <libguile.h>
- struct image @{
- int width, height;
- char *pixels;
- /* The name of this image */
- SCM name;
- /* A function to call when this image is
- modified, e.g., to update the screen,
- or SCM_BOOL_F if no action necessary */
- SCM update_func;
- @};
- static scm_t_bits image_tag;
- void
- init_image_type (void)
- @{
- image_tag = scm_make_smob_type ("image", sizeof (struct image));
- scm_set_smob_mark (image_tag, mark_image);
- scm_set_smob_free (image_tag, free_image);
- scm_set_smob_print (image_tag, print_image);
- @}
- @end example
- @node Creating Instances
- @subsection Creating Instances
- Normally, smobs can have one @emph{immediate} word of data. This word
- stores either a pointer to an additional memory block that holds the
- real data, or it might hold the data itself when it fits. The word is
- large enough for a @code{SCM} value, a pointer to @code{void}, or an
- integer that fits into a @code{size_t} or @code{ssize_t}.
- You can also create smobs that have two or three immediate words, and
- when these words suffice to store all data, it is more efficient to use
- these super-sized smobs instead of using a normal smob plus a memory
- block. @xref{Double Smobs}, for their discussion.
- Guile provides functions for managing memory which are often helpful
- when implementing smobs. @xref{Memory Blocks}.
- To retrieve the immediate word of a smob, you use the macro
- @code{SCM_SMOB_DATA}. It can be set with @code{SCM_SET_SMOB_DATA}.
- The 16 extra bits can be accessed with @code{SCM_SMOB_FLAGS} and
- @code{SCM_SET_SMOB_FLAGS}.
- The two macros @code{SCM_SMOB_DATA} and @code{SCM_SET_SMOB_DATA} treat
- the immediate word as if it were of type @code{scm_t_bits}, which is
- an unsigned integer type large enough to hold a pointer to
- @code{void}. Thus you can use these macros to store arbitrary
- pointers in the smob word.
- When you want to store a @code{SCM} value directly in the immediate
- word of a smob, you should use the macros @code{SCM_SMOB_OBJECT} and
- @code{SCM_SET_SMOB_OBJECT} to access it.
- Creating a smob instance can be tricky when it consists of multiple
- steps that allocate resources and might fail. It is recommended that
- you go about creating a smob in the following way:
- @itemize
- @item
- Allocate the memory block for holding the data with
- @code{scm_gc_malloc}.
- @item
- Initialize it to a valid state without calling any functions that might
- cause a non-local exits. For example, initialize pointers to NULL.
- Also, do not store @code{SCM} values in it that must be protected.
- Initialize these fields with @code{SCM_BOOL_F}.
- A valid state is one that can be safely acted upon by the @emph{mark}
- and @emph{free} functions of your smob type.
- @item
- Create the smob using @code{SCM_NEWSMOB}, passing it the initialized
- memory block. (This step will always succeed.)
- @item
- Complete the initialization of the memory block by, for example,
- allocating additional resources and making it point to them.
- @end itemize
- This procedure ensures that the smob is in a valid state as soon as it
- exists, that all resources that are allocated for the smob are
- properly associated with it so that they can be properly freed, and
- that no @code{SCM} values that need to be protected are stored in it
- while the smob does not yet competely exist and thus can not protect
- them.
- Continuing the example from above, if the global variable
- @code{image_tag} contains a tag returned by @code{scm_make_smob_type},
- here is how we could construct a smob whose immediate word contains a
- pointer to a freshly allocated @code{struct image}:
- @example
- SCM
- make_image (SCM name, SCM s_width, SCM s_height)
- @{
- SCM smob;
- struct image *image;
- int width = scm_to_int (s_width);
- int height = scm_to_int (s_height);
- /* Step 1: Allocate the memory block.
- */
- image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");
- /* Step 2: Initialize it with straight code.
- */
- image->width = width;
- image->height = height;
- image->pixels = NULL;
- image->name = SCM_BOOL_F;
- image->update_func = SCM_BOOL_F;
- /* Step 3: Create the smob.
- */
- SCM_NEWSMOB (smob, image_tag, image);
- /* Step 4: Finish the initialization.
- */
- image->name = name;
- image->pixels = scm_gc_malloc (width * height, "image pixels");
- return smob;
- @}
- @end example
- Let us look at what might happen when @code{make_image} is called.
- The conversions of @var{s_width} and @var{s_height} to @code{int}s might
- fail and signal an error, thus causing a non-local exit. This is not a
- problem since no resources have been allocated yet that would have to be
- freed.
- The allocation of @var{image} in step 1 might fail, but this is likewise
- no problem.
- Step 2 can not exit non-locally. At the end of it, the @var{image}
- struct is in a valid state for the @code{mark_image} and
- @code{free_image} functions (see below).
- Step 3 can not exit non-locally either. This is guaranteed by Guile.
- After it, @var{smob} contains a valid smob that is properly initialized
- and protected, and in turn can properly protect the Scheme values in its
- @var{image} struct.
- But before the smob is completely created, @code{SCM_NEWSMOB} might
- cause the garbage collector to run. During this garbage collection, the
- @code{SCM} values in the @var{image} struct would be invisible to Guile.
- It only gets to know about them via the @code{mark_image} function, but
- that function can not yet do its job since the smob has not been created
- yet. Thus, it is important to not store @code{SCM} values in the
- @var{image} struct until after the smob has been created.
- Step 4, finally, might fail and cause a non-local exit. In that case,
- the complete creation of the smob has not been successful, but it does
- nevertheless exist in a valid state. It will eventually be freed by
- the garbage collector, and all the resources that have been allocated
- for it will be correctly freed by @code{free_image}.
- @node Type checking
- @subsection Type checking
- Functions that operate on smobs should check that the passed
- @code{SCM} value indeed is a suitable smob before accessing its data.
- They can do this with @code{scm_assert_smob_type}.
- For example, here is a simple function that operates on an image smob,
- and checks the type of its argument.
- @example
- SCM
- clear_image (SCM image_smob)
- @{
- int area;
- struct image *image;
- scm_assert_smob_type (image_tag, image_smob);
- image = (struct image *) SCM_SMOB_DATA (image_smob);
- area = image->width * image->height;
- memset (image->pixels, 0, area);
- /* Invoke the image's update function.
- */
- if (scm_is_true (image->update_func))
- scm_call_0 (image->update_func);
- scm_remember_upto_here_1 (image_smob);
- return SCM_UNSPECIFIED;
- @}
- @end example
- See @ref{Remembering During Operations} for an explanation of the call
- to @code{scm_remember_upto_here_1}.
- @node Garbage Collecting Smobs
- @subsection Garbage Collecting Smobs
- Once a smob has been released to the tender mercies of the Scheme
- system, it must be prepared to survive garbage collection. Guile calls
- the @emph{mark} and @emph{free} functions of the smob to manage this.
- As described in more detail elsewhere (@pxref{Conservative GC}), every
- object in the Scheme system has a @dfn{mark bit}, which the garbage
- collector uses to tell live objects from dead ones. When collection
- starts, every object's mark bit is clear. The collector traces pointers
- through the heap, starting from objects known to be live, and sets the
- mark bit on each object it encounters. When it can find no more
- unmarked objects, the collector walks all objects, live and dead, frees
- those whose mark bits are still clear, and clears the mark bit on the
- others.
- The two main portions of the collection are called the @dfn{mark phase},
- during which the collector marks live objects, and the @dfn{sweep
- phase}, during which the collector frees all unmarked objects.
- The mark bit of a smob lives in a special memory region. When the
- collector encounters a smob, it sets the smob's mark bit, and uses the
- smob's type tag to find the appropriate @emph{mark} function for that
- smob. It then calls this @emph{mark} function, passing it the smob as
- its only argument.
- The @emph{mark} function is responsible for marking any other Scheme
- objects the smob refers to. If it does not do so, the objects' mark
- bits will still be clear when the collector begins to sweep, and the
- collector will free them. If this occurs, it will probably break, or at
- least confuse, any code operating on the smob; the smob's @code{SCM}
- values will have become dangling references.
- To mark an arbitrary Scheme object, the @emph{mark} function calls
- @code{scm_gc_mark}.
- Thus, here is how we might write @code{mark_image}:
- @example
- @group
- SCM
- mark_image (SCM image_smob)
- @{
- /* Mark the image's name and update function. */
- struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
- scm_gc_mark (image->name);
- scm_gc_mark (image->update_func);
- return SCM_BOOL_F;
- @}
- @end group
- @end example
- Note that, even though the image's @code{update_func} could be an
- arbitrarily complex structure (representing a procedure and any values
- enclosed in its environment), @code{scm_gc_mark} will recurse as
- necessary to mark all its components. Because @code{scm_gc_mark} sets
- an object's mark bit before it recurses, it is not confused by
- circular structures.
- As an optimization, the collector will mark whatever value is returned
- by the @emph{mark} function; this helps limit depth of recursion during
- the mark phase. Thus, the code above should really be written as:
- @example
- @group
- SCM
- mark_image (SCM image_smob)
- @{
- /* Mark the image's name and update function. */
- struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
- scm_gc_mark (image->name);
- return image->update_func;
- @}
- @end group
- @end example
- Finally, when the collector encounters an unmarked smob during the sweep
- phase, it uses the smob's tag to find the appropriate @emph{free}
- function for the smob. It then calls that function, passing it the smob
- as its only argument.
- The @emph{free} function must release any resources used by the smob.
- However, it must not free objects managed by the collector; the
- collector will take care of them. For historical reasons, the return
- type of the @emph{free} function should be @code{size_t}, an unsigned
- integral type; the @emph{free} function should always return zero.
- Here is how we might write the @code{free_image} function for the image
- smob type:
- @example
- size_t
- free_image (SCM image_smob)
- @{
- struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
- scm_gc_free (image->pixels, image->width * image->height, "image pixels");
- scm_gc_free (image, sizeof (struct image), "image");
- return 0;
- @}
- @end example
- During the sweep phase, the garbage collector will clear the mark bits
- on all live objects. The code which implements a smob need not do this
- itself.
- There is no way for smob code to be notified when collection is
- complete.
- It is usually a good idea to minimize the amount of processing done
- during garbage collection; keep the @emph{mark} and @emph{free}
- functions very simple. Since collections occur at unpredictable times,
- it is easy for any unusual activity to interfere with normal code.
- @node Garbage Collecting Simple Smobs
- @subsection Garbage Collecting Simple Smobs
- It is often useful to define very simple smob types --- smobs which have
- no data to mark, other than the cell itself, or smobs whose immediate
- data word is simply an ordinary Scheme object, to be marked recursively.
- Guile provides some functions to handle these common cases; you can use
- this function as your smob type's @emph{mark} function, if your smob's
- structure is simple enough.
- If the smob refers to no other Scheme objects, then no action is
- necessary; the garbage collector has already marked the smob cell
- itself. In that case, you can use zero as your mark function.
- If the smob refers to exactly one other Scheme object via its first
- immediate word, you can use @code{scm_markcdr} as its mark function.
- Its definition is simply:
- @smallexample
- SCM
- scm_markcdr (SCM obj)
- @{
- return SCM_SMOB_OBJECT (obj);
- @}
- @end smallexample
- @node Remembering During Operations
- @subsection Remembering During Operations
- @cindex remembering
- It's important that a smob is visible to the garbage collector
- whenever its contents are being accessed. Otherwise it could be freed
- while code is still using it.
- For example, consider a procedure to convert image data to a list of
- pixel values.
- @example
- SCM
- image_to_list (SCM image_smob)
- @{
- struct image *image;
- SCM lst;
- int i;
- scm_assert_smob_type (image_tag, image_smob);
- image = (struct image *) SCM_SMOB_DATA (image_smob);
- lst = SCM_EOL;
- for (i = image->width * image->height - 1; i >= 0; i--)
- lst = scm_cons (scm_from_char (image->pixels[i]), lst);
- scm_remember_upto_here_1 (image_smob);
- return lst;
- @}
- @end example
- In the loop, only the @code{image} pointer is used and the C compiler
- has no reason to keep the @code{image_smob} value anywhere. If
- @code{scm_cons} results in a garbage collection, @code{image_smob} might
- not be on the stack or anywhere else and could be freed, leaving the
- loop accessing freed data. The use of @code{scm_remember_upto_here_1}
- prevents this, by creating a reference to @code{image_smob} after all
- data accesses.
- There's no need to do the same for @code{lst}, since that's the return
- value and the compiler will certainly keep it in a register or
- somewhere throughout the routine.
- The @code{clear_image} example previously shown (@pxref{Type checking})
- also used @code{scm_remember_upto_here_1} for this reason.
- It's only in quite rare circumstances that a missing
- @code{scm_remember_upto_here_1} will bite, but when it happens the
- consequences are serious. Fortunately the rule is simple: whenever
- calling a Guile library function or doing something that might, ensure
- that the @code{SCM} of a smob is referenced past all accesses to its
- insides. Do this by adding an @code{scm_remember_upto_here_1} if
- there are no other references.
- In a multi-threaded program, the rule is the same. As far as a given
- thread is concerned, a garbage collection still only occurs within a
- Guile library function, not at an arbitrary time. (Guile waits for all
- threads to reach one of its library functions, and holds them there
- while the collector runs.)
- @node Double Smobs
- @subsection Double Smobs
- Smobs are called smob because they are small: they normally have only
- room for one @code{void*} or @code{SCM} value plus 16 bits. The
- reason for this is that smobs are directly implemented by using the
- low-level, two-word cells of Guile that are also used to implement
- pairs, for example. (@pxref{Data Representation} for the details.)
- One word of the two-word cells is used for @code{SCM_SMOB_DATA} (or
- @code{SCM_SMOB_OBJECT}), the other contains the 16-bit type tag and
- the 16 extra bits.
- In addition to the fundamental two-word cells, Guile also has
- four-word cells, which are appropriately called @dfn{double cells}.
- You can use them for @dfn{double smobs} and get two more immediate
- words of type @code{scm_t_bits}.
- A double smob is created with @code{SCM_NEWSMOB2} or
- @code{SCM_NEWSMOB3} instead of @code{SCM_NEWSMOB}. Its immediate
- words can be retrieved as @code{scm_t_bits} with
- @code{SCM_SMOB_DATA_2} and @code{SCM_SMOB_DATA_3} in addition to
- @code{SCM_SMOB_DATA}. Unsurprisingly, the words can be set to
- @code{scm_t_bits} values with @code{SCM_SET_SMOB_DATA_2} and
- @code{SCM_SET_SMOB_DATA_3}.
- Of course there are also @code{SCM_SMOB_OBJECT_2},
- @code{SCM_SMOB_OBJECT_3}, @code{SCM_SET_SMOB_OBJECT_2}, and
- @code{SCM_SET_SMOB_OBJECT_3}.
- @node The Complete Example
- @subsection The Complete Example
- Here is the complete text of the implementation of the image datatype,
- as presented in the sections above. We also provide a definition for
- the smob's @emph{print} function, and make some objects and functions
- static, to clarify exactly what the surrounding code is using.
- As mentioned above, you can find this code in the Guile distribution, in
- @file{doc/example-smob}. That directory includes a makefile and a
- suitable @code{main} function, so you can build a complete interactive
- Guile shell, extended with the datatypes described here.)
- @example
- /* file "image-type.c" */
- #include <stdlib.h>
- #include <libguile.h>
- static scm_t_bits image_tag;
- struct image @{
- int width, height;
- char *pixels;
- /* The name of this image */
- SCM name;
- /* A function to call when this image is
- modified, e.g., to update the screen,
- or SCM_BOOL_F if no action necessary */
- SCM update_func;
- @};
- static SCM
- make_image (SCM name, SCM s_width, SCM s_height)
- @{
- SCM smob;
- struct image *image;
- int width = scm_to_int (s_width);
- int height = scm_to_int (s_height);
- /* Step 1: Allocate the memory block.
- */
- image = (struct image *) scm_gc_malloc (sizeof (struct image), "image");
- /* Step 2: Initialize it with straight code.
- */
- image->width = width;
- image->height = height;
- image->pixels = NULL;
- image->name = SCM_BOOL_F;
- image->update_func = SCM_BOOL_F;
- /* Step 3: Create the smob.
- */
- SCM_NEWSMOB (smob, image_tag, image);
- /* Step 4: Finish the initialization.
- */
- image->name = name;
- image->pixels = scm_gc_malloc (width * height, "image pixels");
- return smob;
- @}
- SCM
- clear_image (SCM image_smob)
- @{
- int area;
- struct image *image;
- scm_assert_smob_type (image_tag, image_smob);
- image = (struct image *) SCM_SMOB_DATA (image_smob);
- area = image->width * image->height;
- memset (image->pixels, 0, area);
- /* Invoke the image's update function.
- */
- if (scm_is_true (image->update_func))
- scm_call_0 (image->update_func);
- scm_remember_upto_here_1 (image_smob);
- return SCM_UNSPECIFIED;
- @}
- static SCM
- mark_image (SCM image_smob)
- @{
- /* Mark the image's name and update function. */
- struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
- scm_gc_mark (image->name);
- return image->update_func;
- @}
- static size_t
- free_image (SCM image_smob)
- @{
- struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
- scm_gc_free (image->pixels, image->width * image->height, "image pixels");
- scm_gc_free (image, sizeof (struct image), "image");
- return 0;
- @}
- static int
- print_image (SCM image_smob, SCM port, scm_print_state *pstate)
- @{
- struct image *image = (struct image *) SCM_SMOB_DATA (image_smob);
- scm_puts ("#<image ", port);
- scm_display (image->name, port);
- scm_puts (">", port);
- /* non-zero means success */
- return 1;
- @}
- void
- init_image_type (void)
- @{
- image_tag = scm_make_smob_type ("image", sizeof (struct image));
- scm_set_smob_mark (image_tag, mark_image);
- scm_set_smob_free (image_tag, free_image);
- scm_set_smob_print (image_tag, print_image);
- scm_c_define_gsubr ("clear-image", 1, 0, 0, clear_image);
- scm_c_define_gsubr ("make-image", 3, 0, 0, make_image);
- @}
- @end example
- Here is a sample build and interaction with the code from the
- @file{example-smob} directory, on the author's machine:
- @example
- zwingli:example-smob$ make CC=gcc
- gcc `guile-config compile` -c image-type.c -o image-type.o
- gcc `guile-config compile` -c myguile.c -o myguile.o
- gcc image-type.o myguile.o `guile-config link` -o myguile
- zwingli:example-smob$ ./myguile
- guile> make-image
- #<primitive-procedure make-image>
- guile> (define i (make-image "Whistler's Mother" 100 100))
- guile> i
- #<image Whistler's Mother>
- guile> (clear-image i)
- guile> (clear-image 4)
- ERROR: In procedure clear-image in expression (clear-image 4):
- ERROR: Wrong type (expecting image): 4
- ABORT: (wrong-type-arg)
-
- Type "(backtrace)" to get more information.
- guile>
- @end example
|