
Overview
--------

CBT (callback-based templates) is an experimental system for improving skin
rendering time in MediaWiki and similar applications. The fundamental concept is
a template language whose tags pull text from PHP callbacks. These callbacks do
not simply return text; they also return a description of the dependencies, that
is, the global data upon which the returned text depends. This allows a compiler
to produce a template optimised for a certain context. For example, a
user-dependent template can be produced in which the username, and all text that
depends on user preferences, is replaced by static text.
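
The idea of optimising a template for a context can be pictured with the
following sketch. It is not the CBT implementation: the pre-split template
format, the callback names, the [text, deps] return shape and the dependency
tokens are all assumptions made for illustration.

```php
<?php
// Hypothetical callbacks: each returns [ text, list of dependency tokens ].
$callbacks = [
    'username' => function () { return [ 'Example', [ 'user' ] ]; },
    'title'    => function () { return [ 'Main Page', [ 'title' ] ]; },
];

// Hypothetical pre-split template: plain strings are static text,
// [ 'call', name ] entries are callback tags.
$template = [ 'Hello, ', [ 'call', 'username' ], '! You are viewing ', [ 'call', 'title' ], '.' ];

// Replace every tag whose dependencies are all fixed in this context with
// its text, and leave the other tags in place for render time.
function sketch_optimise( array $template, array $callbacks, array $fixedDeps ): array {
    $out = [];
    foreach ( $template as $piece ) {
        if ( is_array( $piece ) ) {
            list( $text, $deps ) = $callbacks[ $piece[1] ]();
            if ( array_diff( $deps, $fixedDeps ) ) {
                $out[] = $piece; // still dynamic in this context
                continue;
            }
            $piece = $text;
        }
        // Merge adjacent static text into one string.
        $last = count( $out ) - 1;
        if ( $last >= 0 && is_string( $out[$last] ) ) {
            $out[$last] .= $piece;
        } else {
            $out[] = $piece;
        }
    }
    return $out;
}

// Per-user template: 'user' is fixed, 'title' still varies per request.
$perUser = sketch_optimise( $template, $callbacks, [ 'user' ] );
```

Here the username is folded into the surrounding static text, while the title
tag survives as the only remaining callback.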

This was an experimental project to prove the concept -- to explore possible
efficiency gains and techniques. TemplateProcessor was the first element of this
experiment. It is a class written in PHP which parses a template, and produces
either an optimised template with dependencies removed, or the output text
itself. I found that even with a heavily optimised template, this processor was
not fast enough to match the speed of the original MonoBook.

To improve the efficiency, I wrote TemplateCompiler, which takes a template,
preferably pre-optimised by TemplateProcessor, and generates PHP code from it.
The generated code is a single expression, concatenating static text and
callback results. This approach turned out to be very efficient, making
significant time savings compared to the original MonoBook.
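
As a rough illustration of "a single expression, concatenating static text
and callback results", generated code for a small template might look
something like the sketch below. This is not actual TemplateCompiler output;
the callback class, its method names and the template are hypothetical.

```php
<?php
// Hypothetical callback object; the method names are invented for this sketch.
class SketchCallbacks {
    public function title() { return 'Main Page'; }
    public function username() { return 'Example'; }
}

// Hypothetical template: <h1>{title}</h1><p>Hello, {username}</p>
// A compiler in the spirit described above could emit one expression that
// concatenates static text and callback results:
$compiled = 'return "<h1>" . $cb->title() . "</h1><p>Hello, " . $cb->username() . "</p>";';

// The generated code is run with eval(), with $cb in the current scope.
$cb = new SketchCallbacks();
$html = eval( $compiled );
echo $html, "\n";
```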

Despite this success, the code has been shelved for the time being. There were
a number of unresolved implementation problems, and I felt that there were more
pressing priorities for MediaWiki development than solving them and bringing
this module to completion. I also believe that more research is needed into
other possible template architectures. There is nothing fundamentally wrong with
the CBT concept, and I would encourage others to continue its development.

The problems I saw were:

* Extensibility. Can non-Wikimedia installations easily extend and modify CBT
skins? Patching seems to be necessary; is this acceptable? MediaWiki
extensions are another problem. Unless the interfaces allow them to return
dependencies, any hooks will have to be marked dynamic, and will thus be
inefficient.

* Cache invalidation. This is a simple implementation issue, although it would
require extensive modification to the MediaWiki core.

* Syntax. The syntax is minimalistic and easy to parse, but can be quite ugly.
Will generations of MediaWiki users curse my name?

* Security. The code produced by TemplateCompiler is best stored in memcached
and executed with eval(). This allows anyone with access to the memcached port
to run code as the apache user.


Template syntax
---------------

There are two modes: text mode and function mode. The brace characters "{"
and "}" are the only reserved characters. In text mode, either one of them
switches to function mode wherever it appears, with no exceptions.

In text mode, all characters are passed through to the output. In function
mode, text is split into tokens, delimited either by whitespace or by
matching pairs of braces. The first token is taken to be a function name. The
other tokens are first processed in function mode themselves, then they are
passed to the named function as parameters. The return value of the function
is passed through to the output.
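
The token splitting described above can be sketched as follows. This is a
minimal illustration, not the parser in CBTProcessor.php: the function name is
hypothetical, brace-delimited tokens are kept with their outer braces, and the
double-quote shortcut for attributes is ignored.

```php
<?php
// Split function-mode text into tokens. A token is either a run of
// characters containing no whitespace or braces, or a brace-delimited group
// (nested braces included).
function sketch_split_tokens( string $text ): array {
    $whitespace = " \t\r\n";
    $tokens = [];
    $len = strlen( $text );
    $i = 0;
    while ( $i < $len ) {
        $ch = $text[$i];
        if ( strpos( $whitespace, $ch ) !== false ) {
            $i++; // whitespace only separates tokens
            continue;
        }
        $start = $i;
        if ( $ch === '{' ) {
            // Scan to the matching closing brace, tracking nesting depth.
            $depth = 0;
            do {
                if ( $text[$i] === '{' ) {
                    $depth++;
                } elseif ( $text[$i] === '}' ) {
                    $depth--;
                }
                $i++;
            } while ( $depth > 0 && $i < $len );
            if ( $depth > 0 ) {
                throw new InvalidArgumentException( 'Unmatched "{"' );
            }
        } elseif ( $ch === '}' ) {
            throw new InvalidArgumentException( 'Unmatched "}"' );
        } else {
            // Plain token: runs until whitespace or a brace.
            while ( $i < $len && strpos( $whitespace, $text[$i] ) === false
                && $text[$i] !== '{' && $text[$i] !== '}'
            ) {
                $i++;
            }
        }
        $tokens[] = substr( $text, $start, $i - $start );
    }
    return $tokens;
}

// The first token is the function name; the rest are its parameters.
print_r( sketch_split_tokens( 'escape {"hello"}' ) );
```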

Example:
{escape {"hello"}}

The first brace switches to function mode. The function name is escape, and
the first and only parameter is {"hello"}. This parameter is executed. The
braces around the parameter cause the parser to switch to text mode, so the
string "hello", including the quotes, is passed back and used as the argument
to the escape function.

Example:
{if title {
  <h1>{title}</h1>
}}

The function name is "if". The first parameter is the result of calling the
function "title". The second parameter is a level 1 HTML heading containing
the result of the function "title". "if" is a built-in function which will
return the second parameter only if the first is non-blank, so the effect of
this is to return a heading element only if a title exists.
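
The behaviour of "if" can be sketched in PHP as below. The function name is
hypothetical, and "non-blank" is assumed here to mean "not the empty string";
the real built-in may differ, for example by ignoring whitespace.

```php
<?php
// Sketch of the built-in "if". By the time it is called, both parameters
// have already been evaluated to strings.
function sketch_builtin_if( string $condition, string $body ): string {
    // Assumption: "non-blank" means "not the empty string".
    return $condition !== '' ? $body : '';
}

// Equivalent of the template example, with the title already evaluated.
$title = 'Main Page';
echo sketch_builtin_if( $title, "<h1>$title</h1>" ), "\n";
```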

As a shortcut for generation of HTML attributes, if a function mode segment is
surrounded by double quotes, quote characters in the return value will be
escaped. This only applies if the quote character immediately precedes the
opening brace, and immediately follows the closing brace, with no whitespace.
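
A minimal sketch of the effect of this shortcut follows. The text above does
not say how quotes are escaped; replacing them with the &quot; entity is an
assumption, and the function name and template fragment are hypothetical.

```php
<?php
// Escape double quotes in a callback result that lands inside a quoted
// HTML attribute. Using &quot; is an assumption of this sketch.
function sketch_escape_attribute( string $value ): string {
    return str_replace( '"', '&quot;', $value );
}

// Hypothetical template fragment: <a title="{tooltip}">
// The quotes touch the braces directly, so the result of the tooltip
// callback would be escaped before being placed in the attribute.
$tooltip = 'Say "hello"';
$html = '<a title="' . sketch_escape_attribute( $tooltip ) . '">';
echo $html, "\n";
```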

User callback functions are defined by passing a function object to the
template processor. Function names appearing in the text are first checked
against built-in function names, then against the method names in the function
object. The function object forms a sandbox for execution of the template, so
security-conscious users may wish to avoid including functions that allow
arbitrary filesystem access or code execution.
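
The lookup order and the sandbox effect can be sketched as follows. The class,
the dispatcher and the built-in table are hypothetical names for illustration;
they do not reproduce the processor's actual interface.

```php
<?php
// Hypothetical function object: each public method is a template function.
class SketchSkinFunctions {
    public function title(): string { return 'Main Page'; }
    public function sitename(): string { return 'Example Wiki'; }
}

// Resolve a template function name: built-ins first, then the methods of
// the function object. Anything else is rejected, so ordinary PHP functions
// such as system() cannot be reached from a template.
function sketch_call( string $name, array $args, array $builtins, object $functionObj ): string {
    if ( isset( $builtins[$name] ) ) {
        return $builtins[$name]( ...$args );
    }
    if ( method_exists( $functionObj, $name ) ) {
        return $functionObj->$name( ...$args );
    }
    throw new InvalidArgumentException( "Unknown template function: $name" );
}

// Hypothetical built-in table containing only "if".
$builtins = [
    'if' => function ( string $cond, string $body ): string {
        return $cond !== '' ? $body : '';
    },
];

$functions = new SketchSkinFunctions();
echo sketch_call( 'title', [], $builtins, $functions ), "\n";
```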

The callback function will receive its parameters as strings. If the
result of the function depends only on the arguments, and certain things
understood to be "static", such as the source code, then the callback function
should return a string. If the result depends on other things, then the function
should call cbt_value() to get a return value:

return cbt_value( $text, $deps );

where $deps is an array of string tokens, each one naming a dependency. As a
shortcut, if there is only one dependency, $deps may be a string.
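
The two kinds of callback might look like the sketch below. The class, its
methods and the 'user' dependency token are invented for illustration. The
cbt_value() defined here is only a stand-in so that the example runs on its
own; the array it returns is an assumption, not the real return type.

```php
<?php
// Stand-in for cbt_value(), used only when CBTProcessor.php is not loaded.
// The returned array shape is an assumption of this sketch.
if ( !function_exists( 'cbt_value' ) ) {
    function cbt_value( $text, $deps ) {
        // A single dependency may be given as a string.
        return [ 'text' => $text, 'deps' => (array)$deps ];
    }
}

// Hypothetical function object with one static and one dependent callback.
class SketchUserFunctions {
    private $userName;

    public function __construct( string $userName ) {
        $this->userName = $userName;
    }

    // Depends only on its argument, so a plain string is returned.
    public function shout( string $text ): string {
        return strtoupper( $text );
    }

    // Depends on the current user, so the dependency is declared.
    public function username() {
        return cbt_value( $this->userName, 'user' );
    }
}

$functions = new SketchUserFunctions( 'Example' );
$value = $functions->username();
```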


---------------------
Tim Starling 2006