- ========================
- eZ Components - Document
- ========================
- .. contents:: Table of Contents
- :depth: 3
- Introduction
- ============
- The document component offers transformations between different semantic markup
- languages, like:
- - `ReStructured text`__
- - `XHTML`__
- - `Docbook`__
- - `eZ Publish XML markup`__
- - Wiki markup languages, like: Creole__, Dokuwiki__ and Confluence__
- - `Open Document Text`__ as used by `OpenOffice.org`__ and other office suites
- As shown in figure 1, each format supports conversion from and to docbook as
- the central intermediate format and may implement additional shortcuts for
- conversions from and to other formats. Not every format can express the same
- semantics, so some information may get lost, which is `documented in a
- dedicated document`__.
- .. figure:: img/document-architecture.png
- :alt: Conversion architecture in document component
- Figure 1: Conversion architecture in document component
- There are central handler classes for each markup language, which follow a
- common conversion interface ezcDocument and all implement the methods
- getAsDocbook() and createFromDocbook().
- Additionally, the document component can render documents in the following
- output formats. These formats can only be generated, not read:
- - PDF
- __ http://docutils.sourceforge.net/rst.html
- __ http://www.w3.org/TR/xhtml1/
- __ http://www.docbook.org/
- __ http://doc.ez.no/eZ-Publish/Technical-manual/4.x/Reference/XML-tags
- __ http://www.wikicreole.org/
- __ http://www.dokuwiki.org/dokuwiki
- __ http://confluence.atlassian.com/renderer/notationhelp.action?section=all
- __ http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=office
- __ http://www.openoffice.org/
- __ Document_conversion.html
- Markup languages
- ================
- The following markup languages are currently handled by the document
- component.
- ReStructured text
- -----------------
- `ReStructured Text`__ (RST) is a simple text based markup language, intended
- to be easy to read and write by humans. Examples can be found in the
- `documentation of RST`__.
- The transformation of a simple RST document to docbook can be done just like
- this:
- .. include:: tutorial/00_00_convert_rst.php
- :literal:
- In line 3 the document is actually loaded and parsed into an internal abstract
- syntax tree. In line 5 the internal structure is then transformed back to a
- docbook document. In the last line the resulting document is returned as a
- string, so that you can echo or store it.
- __ http://docutils.sourceforge.net/rst.html
- __ http://docutils.sourceforge.net/docs/user/rst/quickstart.html
- Error handling
- ^^^^^^^^^^^^^^
- By default each parsing or compiling error will be transformed into an
- exception, so that you are notified about those errors. The error reporting
- settings can be modified, like for all other document handlers::
- <?php
- $document = new ezcDocumentRst();
- $document->options->errorReporting = E_PARSE | E_ERROR | E_WARNING;
- $document->loadFile( '../tutorial.txt' );
- $docbook = $document->getAsDocbook();
- echo $docbook->save();
- ?>
- The setting in line 3 causes only warnings, errors and fatal errors to be
- transformed into exceptions, while notices are still collected, but otherwise
- ignored. This setting affects both the parsing of the source document and the
- compiling into the destination language.
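- The effect of such a mask can be sketched in plain PHP. This is only an
- illustration, assuming the component evaluates the bitmask the same way PHP
- evaluates its own error_reporting setting; the $levels and $asException
- variables are made up for this sketch and are not part of the component.

```php
<?php
// Same mask as in the example above
$errorReporting = E_PARSE | E_ERROR | E_WARNING;

// Severity levels to check against the mask
$levels = array(
    'E_ERROR'   => E_ERROR,
    'E_WARNING' => E_WARNING,
    'E_PARSE'   => E_PARSE,
    'E_NOTICE'  => E_NOTICE,
);

$asException = array();
foreach ( $levels as $name => $level )
{
    // A level is selected if its bit is set in the mask
    $asException[$name] = ( $errorReporting & $level ) === $level;
}

// E_NOTICE is the only level whose bit is missing from the mask
foreach ( $asException as $name => $thrown )
{
    echo $name, ': ', $thrown ? 'exception' : 'collected only', "\n";
}
```

- Adding ``| E_NOTICE`` to the mask would select notices as well.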
- Directives
- ^^^^^^^^^^
- `RST directives`__ are elements in the RST documents with parameters, optional
- named options and optional content. The document component implements a well
- known subset of the `directives implemented in the docutils RST parser`__. You
- may register custom directive handlers, or overwrite existing directive
- handlers using your own implementation. A directive in RST markup with
- parameters, options and content could look like::
- My document
- ===========
- The custom directive:
- .. my_directive:: parameters
- :option: value
- Some indented text...
- For such a directive you should register a handler on the RST document, like::
- <?php
- $document = new ezcDocumentRst();
- $document->registerDirective( 'my_directive', 'myCustomDirective' );
- $document->loadFile( $from );
- $docbook = $document->getAsDocbook();
- $xml = $docbook->save();
- ?>
- The class myCustomDirective must extend the class ezcDocumentRstDirective, and
- implement the method toDocbook(). For rendering you get access to the full AST,
- the contents of the current directive and the base path where the document
- resides in the file system, which is necessary for accessing external files.
- __ http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html#directives
- __ http://docutils.sourceforge.net/docs/ref/rst/directives.html
- Directive example
- `````````````````
- A full example for a custom directive, where we want to embed real world
- addresses into our RST document and maintain the semantics in the resulting
- docbook, could look like::
- Address example
- ===============
- .. address:: John Doe
- :street: Some Lane 42
- We could add more information, like the ZIP code, city and state, but
- skip this to keep the code short. The implemented directive then just
- needs to take this information and transform it into valid docbook XML using
- the DOM extension.
- .. include:: tutorial/00_01_address_directive.php
- :literal:
- The AST node, which should be rendered, is passed to the constructor of the
- custom directive visitor and available in the class property $node. The
- complete DOMDocument and the current DOMNode are passed to the method. In this
- case we just create an `address node`__ with the optional child nodes street and
- personname, depending on the existence of the respective values.
- You can now render the RST document after you have registered your custom
- directive handler as shown above:
- .. include:: tutorial/00_02_custom_directive.php
- :literal:
- The output will then look like::
- <?xml version="1.0"?>
- <article xmlns="http://docbook.org/ns/docbook">
- <section id="address_example">
- <sectioninfo/>
- <title>Address example</title>
- <address>
- <personname> John Doe</personname>
- <street> Some Lane 42</street>
- </address>
- </section>
- </article>
- __ http://docbook.org/tdg/en/html/address.html
- XHTML rendering
- ^^^^^^^^^^^^^^^
- For RST a conversion shortcut has been implemented, so that you don't need to
- convert the RST to docbook and the docbook to XHTML. This saves conversion time
- and avoids the information loss caused by multiple conversions::
- <?php
- $document = new ezcDocumentRst();
- $document->loadFile( $from );
- $xhtml = $document->getAsXhtml();
- $xml = $xhtml->save();
- ?>
- The default XHTML compiler generates complete XHTML documents, including header
- and meta-data in the header. If you want to in-line the result, you may specify
- another XHTML compiler, which just creates an XHTML block level element that
- can be embedded in your source code::
- <?php
- $document = new ezcDocumentRst();
- $document->options->xhtmlVisitor = 'ezcDocumentRstXhtmlBodyVisitor';
- $document->loadFile( $from );
- $xhtml = $document->getAsXhtml();
- $xml = $xhtml->save();
- ?>
- You can of course also use the predefined and custom directives for XHTML
- rendering. The directives used during XHTML generation also need to implement
- the interface ezcDocumentRstXhtmlDirective.
- Modification of XHTML rendering
- ```````````````````````````````
- You can modify the generated output of the XHTML visitor by creating a custom
- visitor for the RST AST. The easiest way is probably to extend one of the
- existing XHTML visitors and reuse it. For example, you may want to fill the
- type attribute in bullet lists, as known from HTML, although this isn't valid
- XHTML::
- class myDocumentRstXhtmlVisitor extends ezcDocumentRstXhtmlVisitor
- {
- protected function visitBulletList( DOMNode $root, ezcDocumentRstNode $node )
- {
- $list = $this->document->createElement( 'ul' );
- $root->appendChild( $list );
- $listTypes = array(
- '*' => 'circle',
- '+' => 'disc',
- '-' => 'square',
- "\xe2\x80\xa2" => 'disc',
- "\xe2\x80\xa3" => 'circle',
- "\xe2\x81\x83" => 'square',
- );
- // Not allowed in XHTML strict
- $list->setAttribute( 'type', $listTypes[$node->token->content] );
- // Decorate list contents
- foreach ( $node->nodes as $child )
- {
- $this->visitNode( $list, $child );
- }
- }
- }
- The structure, which is not enforced for visitors, but used in the docbook and
- XHTML visitors, is to call special methods for each node type in the AST to
- decorate the AST recursively. This method will be called for all bullet list
- nodes in the AST which contain the actual list items. As the first parameter
- the current position in the XHTML DOM tree is also provided to the method.
- To create the XHTML we can now just create a new list node (<ul>) in the
- current DOMNode, set the new attribute, and recursively decorate all
- descendants using the general visitor dispatching method visitNode() for all
- children in the AST. So that the AST children are also rendered as children
- in the XML tree, we pass the just created DOMNode (<ul>) as the new root node
- to the visitNode() method.
- After defining such a class, you can use the custom visitor as shown
- above::
- <?php
- $document = new ezcDocumentRst();
- $document->options->xhtmlVisitor = 'myDocumentRstXhtmlVisitor';
- $document->loadFile( $from );
- $xhtml = $document->getAsXhtml();
- $xml = $xhtml->save();
- ?>
- Now the lists in the generated XHTML will also have the type attribute set.
- Writing RST
- ^^^^^^^^^^^
- Writing a RST document from an existing docbook document, or a
- ezcDocumentDocbook object generated from some other source, is trivial:
- .. include:: tutorial/00_03_write_rst.php
- :literal:
- For the conversion internally the ezcDocumentDocbookToRstConverter class is
- used, which can also be called directly, like::
- $converter = new ezcDocumentDocbookToRstConverter();
- $rst = $converter->convert( $docbook );
- Using this you can configure the converter to your wishes, or extend the
- converter to handle yet unhandled docbook elements. The converter is, as
- usual, configured using its options property, and the options are defined in
- the ezcDocumentDocbookToRstConverterOptions class. There you may configure the
- header underlines used, the bullet types or the line wrapping.
- Extending RST writing
- `````````````````````
- As said before, not all existing docbook elements might already be handled by
- the converter. But its handler based mechanism makes it easy to extend or
- overwrite existing behaviour.
- Similar to the example above we can convert the <address> docbook element back
- to the address RST directive.
- .. include:: tutorial/00_04_address_element.php
- :literal:
- The handler classes are assigned to XML elements in some namespace, "docbook"
- in this case. It is registered in line 18 for the element "address". The class
- itself has to extend from the ezcDocumentElementVisitorHandler class, which is
- in this case already extended by ezcDocumentDocbookToRstBaseHandler, which
- provides some convenience methods for RST creation, like renderDirective() used
- in this example.
- The handler is called, whenever the element, it has been registered for, occurs
- in the docbook XML tree. In this case it has to append the generated RST part
- for this element to the RST document - and may call the general conversion
- handler again for its child elements. This example converts the above shown
- docbook XML back to::
- .. _address_example:
- ===============
- Address example
- ===============
- .. address::
- John Doe
- Some Lane 42
- This ignores any special address sub elements to keep the example simple. For
- more examples of element handlers check the existing implementations.
- XHTML
- -----
- Converting XHTML or HTML to a document markup language is a non trivial task,
- because XHTML elements are often used for layout, ignoring the actual semantics
- of the element. Therefore the document component lets you stack a set of
- filters, each of which performs a specific conversion task. The default filter
- stack may work fine, but you may want to also implement custom filters
- depending on the contents of the filtered website, or to cover additional
- sources of meta data information, like RDF, Microformats or similar.
- The available filters are:
- - ezcDocumentXhtmlElementFilter
- This filter just maintains the common semantics of XHTML elements by
- converting them to their docbook equivalents. It ignores common class names.
- This filter is the most basic and you probably want to always add this one to
- the filter stack.
- - ezcDocumentXhtmlXpathFilter
- The XPath filter takes an XPath expression to locate the root of the document
- contents. It makes no sense to use this one together with the content locator
- filter. This is a more static, but also more precise way to tell the
- converter where to find the actual contents.
- - ezcDocumentXhtmlMetadataFilter
-
- This filter extracts common meta data from the XHTML head, and converts it
- into docbook section info elements.
- - ezcDocumentXhtmlTablesFilter
- HTML tables are especially often used for layout markup. This filter takes a
- threshold, and if the table text factor drops below this threshold the table
- is ignored. The same is true for stacked tables.
- - ezcDocumentXhtmlContentLocatorFilter
- The content locator filter tries to find the actual article in the markup of
- a website, ignoring the surrounding layout markup. This seems to work well
- for example for common news sites.
- By default just the element and meta data filters are used. So the conversion
- of a common website, like the `introduction article`__ from ezcomponents.org,
- results in a docbook document containing all lists for the navigation, etc.
- .. include:: tutorial/01_00_read_html.php
- :literal:
- So let's additionally use the XPath filter to pass the location of the actual
- content to the conversion:
- .. include:: tutorial/01_01_read_html_filtered.php
- :literal:
- With this additional filter, the contents are correctly found and converted
- properly.
- __ http://ezcomponents.org/introduction
- Writing XHTML
- ^^^^^^^^^^^^^
- Writing XHTML from docbook is very similar to the approach used for writing
- RST. It uses the same handler based mechanism, so you may want to check that
- chapter to learn how to extend it for unhandled docbook elements.
- .. include:: tutorial/01_02_write_html.php
- :literal:
- As you can see, this works the same way as any other conversion from Docbook
- to another format.
- HTML styles
- ^^^^^^^^^^^
- By default inline CSS is embedded in all generated HTML, to create a more
- appealing default experience. This may of course be deactivated and you may
- also reference custom style sheets to be included in the generated HTML.
- .. include:: tutorial/01_03_write_html_styled.php
- :literal:
- For this we again use the converter directly to be able to configure it as we
- like.
- eZ XML
- ------
- eZ XML describes the markup format used internally by `eZ Publish`__ for
- storing markup in content objects. The format is roughly specified in the `eZ
- Publish documentation`__.
- Modules often register custom elements, which are not specified anywhere,
- so there might be several elements not handled by default.
- __ http://ez.no/ezpublish
- __ http://ez.no/doc/ez_publish/technical_manual/4_0/reference/xml_tags
- Reading eZ XML
- ^^^^^^^^^^^^^^
- Reading eZ XML is basically the same as for all other formats:
- .. include:: tutorial/02_00_read_ezxml.php
- :literal:
- As always the document object is either constructed from an input string or
- file. To convert into docbook you may just use the method getAsDocbook().
- Link handling
- `````````````
- Inside eZ XML documents link URIs are replaced with IDs, which reference the
- links inside the eZ Publish database, to ensure that a changed link is updated
- globally. The replacing of such links is handled by a class extending from
- ezcDocumentEzXmlLinkProvider. By default dummy URLs are added to the documents.
- URLs are either referenced directly by their ID, a node ID, or an object ID.
- Those parameters are passed to the link provider, which then should return a
- URL for that.
- .. include:: tutorial/02_01_link_provider.php
- :literal:
- The link provider is only implemented as a trivial stub, but you can establish
- a database connection there and actually fetch the required data. In this case
- the generated docbook document looks like::
- <?xml version="1.0"?>
- <!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
- <article xmlns="http://docbook.org/ns/docbook">
- <section>
- <title>Paragraph</title>
- <para>Some content, with a <ulink url="http://host/path/1">link</ulink>.</para>
- </section>
- </article>
- The link provider is again set as an option of the converter. As shown for the
- docbook conversions of the other handlers, you can register element handlers
- for yet unhandled eZ XML elements on the converter, too.
- Writing eZ XML
- ^^^^^^^^^^^^^^
- Writing eZ XML works nearly the same as reading. It again uses an XML based
- element handler, as shown in more detail for the Docbook to RST conversion.
- For the link conversion an object extending from ezcDocumentEzXmlLinkConverter
- is used, which returns an array with the attributes of the link in the eZ XML
- document.
- Wiki markup
- -----------
- Wiki markup has no central standard, but is used as a term to describe some
- common subset with lots of different extensions. Most wiki markup languages
- only support a quite trivial markup with severe limitations on the recursion of
- markup blocks. For example, hardly any wiki markup supports tables containing
- lists, let alone tables containing other tables.
- The document component implements a generic parser to support multiple wiki
- markup languages. For each different markup syntax a tokenizer has to be
- implemented, which converts the implemented markup into a unified token stream,
- which can then be handled by the generic parser.
- The document component currently supports reading three wiki markup languages,
- but new ones can be added easily by implementing another tokenizer. Supported
- are:
- - Creole__, developed by an initiative with the intention to create a unified
- wiki markup standard. This is the default wiki language, and currently the
- only one which can be written.
- Creole currently only supports a very limited set of markup__, all further
- markup additions are still under discussion.
- - Dokuwiki__ is a popular wiki system, for example used on `wiki.php.net`__
- with a quite different syntax, and the most complete markup support, even
- including something like footnotes.
- - Confluence__ is a common Java based wiki with an entirely different and most
- uncommon syntax, which has mainly been implemented to prove the generic
- nature of the parser.
- All markup languages are tested against all examples from the respective
- markup language documentation, but there might still be cases where the parser
- of the original implementation behaves slightly differently from the
- implementation in the document component.
- __ http://www.wikicreole.org/
- __ http://www.wikicreole.org/wiki/Elements
- __ http://www.dokuwiki.org/dokuwiki
- __ http://wiki.php.net/
- __ http://confluence.atlassian.com/renderer/notationhelp.action?section=all
- Reading wiki markup
- ^^^^^^^^^^^^^^^^^^^
- Reading wiki texts basically works like for any other markup language:
- .. include:: tutorial/03_00_read_wiki.php
- :literal:
- As said, by default the Creole tokenizer is used. The same result can be
- produced with dokuwiki markup and switching the tokenizer:
- .. include:: tutorial/03_01_read_wiki_confluence.php
- :literal:
- Writing wiki markup
- ^^^^^^^^^^^^^^^^^^^
- Until now only writing of creole wiki markup is supported. Since creole does
- not support a lot of the markup available in docbook, not all documents might
- get converted properly. Because it does not even support explicit internal
- references, we cannot even simulate footnotes like in HTML.
- If you want to add support for such conversions, it works exactly like the
- docbook RST conversion and can be extended the same way.
- .. include:: tutorial/03_02_write_wiki.php
- :literal:
- PDF
- ---
- PDF (Portable Document Format) has been developed to provide a document
- format which can be presented independently of software and system. Because
- of this it is often used as a pre-print document exchange format.
- The document component can generate PDF documents from all other input formats
- and offers a language very similar to CSS to apply custom styling to the
- generated output. Additionally it supports adding custom parts, like footers
- and headers, to the PDF document.
- Reading PDF
- ^^^^^^^^^^^
- The document component for now does not support reading PDF documents.
- Writing PDF
- ^^^^^^^^^^^
- Writing PDF basically works like writing any other format supported by the
- document component, like the basic example shows:
- .. include:: tutorial/04_01_create_pdf.php
- :literal:
- First we include some RST file to create a Docbook file from it, because, like
- described before, Docbook is the central conversion format.
- Afterwards the Docbook document is loaded by the PDF class and saved. When
- converting the document to a string the PDF is rendered using the default
- options and the default driver. The result of this rendering call can be
- viewed here: `04_01_create_pdf.pdf`__.
- __ 04_01_create_pdf.pdf
- Output writers
- ``````````````
- Since there are numerous different PDF renderers in the PHP world and the
- available ones might depend on the current environment, the document component
- supports different PDF drivers, as wrappers around different existing
- libraries. For now two implementations exist, for pecl/haru and TCPDF, but it
- is fairly easy to write another one for another PDF class.
- Haru
- """"
- libharu__ is an open source PDF generation library, written in C, and wrapped
- by the haru PHP extension, available from PECL__. If PEAR is correctly set up
- on your machine it should install as easily as::
- pear install pecl/haru
- The Haru driver is pretty fast, but currently has issues with some special
- characters. It is the default driver, but can be explicitly used by setting
- the driver option on the PDF class, like::
- $pdf = new ezcDocumentPdf();
- $pdf->options->driver = new ezcDocumentPdfHaruDriver();
- __ http://libharu.org
- __ http://pecl.php.net/package/haru
- TCPDF
- """""
- TCPDF is a pure PHP based PDF generation library, available from
- `tcpdf.org`__. To use the TCPDF driver you need to download and include its
- main class before rendering the PDF. It supports all aspects of PDF rendering
- required by the document component, but has some bad coding practices, like:
- - Throws lots of warnings and notices, which you might want to silence by
- temporarily changing the error reporting level
- - Reads and writes several global variables, which might or might not
- interfere with your application code
- - Uses eval() in several places, which results in non-cacheable opcodes.
- The TCPDF driver can be used after including the TCPDF source code, using::
- $pdf = new ezcDocumentPdf();
- $pdf->options->driver = new ezcDocumentPdfTcpdfDriver();
- __ http://tcpdf.org
- Styling the PDF
- ```````````````
- The PDF output can be styled using a CSS like language, which assigns styles
- based on the Docbook XML structure. The default styling rules are defined in
- the `default.css`__.
- __ https://svn.apache.org/repos/asf/incubator/zetacomponents/trunk/Document/src/pcss/style/default.css
- The first and most relevant part is the set of general layout options, which
- can be defined for the common article root node in the Docbook XML file. You
- can set global font options there, like::
- article {
- // Basic font style definitions
- font-size: "12pt";
- font-family: "serif";
- font-weight: "normal";
- font-style: "normal";
- line-height: "1.4";
- text-align: "left";
- // Basic page layout definitions
- text-columns: "1";
- text-column-spacing: "10mm";
- // General text layout options
- orphans: "3";
- widows: "3";
- }
- The meaning of the first set of options should be obvious from CSS. We require
- each value to be wrapped by quotes for easier parsing, though.
- The second set of options defines options for multi-column layouts, which are
- not available on the web, but quite common in generated PDF documents. You can
- specify the number of text columns, as well as the distance between the text
- columns here.
- The third set in this example defines lesser known text layout options like
- the handling of `orphans and widows`__, which specify the handling of
- overlapping parts of paragraphs on page wrapping.
- You can, of course, apply those styles to any elements in your document, using
- the common CSS addressing rules, like::
- // Emphasis node anywhere in the document
- emphasis { ... }
- // Title element directly below a section element
- section > title { ... }
- // Title element anywhere below a section element
- section title { ... }
- // Title element with the ID "first_title"
- title#first_title { ... }
- // Title element with the class "foo"
- title.foo { ... }
- // emphasis node directly below a title with class "foo", anywhere in a
- // section with the ID "first"
- section#first title.foo > emphasis { ... }
- The values and `measures`__ for the properties are very similar to the
- properties in CSS. For example the margin and padding properties accept one-
- to four-tuples of values, with the same respective meaning as in CSS.
- Another central formatting element, which is special to the PDF generation, is
- the virtual element "page"::
- page {
- page-size: "A4";
- page-orientation: "portrait";
- padding: "22mm 16mm";
- }
- The page-size property accepts several known page size identifiers and the
- page-orientation defines the orientation of a page. You can also address any
- page directly by its ID, which will be 'page_1' for the first page, or its
- class, which will be "right", or "left", depending on the current page number.
- A detailed description of all available `PDF style options`__ is available
- here__.
- __ http://en.wikipedia.org/wiki/Widows_and_orphans
- __ Measures_
- __ Document_styles.html
- __ Document_styles.html
- Measures
- """"""""
- The properties in the PDF component accept different measures, which are:
- - "mm", Millimeters, the default measure, if none is specified
- - "pt", Points, 72 points per inch
- - "px", Pixel, depends on the set resolution, by default also 72 points per
- inch
- - "in", Inch
- The unit "Points" is most common for font sizes, while millimeters or inches
- will probably be more useful for page paddings. You are free to choose any of
- them and can even combine different units in one tuple, like::
- para {
- // Top margin: 12 mm; Right margin: .1 inch; Bottom margin: 10 points,
- // Left margin: 1 pixel
- margin: "12 .1in 10pt 1px";
- }
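- How the units listed above relate to each other can be sketched with a small
- conversion helper. The function measureToMillimeters() is hypothetical and not
- part of the document component; it only assumes the rules stated above
- (millimeters as default unit, 72 points per inch, and pixels depending on a
- resolution that defaults to 72 per inch).

```php
<?php
/**
 * Hypothetical helper, not part of the document component: converts a single
 * measure like "10pt" or ".1in" into millimeters.
 */
function measureToMillimeters( $value, $resolution = 72 )
{
    if ( !preg_match( '/^\s*(\d*\.?\d+)\s*(mm|pt|px|in)?\s*$/', $value, $match ) )
    {
        throw new InvalidArgumentException( "Invalid measure: $value" );
    }

    $number = (float) $match[1];
    // Millimeters are the default if no unit is given
    $unit = !empty( $match[2] ) ? $match[2] : 'mm';

    switch ( $unit )
    {
        case 'in':
            return $number * 25.4;
        case 'pt':
            // 72 points per inch
            return $number * 25.4 / 72;
        case 'px':
            // Pixels depend on the configured resolution
            return $number * 25.4 / $resolution;
        default:
            return $number;
    }
}

// Split the margin tuple from the example above into its single values:
// 12 mm, 2.54 mm, about 3.53 mm and about 0.35 mm
$margin = array_map( 'measureToMillimeters', explode( ' ', '12 .1in 10pt 1px' ) );
```

- With a resolution of 144 instead of 72, the same "1px" would only be half as
- wide, while the other three values would stay unchanged.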
- PDF parts
- `````````
- PDF parts are additional parts in a rendered document, like headers and
- footers. You can implement and register them yourself, and they are activated
- by different triggers, like:
- - on document creation
- - on page creation
- - when a document has been finished
- The default implementation for headers and footers is triggered on page
- creation and renders the title of the document, its author and a page number
- in the header or the footer. To develop a custom PDF part you should extend
- from the ezcDocumentPdfPart class.
- For the following document we are using a set of custom styles, as well as a
- header and a footer to customize the rendered PDF document. The additional
- custom CSS changes the default font and the page border:
- .. include:: tutorial/custom.css
- :literal:
- The code using the custom CSS and headers and footers then looks like:
- .. include:: tutorial/04_02_create_pdf_styled.php
- :literal:
- The first part, the creation of a Docbook document from a RST document, is just
- the same as in the first example.
- Afterwards we load the above mentioned custom.css as an additional style. You
- can load as many styles as you want. If multiple styles are loaded, the latter
- ones always (partly) redefine the first styles.
- After that two custom PDF parts are registered using their respective option
- class to configure their skin. The footer should only show the page number,
- while the header should display all parts (title and author) except the page
- number.
- At the end of the example the document is created as usual, and looks like
- this: `04_02_create_pdf_styled.pdf`__. Since the source document does not
- include any author information, this information is also not rendered in the
- header.
- __ 04_02_create_pdf_styled.pdf
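- The rule that later styles partly redefine earlier ones can be pictured with
- plain PHP arrays. This is only a simplified model and not the component's
- actual style handling: it assumes styles are keyed by identical selectors,
- borrows the default values from the style examples above, and uses a made up
- custom value.

```php
<?php
// Simplified model: selector => ( property => value )
$defaultStyles = array(
    'article' => array( 'font-family' => 'serif', 'font-size' => '12pt' ),
    'page'    => array( 'page-size' => 'A4', 'padding' => '22mm 16mm' ),
);

// Hypothetical custom style, loaded after the default one
$customStyles = array(
    'article' => array( 'font-family' => 'sans-serif' ),
);

// Later styles win per property, everything they do not mention is kept
$effective = array_replace_recursive( $defaultStyles, $customStyles );

echo $effective['article']['font-family'], "\n"; // sans-serif
echo $effective['article']['font-size'], "\n";   // 12pt
echo $effective['page']['page-size'], "\n";      // A4
```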
- Hyphenating
- ```````````
- Proper hyphenation is crucial for nice text rendering, especially for justified
- paragraph formatting. Since hyphenation is highly language dependent you can
- create and use your own custom hyphenator. The default one doesn't do any
- hyphenation, but just keeps every word as it is.
- Custom hyphenators can be implemented by extending from the abstract class
- ezcDocumentPdfHyphenator. They only need to implement one method,
- ``splitWord()``, which should return possible splitting points of the given
- word, as documented in the ezcDocumentPdfHyphenator class.
- The custom hyphenator can be configured in the ezcDocumentPdfOptions class,
- like this::
- $pdf = new ezcDocumentPdf();
- $pdf->options->hyphenator = new myHyphenator();
- The hyphenator will then be used by all text renderers during the rendering
- process.
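- A minimal sketch of such a myHyphenator class could look like the following.
- It is based on assumptions which should be checked against the
- ezcDocumentPdfHyphenator class documentation: that splitWord() returns an
- array of possible splits, each an array of two word parts with the hyphen
- included in the first part, and that an unsplittable word is returned as a
- single part. The tiny syllable dictionary is made up for illustration, and the
- stand-in parent class only exists so the sketch runs without the component.

```php
<?php
// Stand-in so that this sketch also runs without the document component
// loaded; the real abstract class ships with the component.
if ( !class_exists( 'ezcDocumentPdfHyphenator' ) )
{
    abstract class ezcDocumentPdfHyphenator
    {
        abstract public function splitWord( $word );
    }
}

class myHyphenator extends ezcDocumentPdfHyphenator
{
    // Made up syllable dictionary, a real hyphenator would use language rules
    protected $dictionary = array(
        'hyphenation' => array( 'hy', 'phen', 'ation' ),
        'document'    => array( 'doc', 'u', 'ment' ),
    );

    public function splitWord( $word )
    {
        $key = strtolower( $word );
        if ( !isset( $this->dictionary[$key] ) )
        {
            // Unknown words are kept as they are (assumed return shape)
            return array( array( $word ) );
        }

        $syllables = $this->dictionary[$key];
        $splits    = array();
        $offset    = 0;
        for ( $i = 0; $i < count( $syllables ) - 1; ++$i )
        {
            // One possible split after each syllable except the last one
            $offset  += strlen( $syllables[$i] );
            $splits[] = array(
                substr( $word, 0, $offset ) . '-',
                substr( $word, $offset ),
            );
        }
        return $splits;
    }
}

$hyphenator = new myHyphenator();
$splits     = $hyphenator->splitWord( 'hyphenation' );
```

- For the word "hyphenation" this sketch offers the two splits "hy-" /
- "phenation" and "hyphen-" / "ation".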
- Open Document Text
- ------------------
- The Open Document Text (ODT) format is natively provided by the
- `OpenOffice.org`__ office application suite and supported by other common word
- processing tools. The Document component supports importing, exporting and
- styling of ODT files.
- .. note:: So far only import and export of flat ODT (.fodt) files is possible.
- These can be processed by OpenOffice.org natively. To store FODT,
- simply choose the file type from the save dialog.
- Reading ODT
- ^^^^^^^^^^^
- The ODT document class reads FODT files and converts them into the internal
- Docbook representation of the Document component:
- .. include:: tutorial/05_00_read_fodt.php
- :literal:
- You can generate any of the supported document formats from the Docbook
- representation.
- FODT files may contain embedded media files, usually images, which will be
- extracted during the import process. You can specify the directory where these
- images will be stored through the ``imageDir`` option::
- <?php
- $odt->options->imageDir = '/path/to/your/images';
- ?>
- The default is your system's temporary directory.
- Since Open Document contains only little semantic information compared to
- Docbook, the import mechanism performs heuristic detection of information like
- emphasized text. This mechanism is still quite rudimentary and will be made
- available as a public API once it has matured.
- Writing ODT
- ^^^^^^^^^^^^^
- FODT files can be written similarly to any of the other formats supported by the
- Document component:
- .. include:: tutorial/05_01_write_fodt.php
- :literal:
- Styling ODT
- ^^^^^^^^^^^
- FODT output can be styled using a CSS like language similar to `Styling the
- PDF`_. Using simplified CSS you assign style rules to Docbook XML elements,
- which are generated into automatic styles in the resulting Open Document. The
- default styling rules (`default.css`__) are the same as for PDF.
- __ https://svn.apache.org/repos/asf/incubator/zetacomponents/trunk/Document/src/pcss/style/default.css
- Applying custom styles can be done as follows:
- .. include:: tutorial/05_02_write_fodt_styled.php
- :literal:
- A detailed description of the available `style options` is available `here`__.
- __ Document_styles.html
- __ Document_styles.html
- ..
- Local Variables:
- mode: rst
- fill-column: 79
- End:
- vim: et syn=rst tw=79