- eZ Component: Document, Design
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Design description
- ==================
- ezcDocument interface
- ---------------------
- Interface that defines an abstract document. Classes that implement this
- interface are named after the format, for example 'ezcDocumentHtml'. These
- classes can create document objects from a certain type of data (text, DOM,
- XML, file) with static functions like 'createFromText()'. Once the object is
- created, its content can be retrieved in any type of data that the format
- supports, using functions like 'getText()' or 'getXML()'. If the requested
- type of data is not supported by the format, an error is thrown.
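As a rough illustration of the contract described above, the following sketch defines a hypothetical interface and one implementation. The names 'sketchDocument', 'sketchDocumentPlain' and 'getDom()', and the use of RuntimeException to signal an unsupported type of data, are assumptions made for this example; the text itself only names 'createFromText()', 'getText()' and 'getXML()'.

```php
<?php
// Hypothetical sketch only: interface and class names are assumptions,
// not the component's actual API.
interface sketchDocument
{
    // Create a document object from plain text.
    public static function createFromText( $text );

    // Return the content as text.
    public function getText();

    // Return the content as a DOMDocument.
    public function getDom();
}

class sketchDocumentPlain implements sketchDocument
{
    private $text;

    private function __construct( $text )
    {
        $this->text = $text;
    }

    public static function createFromText( $text )
    {
        return new self( $text );
    }

    public function getText()
    {
        return $this->text;
    }

    // This hypothetical format has no DOM representation, so asking for
    // one raises an error (modelled here as an exception).
    public function getDom()
    {
        throw new RuntimeException( 'DOM is not supported by this format.' );
    }
}

$doc = sketchDocumentPlain::createFromText( 'Hello' );
```

Calling '$doc->getText()' returns the original text, while '$doc->getDom()' throws, because this made-up format does not support DOM.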
- ezcDocConverter interface
- -------------------------
- Interface that defines an abstract conversion class. Concrete conversion
- classes implement the conversion of a given document from one format to
- another. The names of these classes follow the pattern ezcDocFormat1ToFormat2,
- where Format1 and Format2 are format names like 'Html' or 'Docbook'. For
- example, 'ezcDocHtmlToDocbook' implements conversion from HTML to DocBook.
- The main function of this interface, 'convert()', takes a document in one
- format and returns it in another. Both the argument and the return value are
- objects of classes that implement 'ezcDocument'.
- ezcDocParser class
- ------------------
- Contains methods for parsing text documents using a formal grammar. The
- concrete grammars and format-specific callback functions (if needed) are set
- in the format handling classes.
- ezcDocXSLTTransformer class
- ---------------------------
- A class based on ezcDocConverterBase for transforming DOM documents using
- special rules and element callback handlers. It contains only the methods for
- document transformation. The concrete rules and element handlers are set in
- derived classes.
- ezcDocOutput class
- ------------------
- Outputs the given document tree as text using a simple internal templating
- system. It also takes care of text indentation to show the structure of the
- document. The concrete templates for element output and the helper formatting
- functions are set in derived classes.
- ezcDocOutputTemplate class
- --------------------------
- Implemented in the DocumentTemplateTieIn component. It extends the ezcDocOutput
- class to use the Template component for element output.
- ezcDocValidator
- ---------------
- Validates a document or a single element against its schema. This class takes
- a schema in RelaxNG format as input.
- Algorithms
- ==========
- Transforming XML
- ----------------
- XML documents are transformed using XSLT stylesheets and the XSL extension for
- PHP. Transformations are done with the ezcDocXSLTTransformer class.
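A minimal sketch of such a transformation using PHP's XSL extension directly (XSLTProcessor together with DOMDocument). The stylesheet and the element names 'para' and 'p' are made up for this example, and the sketch does not show how ezcDocXSLTTransformer wraps these calls.

```php
<?php
// Minimal sketch using PHP's XSL extension (ext/xsl). The stylesheet and
// element names are illustrative assumptions.
$source = new DOMDocument();
$source->loadXML( '<para>Hello</para>' );

$stylesheet = new DOMDocument();
$stylesheet->loadXML(
    '<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
        <xsl:template match="para">
            <p><xsl:value-of select="."/></p>
        </xsl:template>
    </xsl:stylesheet>'
);

$processor = new XSLTProcessor();
$processor->importStylesheet( $stylesheet );

// transformToDoc() returns the result as a new DOMDocument.
$result = $processor->transformToDoc( $source );
$root = $result->documentElement;
```

Here '$root' is the 'p' element produced by the template, containing the text of the source 'para' element.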
- Parsing text/XML
- ----------------
- The ezcDocParser class parses the input text and presents it as a DOM tree.
- This is not an implementation of a real context-free parser. It assumes that
- the input language is XML-like, i.e. that it consists of elements that have
- an opening part, an ending part and some content between them (which may
- contain other elements).
-
- Sometimes it is hard or impossible to formalize the input in these terms. In
- such cases special algorithms or custom element handlers are used.
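The idea of treating the input as nested elements with opening and ending parts can be sketched as follows. The '[tag]...[/tag]' syntax, the 'document' root element and the function name are assumptions chosen for this illustration; they are not the grammar or API of ezcDocParser.

```php
<?php
// Hypothetical sketch: the [tag]...[/tag] syntax and the function name are
// assumptions for illustration only.
function sketchParseToDom( $input )
{
    $dom = new DOMDocument();
    $root = $dom->createElement( 'document' );
    $dom->appendChild( $root );
    $current = $root;

    // Split the input into opening parts, ending parts and text content.
    $tokens = preg_split(
        '#(\[/?[a-z]+\])#',
        $input,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    foreach ( $tokens as $token )
    {
        if ( preg_match( '#^\[([a-z]+)\]$#', $token, $m ) )
        {
            // Opening part: create a child element and descend into it.
            $element = $dom->createElement( $m[1] );
            $current->appendChild( $element );
            $current = $element;
        }
        elseif ( preg_match( '#^\[/([a-z]+)\]$#', $token, $m ) )
        {
            // Ending part: it must close the element that is currently open.
            if ( $current->isSameNode( $root ) || $current->nodeName !== $m[1] )
            {
                throw new RuntimeException( "Unexpected ending part: $token" );
            }
            $current = $current->parentNode;
        }
        else
        {
            // Anything else is text content of the current element.
            $current->appendChild( $dom->createTextNode( $token ) );
        }
    }

    if ( !$current->isSameNode( $root ) )
    {
        throw new RuntimeException( 'Unclosed element: ' . $current->nodeName );
    }
    return $dom;
}

$dom = sketchParseToDom( 'a [b]bold [i]both[/i][/b] z' );
$xml = $dom->saveXML( $dom->documentElement );
```

A stack is not needed here because the DOM tree itself records the currently open element through 'parentNode'.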
- Document output
- ---------------
- The ezcDocOutput class outputs the given document tree as text using a simple
- internal templating system. It also takes care of text indentation to show
- the structure of the document.
-
- The concrete templates for element output and the helper formatting functions
- are set in derived classes. Templates are simple strings in which some
- character or sequence is replaced with another string using str_replace.
- The ezcDocOutputTemplate class is implemented in the DocumentTemplateTieIn
- component. It extends ezcDocOutput to use the Template component for element
- output.
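The str_replace-based templating with indentation could look roughly like this. The '%s' (content) and '%i' (indentation) placeholders, the template array and the function name are assumptions for this sketch, and text escaping is left out.

```php
<?php
// Hypothetical sketch: the '%s' and '%i' placeholders, the template array
// and the function name are assumptions for illustration only.
function sketchOutput( DOMNode $node, array $templates, $depth = 0 )
{
    if ( $node->nodeType === XML_TEXT_NODE )
    {
        return $node->nodeValue;
    }

    // Render the children one indentation level deeper.
    $content = '';
    foreach ( $node->childNodes as $child )
    {
        $content .= sketchOutput( $child, $templates, $depth + 1 );
    }

    // Elements without a template just pass their content through.
    $template = isset( $templates[$node->nodeName] )
        ? $templates[$node->nodeName]
        : '%s';
    $indent = str_repeat( '  ', $depth );

    // '%i' is replaced first, so already rendered content is never rescanned.
    return str_replace( array( '%i', '%s' ), array( $indent, $content ), $template );
}

$templates = array(
    'section' => "%i<div>\n%s%i</div>\n",
    'para'    => "%i<p>%s</p>\n",
);

$dom = new DOMDocument();
$dom->loadXML( '<section><para>One</para><section><para>Two</para></section></section>' );
$output = sketchOutput( $dom->documentElement, $templates );
```

In this sketch each nesting level adds two spaces, so the inner 'section' is rendered as an indented 'div' inside the outer one.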
- Validating documents
- --------------------
- ezcDocValidator is used to validate a document or a single element against
- its schema.
- This class takes a schema in RelaxNG format as input and transforms it into
- an internal format for fast processing. The processed schema is stored in a
- cached .php file for faster access in the future.
- Fast validation is based on regular expressions and strings.
- Here is an example:
-
- <element name="elem1">
-     <zeroOrMore>
-         <element name="elem2">
-             ...
-         </element>
-     </zeroOrMore>
-     <element name="elem3">
-         ...
-     </element>
- </element>
- This RelaxNG schema for the element's content can be represented by the regexp:
- '#(elem2)*elem3#'
- The children of the document element being validated can also be represented
- by a string, for instance 'elem2elem2elem3', which is then matched against
- this regexp.
-
- A similar process is used for attributes.
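The steps above can be sketched as follows. This is not the validator's implementation: the function names are hypothetical, only 'element', 'zeroOrMore', 'oneOrMore' and 'optional' patterns are handled, the elided element contents are replaced with '<text/>' for the sake of a complete example, and the '^' and '$' anchors are an assumption added so that the whole child string has to match.

```php
<?php
// Hypothetical sketch: function names and the handled subset of RelaxNG
// are assumptions for illustration only.

// Build a regexp body for the allowed children of a RelaxNG pattern.
function sketchContentRegexp( DOMElement $pattern )
{
    $quantifiers = array( 'zeroOrMore' => '*', 'oneOrMore' => '+', 'optional' => '?' );
    $regexp = '';
    foreach ( $pattern->childNodes as $child )
    {
        if ( $child->nodeType !== XML_ELEMENT_NODE )
        {
            continue;
        }
        if ( $child->localName === 'element' )
        {
            // A child element contributes its name to the regexp.
            $regexp .= preg_quote( $child->getAttribute( 'name' ), '#' );
        }
        elseif ( isset( $quantifiers[$child->localName] ) )
        {
            // A repetition wraps its own content in a quantified group.
            $regexp .= '(?:' . sketchContentRegexp( $child ) . ')'
                . $quantifiers[$child->localName];
        }
        // Other patterns (text, choice, attribute, ...) are ignored here.
    }
    return $regexp;
}

// Concatenate the names of the child elements and match them.
function sketchValidateChildren( DOMElement $element, $regexp )
{
    $children = '';
    foreach ( $element->childNodes as $child )
    {
        if ( $child->nodeType === XML_ELEMENT_NODE )
        {
            $children .= $child->nodeName;
        }
    }
    // Anchors (an assumption of this sketch) force a full match.
    return preg_match( '#^' . $regexp . '$#', $children ) === 1;
}

$schema = new DOMDocument();
$schema->loadXML(
    '<element name="elem1" xmlns="http://relaxng.org/ns/structure/1.0">
        <zeroOrMore><element name="elem2"><text/></element></zeroOrMore>
        <element name="elem3"><text/></element>
    </element>'
);
$regexp = sketchContentRegexp( $schema->documentElement );

$valid = new DOMDocument();
$valid->loadXML( '<elem1><elem2/><elem2/><elem3/></elem1>' );
$invalid = new DOMDocument();
$invalid->loadXML( '<elem1><elem3/><elem2/></elem1>' );

$validResult = sketchValidateChildren( $valid->documentElement, $regexp );
$invalidResult = sketchValidateChildren( $invalid->documentElement, $regexp );
```

Because element names are concatenated without a separator, names that are prefixes of each other (such as 'elem1' and 'elem10') could be confused; a real implementation would need a delimiter or a similar safeguard.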
- Examples
- ========
- Converting Format1 to Format2
- -----------------------------
- // Wrap the input text in a document object of the source format.
- $docFormat1 = new ezcDocumentText( $text, 'format1' );
- // Convert the source format to the internal format.
- $converter1 = new ezcDocFormat1ToInternal( $parameters1 );
- $docInternal = $converter1->convert( $docFormat1 );
- // Convert the internal format to the target format.
- $converter2 = new ezcDocInternalToFormat2( $parameters2 );
- $docFormat2 = $converter2->convert( $docInternal );
- // Get the converted document as text.
- $result = $docFormat2->getText();
- // The same with static functions:
- $docFormat1 = new ezcDocumentText( $text, 'format1' );
- $docInternal = ezcDocFormat1ToInternal::convert( $docFormat1, $parameters1 );
- $docFormat2 = ezcDocInternalToFormat2::convert( $docInternal, $parameters2 );
- $result = $docFormat2->getText();