eZ Component: Document, Design
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Design description
==================

ezcDocument interface
---------------------

Interface that defines an abstract document class. Classes that implement
this interface are named after the format, for example 'ezcDocumentHtml'.
These classes can create document objects from a particular type of data
(text, DOM, XML, file) with static functions like 'createFromText()'. Once the
object is created, its content can be retrieved in any type of data that
the format supports, using functions like 'getText()' or 'getXML()'.
If the requested type of data is not supported by the format, the function
throws an exception.
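
The contract described above can be sketched as follows. This is a minimal
illustration, not the component's actual code: 'ezcDocumentWiki' and
'ezcDocumentUnsupportedTypeException' are hypothetical names, and the choice of
which data type is unsupported is an assumption made for the example.

```php
<?php
// Sketch only. ezcDocument, createFromText(), getText() and getXML() come
// from the design text; ezcDocumentWiki and the exception class are
// hypothetical names used for illustration.

class ezcDocumentUnsupportedTypeException extends Exception
{
}

interface ezcDocument
{
    public static function createFromText( $text );
    public function getText();
    public function getXML();
}

class ezcDocumentWiki implements ezcDocument
{
    private $text;

    private function __construct( $text )
    {
        $this->text = $text;
    }

    // Static factory: build a document object from plain text.
    public static function createFromText( $text )
    {
        return new self( $text );
    }

    public function getText()
    {
        return $this->text;
    }

    // Assumption: this format has no XML representation, so the call fails.
    public function getXML()
    {
        throw new ezcDocumentUnsupportedTypeException(
            'XML output is not supported by the wiki format.'
        );
    }
}

$doc = ezcDocumentWiki::createFromText( '= Title =' );
echo $doc->getText(), "\n";

try
{
    $doc->getXML();
}
catch ( ezcDocumentUnsupportedTypeException $e )
{
    echo 'Unsupported: ', $e->getMessage(), "\n";
}
```

Keeping the constructor private forces callers through the 'createFrom...()'
factories, which is one way to make the accepted input types explicit.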

ezcDocConverter interface
-------------------------

Interface that defines an abstract conversion class. Concrete conversion classes
implement conversion of a given document from one format to another.
The names of these classes follow the pattern ezcDocFormat1ToFormat2, where
Format1 and Format2 are format names like 'Html' or 'Docbook'. For example,
'ezcDocHtmlToDocbook' implements conversion from HTML to DocBook.

The main function of this interface, 'convert()', takes a document in one format
and returns it in another. Both the argument and the return value are objects of
classes that implement 'ezcDocument'.
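
A possible shape for such a converter is sketched below. 'SketchDocument' is a
hypothetical stand-in for an 'ezcDocument' implementation so that the example
runs on its own, and the tag replacement is a toy conversion rule, not the real
HTML to DocBook mapping.

```php
<?php
// Hypothetical stand-in for a class implementing ezcDocument.
class SketchDocument
{
    public $text;
    public $format;

    public function __construct( $text, $format )
    {
        $this->text = $text;
        $this->format = $format;
    }

    public function getText()
    {
        return $this->text;
    }
}

// convert() takes a document in one format and returns it in another.
interface ezcDocConverter
{
    public function convert( SketchDocument $document );
}

// Named after the pattern ezcDocFormat1ToFormat2.
class ezcDocHtmlToDocbook implements ezcDocConverter
{
    public function convert( SketchDocument $document )
    {
        if ( $document->format !== 'Html' )
        {
            throw new InvalidArgumentException( 'Expected an Html document.' );
        }

        // Toy rule for illustration only: map <p> to <para>.
        $text = str_replace(
            array( '<p>', '</p>' ),
            array( '<para>', '</para>' ),
            $document->text
        );
        return new SketchDocument( $text, 'Docbook' );
    }
}

$html = new SketchDocument( '<p>Hi</p>', 'Html' );
$converter = new ezcDocHtmlToDocbook();
$docbook = $converter->convert( $html );
echo $docbook->getText(), "\n";
```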

ezcDocParser class
------------------

Contains methods for parsing text documents using a formal grammar. The exact
grammars and format-specific callback functions (if needed) are set in
the format handling classes.

ezcDocXSLTTransformer class
---------------------------

A class based on ezcDocConverterBase for transforming DOM documents using special
rules and element callback handlers. It contains only the methods for document
transformation. The exact rules and element handlers are set in derived classes.

ezcDocOutput class
------------------

Outputs the given document tree as text using a simple internal templating
system. It also takes care of text indentation to show the structure of the
document. The exact templates for element output and the helper formatting
functions are set in derived classes.

ezcDocOutputTemplate class
--------------------------

Implemented in the DocumentTemplateTieIn component. It extends the ezcDocOutput
class to use the Template component for element output.

ezcDocValidator
---------------

Validates a document or a single element against its schema. This class takes
a RelaxNG schema as its input.

Algorithms
==========

Transforming XML
----------------

XML documents are transformed using XSLT stylesheets and the XSL extension for
PHP. Transformations are done with the ezcDocXSLTTransformer class.
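
As a rough illustration of the underlying mechanism, the sketch below applies a
stylesheet with PHP's 'XSLTProcessor' class from the XSL extension. It does not
show ezcDocXSLTTransformer itself; the input document and the stylesheet are
made up for the example, and the DOM and XSL extensions are assumed to be
enabled.

```php
<?php
// Made-up input document.
$xml = new DOMDocument();
$xml->loadXML( '<article><title>Hello</title></article>' );

// Made-up stylesheet that turns the article title into a heading.
$xsl = new DOMDocument();
$xsl->loadXML( <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" omit-xml-declaration="yes"/>
  <xsl:template match="/article">
    <html><body><h1><xsl:value-of select="title"/></h1></body></html>
  </xsl:template>
</xsl:stylesheet>
XSL
);

// Load the stylesheet and transform the input into a new DOM document.
$processor = new XSLTProcessor();
$processor->importStylesheet( $xsl );
$result = $processor->transformToDoc( $xml );

echo $result->saveXML( $result->documentElement ), "\n";
```

Using 'transformToDoc()' keeps the result as a DOM document, so it can be
passed on to further processing without being parsed again.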

Parsing text/XML
----------------

The ezcDocParser class parses the input text and presents it as a DOM tree.

This is not an implementation of a real context-free parser.
It assumes that the input language is XML-like, i.e. that it consists
of elements that have opening and closing parts and some
content between them (which may contain other elements).

Sometimes it is hard or impossible to formalize the input in these terms.
In that case special algorithms or custom element handlers are used.
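
The XML-like assumption can be illustrated with a small sketch. The function
name 'parseXmlLike()', the element table and the bracket markers are all
hypothetical; the real grammars are defined by the format handling classes.
Each element is described only by its opening and closing marker, and nesting
is tracked by walking up and down the DOM tree.

```php
<?php
// Hypothetical element table: name => array( opening marker, closing marker ).
$elements = array(
    'strong'   => array( '[b]', '[/b]' ),
    'emphasis' => array( '[i]', '[/i]' ),
);

function parseXmlLike( $text, array $elements )
{
    $dom = new DOMDocument();
    $root = $dom->appendChild( $dom->createElement( 'document' ) );
    $current = $root;

    // Map every marker to its role and element name.
    $markers = array();
    foreach ( $elements as $name => $pair )
    {
        $markers[$pair[0]] = array( 'open', $name );
        $markers[$pair[1]] = array( 'close', $name );
    }

    // One regexp that matches any marker; the markers are kept as tokens.
    $quoted = array_map(
        function ( $marker ) { return preg_quote( $marker, '#' ); },
        array_keys( $markers )
    );
    $pattern = '#(' . implode( '|', $quoted ) . ')#';
    $tokens = preg_split(
        $pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    foreach ( $tokens as $token )
    {
        if ( !isset( $markers[$token] ) )
        {
            // Plain content between markers.
            $current->appendChild( $dom->createTextNode( $token ) );
        }
        elseif ( $markers[$token][0] === 'open' )
        {
            // Descend into a new element.
            $current = $current->appendChild(
                $dom->createElement( $markers[$token][1] )
            );
        }
        elseif ( $current->nodeName === $markers[$token][1] )
        {
            // Matching closing marker: go back to the parent.
            $current = $current->parentNode;
        }
        else
        {
            throw new Exception( "Unexpected closing marker '$token'." );
        }
    }

    if ( !$current->isSameNode( $root ) )
    {
        throw new Exception( "Element '{$current->nodeName}' is not closed." );
    }
    return $dom;
}

$dom = parseXmlLike( 'plain [b]bold [i]both[/i][/b] end', $elements );
echo $dom->saveXML( $dom->documentElement ), "\n";
```

Input that does not fit this open/content/close model is where the custom
element handlers mentioned above would come in.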

Document output
---------------

The ezcDocOutput class outputs the given document tree as text using a simple
internal templating system. It also takes care of text indentation to show
the structure of the document.

The exact templates for element output and the helper formatting functions are
set in derived classes. Templates are simple strings in which some character
or sequence is replaced with another string using str_replace.

The ezcDocOutputTemplate class is implemented in the DocumentTemplateTieIn
component. It extends this class to use the Template component for element
output.
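
The str_replace based templating and the indentation could look roughly like
this. The '%' placeholder, the template table and the 'renderElement()'
function are assumptions made for the sketch, not the actual ezcDocOutput
interface.

```php
<?php
// Hypothetical templates: '%' marks where the rendered content goes.
// A template that spans several lines is treated as a block whose
// content is indented by one level.
$templates = array(
    'section' => "<section>\n%\n</section>",
    'title'   => '<title>%</title>',
    'para'    => '<para>%</para>',
);

function renderElement( DOMElement $element, array $templates, $indent = '  ' )
{
    // Render child elements recursively and collect non-empty text.
    $parts = array();
    foreach ( $element->childNodes as $child )
    {
        if ( $child instanceof DOMElement )
        {
            $parts[] = renderElement( $child, $templates, $indent );
        }
        elseif ( trim( $child->nodeValue ) !== '' )
        {
            $parts[] = trim( $child->nodeValue );
        }
    }
    $content = implode( "\n", $parts );

    // Assumption: every element name has a template.
    $template = $templates[$element->nodeName];
    if ( strpos( $template, "\n" ) !== false )
    {
        // Indent every line of the content to show the nesting.
        $content = $indent . str_replace( "\n", "\n" . $indent, $content );
    }
    return str_replace( '%', $content, $template );
}

$dom = new DOMDocument();
$dom->loadXML( '<section><title>Intro</title><para>Hello</para></section>' );
$output = renderElement( $dom->documentElement, $templates );
echo $output, "\n";
```

Because indentation is applied to the already rendered content of each block,
nested blocks accumulate one extra level per ancestor without any depth
counter.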

Validating documents
--------------------

ezcDocValidator is used to validate a document or a single element
against its schema.

This class takes a RelaxNG schema as its input, then transforms it
into an internal format for fast processing. The processed schema is stored
in a cached .php file for faster access in the future.

Fast validation is based on regular expressions and strings.
Here is an example::

    <element name="elem1">
      <zeroOrMore>
        <element name="elem2">
          ...
        </element>
      </zeroOrMore>
      <element name="elem3">
        ...
      </element>
    </element>

This RelaxNG schema for the element's content can be represented by the regexp::

    '#^(elem2)*elem3$#'

The children of the validated document element can also be represented by a
string, for instance 'elem2elem2elem3', which is then matched against this
regexp.

A similar process is used for attributes.
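
The content check for 'elem1' can be sketched as below. The helper
'childNameString()' is a hypothetical name, the pattern is written by hand
rather than generated from a schema, and it is anchored with '^' and '$' so
that the whole child sequence has to match.

```php
<?php
// Concatenate the names of all child elements into one string.
function childNameString( DOMElement $element )
{
    $names = '';
    foreach ( $element->childNodes as $child )
    {
        if ( $child->nodeType === XML_ELEMENT_NODE )
        {
            $names .= $child->nodeName;
        }
    }
    return $names;
}

// Hand-written equivalent of the schema: zero or more elem2, then one elem3.
$contentPattern = '#^(elem2)*elem3$#';

$dom = new DOMDocument();
$dom->loadXML( '<elem1><elem2/><elem2/><elem3/></elem1>' );

$children = childNameString( $dom->documentElement );
$isValid = preg_match( $contentPattern, $children ) === 1;

echo $children, ' is ', ( $isValid ? 'valid' : 'invalid' ), "\n";
```

With plain concatenation, element names that are prefixes of one another could
make the string ambiguous, so a real implementation would probably put a
delimiter between the names.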

Examples
========

Converting Format1 to Format2
-----------------------------

::

    // Wrap the source text in a document object of the first format.
    $docFormat1 = new ezcDocumentText( $text, 'format1' );

    // Convert to the internal format, then on to the second format.
    $converter1 = new ezcDocFormat1ToInternal( $parameters1 );
    $docInternal = $converter1->convert( $docFormat1 );
    $converter2 = new ezcDocInternalToFormat2( $parameters2 );
    $docFormat2 = $converter2->convert( $docInternal );
    $result = $docFormat2->getText();

    // The same with static functions:
    $docFormat1 = new ezcDocumentText( $text, 'format1' );
    $docInternal = ezcDocFormat1ToInternal::convert( $docFormat1, $parameters1 );
    $docFormat2 = ezcDocInternalToFormat2::convert( $docInternal, $parameters2 );
    $result = $docFormat2->getText();