- This file describes the JAXP (XML processing) implementation of GNU Classpath.
- GNU Classpath includes interfaces and implementations for basic XML processing
- in the Java programming language, some general-purpose SAX2 utilities, and
- XSL transformation.
- These classes used to be maintained as part of an external project, GNU JAXP,
- but are now integrated with the rest of the core class library provided by
- GNU Classpath.
- PACKAGES
-
- . javax.xml.* ... JAXP 1.3 interfaces
- . gnu.xml.aelfred2.* ... SAX2 parser + validator
- . gnu.xml.dom.* ... DOM Level 3 Core, Traversal, XPath implementation
- . gnu.xml.dom.ls.* ... DOM Level 3 Load & Save implementation
- . gnu.xml.xpath.* ... JAXP XPath implementation
- . gnu.xml.transform.* ... JAXP XSL transformer implementation
- . gnu.xml.pipeline.* ... SAX2 event pipeline support
- . gnu.xml.stream.* ... StAX pull parser and SAX-over-StAX driver
- . gnu.xml.util.* ... various XML utility classes
- . gnu.xml.libxmlj.dom.* ... libxmlj DOM Level 3 Core and XPath
- . gnu.xml.libxmlj.sax.* ... libxmlj SAX parser
- . gnu.xml.libxmlj.transform.* ... libxmlj XSL transformer
- . gnu.xml.libxmlj.util.* ... libxmlj utility classes
- In the external directory you can find the following packages.
- They are not maintained as part of GNU Classpath, but are used by the
- classes in the above packages.
- . org.xml.sax.* ... SAX2 interfaces
- . org.w3c.dom.* ... DOM Level 3 interfaces
- . org.relaxng.datatype.* ... RELAX NG pluggable datatypes API
- CONFORMANCE
- The primary test resources are at http://xmlconf.sourceforge.net
- and include:
- SAX2/XML conformance tests
- The "xml.testing.Driver" addresses the core XML 1.0
- specification requirements, which closely correspond to the
- functionality SAX1 provides. The driver uses SAX2 APIs to
- test that functionality. It is used with a bugfixed version of
- the NIST/OASIS XML conformance test cases.
-
- The AElfred2 parser is highly conformant, though it still takes
- a few implementation shortcuts. See its package documentation
- for information about known XML conformance issues in AElfred2.
- The primary issue is using Unicode character tables, rather than
- those in the XML specification, for determining what names are
- valid. Most applications won't notice the difference, and this
- solution is smaller and faster than the alternative.
- For validation, a secondary issue is that constraints relating to
- entity modularity are not validated; they can't all be cleanly
- layered. For example, validity constraints related to standalone
- declarations and PE nesting are not checked.
- The current implementation has also been tested against Elliotte
- Rusty Harold's SAXTest test suite (http://www.cafeconleche.org/SAXTest)
- and achieves approximately 93% conformance to the SAX specification
- according to these tests, higher than any other current Java parser.
- SAX2
- SAX2 API conformance currently has a minimal JUnit (0.2) test suite,
- which can be accessed at the xmlconf site listed above. It does
- not cover namespaces or the LexicalHandler and DeclHandler extensions
- anywhere near as exhaustively as the SAX1 level functionality is
- tested by the "xml.testing.Driver". However:
- - Applying the DOM unit tests to this implementation gives
- the LexicalHandler (comments, and boundaries of DTDs,
- CDATA sections, and general entities) a workout, and
- does the same for DeclHandler entity declarations.
-
- - The pipeline package's layered validator demands that
- element and attribute declarations are reported correctly.
-
- By those metrics, SAX2 conformance for AElfred2 is also strong.
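As an illustration of the LexicalHandler events mentioned above (comments and CDATA section boundaries), here is a minimal sketch using only the standard SAX2 extension API. It uses whatever parser `SAXParserFactory.newInstance()` returns, which is not necessarily AElfred2; the class name `LexicalEventsSketch` and the helper `lexicalEvents` are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

public class LexicalEventsSketch {
    /** Parses the XML text and returns the lexical events the parser reported. */
    public static List<String> lexicalEvents(String xml) {
        final List<String> events = new ArrayList<>();
        // DefaultHandler2 implements LexicalHandler (and DeclHandler) with no-op methods.
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void comment(char[] ch, int start, int length) {
                events.add("comment:" + new String(ch, start, length));
            }
            @Override public void startCDATA() { events.add("startCDATA"); }
            @Override public void endCDATA() { events.add("endCDATA"); }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 property ID for registering a LexicalHandler.
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(lexicalEvents("<a><!--note--><![CDATA[x]]></a>"));
    }
}
```

A parser that reports lexical events correctly should list the comment first, followed by the start and end of the CDATA section.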
-
- DOM Level 3 Core Tests
- The DOM implementation has been tested against the W3C DOM Level 3
- Core conformance test suite (http://www.w3.org/DOM/Test/). Current
- conformance according to these tests is 72.3%. Many of the test
- failures are due to the fact that GNU JAXP does not currently
- provide any W3C XML Schema support.
- XSL transformation
- The transformer and XPath implementation have been tested against
- the OASIS XSLT and XPath TC test suite. Conformance against the
- Xalan tests is currently 77%.
- libxmlj
- ========================================================================
- libxmlj is an effort to create a 100% JAXP-compatible Java wrapper for
- libxml2 and libxslt. JAXP is the Java API for XML processing, libxml2
- is the XML C library for Gnome, and libxslt is the XSLT C library for
- Gnome.
- libxmlj currently supports most of the DOM Level 3 Core, Traversal, and
- XPath APIs, SAX2, and XSLT transformations. There is no W3C XML Schema
- support yet.
- libxmlj can parse and transform XML documents extremely quickly in
- comparison to Java-based JAXP implementations. DOM manipulations, however,
- involve JNI overhead, so the speed of DOM tree construction and traversal
- can be slower than the Java implementation.
- libxmlj is highly experimental, doesn't always conform to the DOM
- specification correctly, and may leak memory. Production use is not advised.
- The implementation can be found in gnu/xml/libxmlj and native/jni/xmlj.
- See the INSTALL file for the required versions of libxml2 and libxslt.
- configure --enable-xmlj will build it.
- Usage
- ------------------------------------------------------------------------
- To enable the various GNU JAXP factories, set the following system properties
- (command-line version shown, but they can equally be set programmatically):
- AElfred2:
- -Djavax.xml.parsers.SAXParserFactory=gnu.xml.aelfred2.JAXPFactory
- GNU DOM (using DOM Level 3 Load & Save):
- -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.DomDocumentBuilderFactory
- GNU DOM (using AElfred-only pipeline classes):
- -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.JAXPFactory
- GNU XSL transformer:
- -Djavax.xml.transform.TransformerFactory=gnu.xml.transform.TransformerFactoryImpl
- GNU StAX:
- -Djavax.xml.stream.XMLEventFactory=gnu.xml.stream.XMLEventFactoryImpl
- -Djavax.xml.stream.XMLInputFactory=gnu.xml.stream.XMLInputFactoryImpl
- -Djavax.xml.stream.XMLOutputFactory=gnu.xml.stream.XMLOutputFactoryImpl
- GNU SAX-over-StAX:
- -Djavax.xml.parsers.SAXParserFactory=gnu.xml.stream.SAXParserFactory
- libxmlj SAX:
- -Djavax.xml.parsers.SAXParserFactory=gnu.xml.libxmlj.sax.GnomeSAXParserFactory
- libxmlj DOM:
- -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.libxmlj.dom.GnomeDocumentBuilderFactory
- libxmlj XSL transformer:
- -Djavax.xml.transform.TransformerFactory=gnu.xml.libxmlj.transform.GnomeTransformerFactory
- When using libxmlj, the libxmlj shared library must be available.
- In general it is picked up by the runtime using GNU Classpath. If it is not,
- try registering the directory where libxmlj.so is installed
- (by default ${prefix}/lib/classpath/) with ldconfig, or adding it to the
- LD_LIBRARY_PATH environment variable. Additionally, you may need to specify
- the location of your shared libraries to the runtime environment using the
- java.library.path system property.
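The factory properties listed above can also be set from code. The following is a minimal sketch of that, assuming the properties are set before the first `newInstance()` call on the corresponding JAXP factory; the class `GnuJaxpProperties` and its methods are hypothetical names for this example, while the property keys and factory class names are the ones given in the list.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GnuJaxpProperties {
    /** JAXP lookup keys mapped to three of the GNU factory classes listed above. */
    public static Map<String, String> gnuJavaFactories() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("javax.xml.parsers.SAXParserFactory",
                  "gnu.xml.aelfred2.JAXPFactory");
        props.put("javax.xml.parsers.DocumentBuilderFactory",
                  "gnu.xml.dom.DomDocumentBuilderFactory");
        props.put("javax.xml.transform.TransformerFactory",
                  "gnu.xml.transform.TransformerFactoryImpl");
        return props;
    }

    /** Sets each entry as a system property. */
    public static void install(Map<String, String> props) {
        for (Map.Entry<String, String> e : props.entrySet()) {
            System.setProperty(e.getKey(), e.getValue());
        }
    }

    public static void main(String[] args) {
        install(gnuJavaFactories());
        // On a runtime that ships these classes, SAXParserFactory.newInstance()
        // would be expected to resolve to the AElfred2 factory from here on.
        System.out.println(System.getProperty("javax.xml.parsers.SAXParserFactory"));
    }
}
```

On a runtime that does not include the GNU classes, the properties are still set, but the subsequent factory lookup will fail, so this only makes sense when running on GNU Classpath or with the classes on the class path.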
- Missing (libxmlj) Features
- ------------------------------------------------------------------------
- See BUGS in native/jni/xmlj for known bugs in the libxmlj native bindings.
- This implementation should be thread-safe, but currently all
- transformation requests are queued via Java synchronization, which
- means that it is effectively single-threaded. Long story short,
- neither libxml2 nor libxslt is fully reentrant.
- Update: it may be possible to make libxmlj thread-safe nonetheless
- using thread context variables.
- Update: thread context variables have been introduced. This is largely
- untested, though, so libxmlj still has the single-thread
- bottleneck.
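To make the bottleneck concrete, here is a minimal sketch of what queuing transformations via Java synchronization looks like in general. It is not the libxmlj code: it uses the standard javax.xml.transform API with whatever transformer the runtime provides, and the class `SerializedTransforms` and the lock name are invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class SerializedTransforms {
    // One process-wide lock: every transformation request waits its turn.
    private static final Object TRANSFORM_LOCK = new Object();

    /** Runs an identity transformation while holding the global lock. */
    public static String identityTransform(String xml) {
        synchronized (TRANSFORM_LOCK) {
            try {
                Transformer t = TransformerFactory.newInstance().newTransformer();
                t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                t.transform(new StreamSource(new StringReader(xml)),
                            new StreamResult(out));
                return out.toString();
            } catch (TransformerException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(identityTransform("<doc>hi</doc>"));
    }
}
```

Callers on any number of threads get correct results, but because only one thread can hold the lock at a time, throughput does not improve with more threads.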
- Validation
- ===================================================
- Pluggable datatypes
- ---------------------------------------------------
- Validators should use the RELAX NG pluggable datatypes API to retrieve
- datatype (XML Schema simple type) implementations in a schema-neutral
- fashion. The following code demonstrates looking up a W3C XML Schema
- nonNegativeInteger datatype:
-     // DatatypeLibraryLoader (org.relaxng.datatype.helpers) is used as an instance
-     DatatypeLibrary xsd = new DatatypeLibraryLoader()
-         .createDatatypeLibrary(XMLConstants.W3C_XML_SCHEMA_NS_URI);
-     Datatype nonNegativeInteger = xsd.createDatatype("nonNegativeInteger");
- It is also possible to create new types by derivation. For instance,
- to create a datatype that will match a US ZIP code:
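-     DatatypeBuilder b = xsd.createDatatypeBuilder("string");
-     // addParameter takes a ValidationContext as its third argument; null is
-     // passed here on the assumption that the pattern facet needs no context
-     b.addParameter("pattern", "(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)", null);
-     Datatype zipCode = b.createDatatype();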
- A datatype library implementation for XML Schema is provided; other
- library implementations may be added.
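To show which strings the ZIP code pattern above is meant to accept, here is a small sketch that checks the same expression with java.util.regex. This is a stand-in for illustration only: it does not go through the datatype API, and the class `ZipPatternSketch` and the method `looksLikeZip` are hypothetical names.

```java
import java.util.regex.Pattern;

public class ZipPatternSketch {
    // Same expression as the "pattern" parameter in the example above.
    private static final Pattern ZIP =
        Pattern.compile("(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)");

    /** Returns true if the whole value is a 5-digit or ZIP+4 code. */
    public static boolean looksLikeZip(String value) {
        return ZIP.matcher(value).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"12345", "12345-6789", "1234", "12345-"}) {
            System.out.println(s + " -> " + looksLikeZip(s));
        }
    }
}
```

With a validator, the equivalent check would go through the derived datatype rather than a regular expression object.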