\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename ../../info/nxml-mode.info
@settitle nXML Mode
@include docstyle.texi
@c %**end of header

@copying
This manual documents nXML mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007--2017 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying

@dircategory Emacs editing modes
@direntry
* nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
@end direntry

@titlepage
@title nXML mode
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents

@node Top
@top nXML Mode

@insertcopying

This manual is not yet complete.

@menu
* Introduction::
* Completion::
* Inserting end-tags::
* Paragraphs::
* Outlining::
* Locating a schema::
* DTDs::
* Limitations::
* GNU Free Documentation License:: The license for this documentation.
@end menu

@node Introduction
@chapter Introduction

nXML mode is an Emacs major-mode for editing XML documents. It supports
editing well-formed XML documents, and provides schema-sensitive editing
using RELAX NG Compact Syntax. To get started, visit a file containing an
XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
put buffers in nXML mode if they have recognizable XML content or file
extensions. You may wish to customize the settings, for example to
recognize different file extensions.
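
As a minimal sketch of such a customization, the following init-file
fragment associates one more file extension with nXML mode. The
@samp{.wsdl} extension is chosen only as an illustration and is assumed
not to be recognized already; substitute whatever extension your
documents use.

@example
;; Illustrative only: visit files ending in .wsdl in nXML mode.
;; The extension is an example, not something nXML mode requires.
(add-to-list 'auto-mode-alist '("\\.wsdl\\'" . nxml-mode))
@end example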

Once in nXML mode, you can type @kbd{C-h m} for basic information on the
mode.

The @file{etc/nxml} directory in the Emacs distribution contains some data
files used by nXML mode, and includes two files (@file{test-valid.xml} and
@file{test-invalid.xml}) that provide examples of valid and invalid XML
documents.

To get validation and schema-sensitive editing, you need a RELAX NG Compact
Syntax (RNC) schema for your document (@pxref{Locating a schema}). The
@file{etc/schema} directory includes some schemas for popular document
types. See @url{http://relaxng.org/} for more information on RELAX NG@.

You can use the @samp{Trang} program from
@url{http://www.thaiopensource.com/relaxng/trang.html} to
automatically create RNC schemas. This program can:

@itemize @bullet
@item
infer an RNC schema from an instance document;
@item
convert a DTD to an RNC schema;
@item
convert a RELAX NG XML syntax schema to an RNC schema.
@end itemize

@noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
one, you can also use the XSLT stylesheet from
@url{https://github.com/oleg-pavliv/emacs/tree/master/xsl}.
@ignore
@c Original location, now defunct.
@url{http://www.pantor.com/download.html}.
@end ignore

To convert a W3C XML Schema to an RNC schema, you first need to convert it
to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
(built on top of MSV). See @url{https://github.com/kohsuke/msv}
and @url{https://msv.dev.java.net/}.

For historical discussions only, see the mailing list archives at
@url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please hold all new
discussions on the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
lists. Report any bugs with @kbd{M-x report-emacs-bug}.

@node Completion
@chapter Completion

Apart from real-time validation, the most important feature that nXML
mode provides for assisting in document creation is ``completion''.
Completion assists the user in inserting characters at point, based on
knowledge of the schema and on the contents of the buffer before
point.

nXML mode adapts the standard GNU Emacs command for completion in a
buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
@kbd{M-@key{TAB}}. Note that many window systems and window managers
use @kbd{M-@key{TAB}} themselves (typically for switching between
windows) and do not pass it to applications. In that case, you should
type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
@code{completion-at-point} to a key that is convenient for you. The
rest of this chapter assumes that you type @kbd{C-M-i}.
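
A binding along the following lines could go in your init file. This
is only a sketch: the key @kbd{C-c c} is an arbitrary choice made for
illustration, not a binding that nXML mode provides.

@example
;; Sketch: bind completion to a key of your choice in nXML buffers.
;; `C-c c' is an arbitrary example key, not an nXML default.
(with-eval-after-load 'nxml-mode
  (define-key nxml-mode-map (kbd "C-c c") #'completion-at-point))
@end example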

nXML mode completion works by examining the symbol preceding point.
This is the symbol to be completed. The symbol to be completed may be
empty. Completion considers what symbols starting with the symbol
to be completed would be valid replacements for the symbol to be
completed, given the schema and the contents of the buffer before
point. These symbols are the possible completions. An example may
make this clearer. Suppose the buffer looks like this (where @point{}
indicates point):

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<h@point{}
@end example

@noindent
and the schema is XHTML@. In this context, the symbol to be completed
is @samp{h}. The possible completions consist of just
@samp{head}. Another example is:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are @samp{base}, @samp{isindex},
@samp{link}, @samp{meta}, @samp{script},
@samp{style}, and @samp{title}. Another example is:

@example
<html xmlns="@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the only possible
completion is @samp{http://www.w3.org/1999/xhtml}.

When you type @kbd{C-M-i}, what happens depends
on what the set of possible completions is.

@itemize @bullet
@item
If the set of completions is empty, nothing
happens.

@item
If there is one possible completion, then that completion is
inserted, together with any following characters that are
required. For example, in this case:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<@point{}
@end example

@noindent
@kbd{C-M-i} will yield

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head@point{}
@end example

@item
If there is more than one possible completion, but all
possible completions share a common non-empty prefix, then that prefix
is inserted. For example, suppose the buffer is:

@example
<html x@point{}
@end example

@noindent
The symbol to be completed is @samp{x}. The possible completions are
@samp{xmlns} and @samp{xml:lang}. These share a common prefix of
@samp{xml}. Thus, @kbd{C-M-i} will yield:

@example
<html xml@point{}
@end example

@noindent
Typically, you would do @kbd{C-M-i} again, which would have the result
described in the next item.

@item
If there is more than one possible completion, but the
possible completions do not share a non-empty prefix, then Emacs will
prompt you to input the symbol in the minibuffer, initializing the
minibuffer with the symbol to be completed, and popping up a buffer
showing the possible completions. You can now input the symbol to be
inserted. The symbol you input will be inserted in the buffer instead
of the symbol to be completed. Emacs will then insert any required
characters after the symbol. For example, if the buffer contains:

@example
<html xml@point{}
@end example

@noindent
Emacs will prompt you in the minibuffer with

@example
Attribute: xml@point{}
@end example

@noindent
and the buffer showing possible completions will contain

@example
Possible completions are:
xml:lang xmlns
@end example

@noindent
If you input @kbd{xmlns}, the result will be:

@example
<html xmlns="@point{}
@end example

@noindent
(If you do @kbd{C-M-i} again, the namespace URI will be
inserted. Should that happen automatically?)
@end itemize

@node Inserting end-tags
@chapter Inserting end-tags

The main redundancy in XML syntax is end-tags. nXML mode provides
several ways to make it easier to enter end-tags. You can use all of
these without a schema.

You can use @kbd{C-M-i} after @samp{</} to complete the rest of the
end-tag.

@kbd{C-c C-f} inserts an end-tag for the element containing
point. This command is useful when you want to input the start-tag,
then input the content and finally input the end-tag. The @samp{f}
is mnemonic for finish.

If you want to keep tags balanced and input the end-tag at the
same time as the start-tag, before inputting the content, then you can
use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
is similar but more convenient for block-level elements: it puts the
start-tag, point and the end-tag on successive lines, appropriately
indented. The @samp{i} is mnemonic for inline and the
@samp{b} is mnemonic for block.

Finally, you can customize nXML mode so that @kbd{/} automatically
inserts the rest of the end-tag when it occurs after @samp{<}, by
doing

@display
@kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
@end display

@noindent
and then following the instructions in the displayed buffer.
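
If you prefer to keep such settings in your init file rather than
going through the Customize interface, a fragment like the following
should have the same effect. Treat it as a sketch of one possible
setup.

@example
;; Make `/' typed after `<' insert the rest of the end-tag.
(setq nxml-slash-auto-complete-flag t)
@end example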

@node Paragraphs
@chapter Paragraphs

Emacs has several commands that operate on paragraphs, most
notably @kbd{M-q}. nXML mode redefines these to work in a way
that is useful for XML@. The exact rules that are used to find the
beginning and end of a paragraph are complicated; they are designed
mainly to ensure that @kbd{M-q} does the right thing.

A paragraph consists of one or more complete, consecutive lines.
A group of lines is not considered a paragraph unless it contains some
non-whitespace characters between tags or inside comments. A blank
line separates paragraphs. A single tag on a line by itself also
separates paragraphs. More precisely, if one tag together with any
leading and trailing whitespace completely occupies one or more lines,
then those lines will not be included in any paragraph.

A start-tag at the beginning of the line (possibly indented) may
be treated as starting a paragraph. Similarly, an end-tag at the end
of the line may be treated as ending a paragraph. The following rules
are used to determine whether such a tag is in fact treated as a
paragraph boundary:

@itemize @bullet
@item
If the schema does not allow text at that point, then it
is a paragraph boundary.

@item
If the end-tag corresponding to the start-tag is not at
the end of its line, or the start-tag corresponding to the end-tag is
not at the beginning of its line, then it is not a paragraph
boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized</emph> phrase.
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, because its corresponding end-tag is not at the
end of the line.

@item
If there is text that is a sibling in the element tree, then
it is not a paragraph boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized phrase that takes one source line</emph>
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, even though its end-tag is at the end of its
line, because the text @samp{This is a paragraph with an}
is a sibling of the @samp{emph} element.

@item
Otherwise, it is a paragraph boundary.
@end itemize

@node Outlining
@chapter Outlining

nXML mode allows you to display all or part of a buffer as an
outline, in a similar way to Emacs's outline mode. An outline in nXML
mode is based on recognizing two kinds of element: sections and
headings. There is one heading for every section and one section for
every heading. A section contains its heading as or within its first
child element. A section also contains its subordinate sections (its
subsections). The text content of a section consists of anything in a
section that is neither a subsection nor a heading.

Note that this is a different model from that used by XHTML@.
nXML mode's outline support will not be useful for XHTML unless you
adopt a convention of adding a @code{div} to enclose each
section, rather than having sections implicitly delimited by different
@code{h@var{n}} elements. This limitation may be removed
in a future version.

The variable @code{nxml-section-element-name-regexp} gives
a regexp for the local names (i.e., the part of the name following any
prefix) of section elements. The variable
@code{nxml-heading-element-name-regexp} gives a regexp for the
local names of heading elements. For an element to be recognized
as a section

@itemize @bullet
@item
its start-tag must occur at the beginning of a line
(possibly indented);

@item
its local name must match
@code{nxml-section-element-name-regexp};

@item
either its first child element or a descendant of that
first child element must have a local name that matches
@code{nxml-heading-element-name-regexp}; the first such element
is treated as the section's heading.
@end itemize

@noindent
You can customize these variables using @kbd{M-x
customize-variable}.
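
The two variables can also be set from an init file. The values below
are illustrative assumptions only, chosen to show the shape of the
regexps (alternatives separated by @samp{\|}); they are not the
defaults, which recognize more element names.

@example
;; Illustrative values only, not the defaults.
;; Treat <chapter>, <section> and <div> elements as sections ...
(setq nxml-section-element-name-regexp "chapter\\|section\\|div")
;; ... and <title> or <head> elements as their headings.
(setq nxml-heading-element-name-regexp "title\\|head")
@end example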

There are three possible outline states for a section:

@itemize @bullet
@item
normal, showing everything, including its heading, text
content and subsections; each subsection is displayed according to the
state of that subsection;

@item
showing just its heading, with both its text content and
its subsections hidden; all subsections are hidden regardless of their
state;

@item
showing its heading and its subsections, with its text
content hidden; each subsection is displayed according to the state of
that subsection.
@end itemize

In the last two states, where the text content is hidden, the
heading is displayed specially, in an abbreviated form. An element
like this:

@example
<section>
  <title>Food</title>
  <para>There are many kinds of food.</para>
</section>
@end example

@noindent
would be displayed on a single line like this:

@example
<-section>Food...</>
@end example

@noindent
If there are hidden subsections, then a @code{+} will be used
instead of a @code{-} like this:

@example
<+section>Food...</>
@end example

@noindent
If there are non-hidden subsections, then the section will instead be
displayed like this:

@example
<-section>Food...
  <-section>Delicious Food...</>
  <-section>Distasteful Food...</>
</-section>
@end example

@noindent
The heading is always displayed with an indent that corresponds to its
depth in the outline, even if it is not actually indented in the buffer.
The variable @code{nxml-outline-child-indent} controls how much
a subheading is indented with respect to its parent heading when the
heading is being displayed specially.

Commands to change the outline state of sections are bound to
key sequences that start with @kbd{C-c C-o} (@kbd{o} is
mnemonic for outline). The third and final key has been chosen to be
consistent with outline mode. In the following descriptions,
@dfn{current section} means the section containing point, or, more
precisely, the innermost section containing the character immediately
following point.

@itemize @bullet
@item
@kbd{C-c C-o C-a} shows all sections in the buffer
normally.

@item
@kbd{C-c C-o C-t} hides the text content
of all sections in the buffer.

@item
@kbd{C-c C-o C-c} hides the text content
of the current section.

@item
@kbd{C-c C-o C-e} shows the text content
of the current section.

@item
@kbd{C-c C-o C-d} hides the text content
and subsections of the current section.

@item
@kbd{C-c C-o C-s} shows the current section
and all its direct and indirect subsections normally.

@item
@kbd{C-c C-o C-k} shows the headings of the
direct and indirect subsections of the current section.

@item
@kbd{C-c C-o C-l} hides the text content of the
current section and of its direct and indirect
subsections.

@item
@kbd{C-c C-o C-i} shows the headings of the
direct subsections of the current section.

@item
@kbd{C-c C-o C-o} hides as much as possible without
hiding the current section's text content; the headings of ancestor
sections of the current section and their child sections will
not be hidden.
@end itemize

When a heading is displayed specially, you can use
@key{RET} in that heading to show the text content of the section
in the same way as @kbd{C-c C-o C-e}.

You can also use the mouse to change the outline state:
@kbd{S-mouse-2} hides the text content of a section in the same
way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
displayed heading shows the text content of the section in the same
way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
displayed start-tag toggles the display of subheadings on and
off.

The outline state for each section is stored with the first
character of the section (as a text property). Every command that
changes the outline state of any section updates the display of the
buffer so that each section is displayed correctly according to its
outline state. If the section structure is subsequently changed, then
it is possible for the display to no longer correctly reflect the
stored outline state. @kbd{C-c C-o C-r} can be used to refresh
the display so it is correct again.

@node Locating a schema
@chapter Locating a schema

nXML mode has a configurable set of rules to locate a schema for
the file being edited. The rules are contained in one or more schema
locating files, which are XML documents.

The variable @samp{rng-schema-locating-files} specifies
the list of the file-names of schema locating files that nXML mode
should use. The order of the list is significant: when file
@var{x} occurs in the list before file @var{y} then rules
from file @var{x} have precedence over rules from file
@var{y}. A filename specified in
@samp{rng-schema-locating-files} may be relative. If so, it will
be resolved relative to the document for which a schema is being
located. It is not an error if relative file-names in
@samp{rng-schema-locating-files} do not exist. You can use
@kbd{M-x customize-variable @key{RET} rng-schema-locating-files
@key{RET}} to customize the list of schema locating
files.

By default, the @samp{rng-schema-locating-files} list has two
members: @samp{schemas.xml}, and
@samp{@var{dist-dir}/schema/schemas.xml} where
@samp{@var{dist-dir}} is the directory containing the nXML
distribution. The first member will cause nXML mode to use a file
@samp{schemas.xml} in the same directory as the document being
edited if such a file exists. The second member contains rules for the
schemas that are included with the nXML distribution.
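
As a sketch of how the list might be extended from an init file, the
fragment below places a personal schema locating file between the two
default members. The path under @file{~/.emacs.d} is hypothetical, and
the last entry assumes that the bundled @samp{schemas.xml} lives in the
@file{schema} subdirectory of the Emacs data directory; adjust both to
match your installation.

@example
;; Sketch only.  Order matters: rules from earlier files take
;; precedence over rules from later ones.
(setq rng-schema-locating-files
      (list "schemas.xml"    ; relative: next to the document
            ;; Hypothetical personal file, used for illustration.
            (expand-file-name "~/.emacs.d/schemas/schemas.xml")
            ;; Assumed location of the bundled rules.
            (expand-file-name "schema/schemas.xml" data-directory)))
@end example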

@menu
* Commands for locating a schema::
* Schema locating files::
@end menu

@node Commands for locating a schema
@section Commands for locating a schema

The command @kbd{C-c C-s C-w} will tell you what schema
is currently being used.

The rules for locating a schema are applied automatically when
you visit a file in nXML mode. However, if you have just created a new
file and the schema cannot be inferred from the file-name, then this
will not locate the right schema. In this case, you should insert the
start-tag of the root element and then use the command @kbd{C-c C-s
C-a}, which reapplies the rules based on the current content of
the document. It is usually not necessary to insert the complete
start-tag; often just @samp{<@var{name}} is
enough.

If you want to use a schema that has not yet been added to the
schema locating files, you can use the command @kbd{C-c C-s C-f}
to manually select the file containing the schema for the document in
the current buffer. Emacs will read the file-name of the schema from the
minibuffer. After reading the file-name, Emacs will ask whether you
wish to add a rule to a schema locating file that persistently
associates the document with the selected schema. The rule will be
added to the first file in the list specified by
@samp{rng-schema-locating-files}; Emacs will create the file if
necessary, but will not create a directory. If the variable
@samp{rng-schema-locating-files} has not been customized, this
means that the rule will be added to the file @samp{schemas.xml}
in the same directory as the document being edited.

The command @kbd{C-c C-s C-t} allows you to select a schema by
specifying an identifier for the type of the document. The schema
locating files determine the available type identifiers and what
schema is used for each type identifier. This is useful when it is
impossible to infer the right schema from either the file-name or the
content of the document, even though the schema is already in the
schema locating file. A situation in which this can occur is when
there are multiple variants of a schema where all valid documents have
the same document element. For example, XHTML has Strict and
Transitional variants. In a situation like this, a schema locating file
can define a type identifier for each variant. As with @kbd{C-c
C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
locating file that persistently associates the document with the
specified type identifier.

The command @kbd{C-c C-s C-l} adds a rule to a schema
locating file that persistently associates the document with
the schema that is currently being used.

@node Schema locating files
@section Schema locating files

Each schema locating file specifies a list of rules. The rules
from each file are appended in order. To locate a schema each rule is
applied in turn until a rule matches. The first matching rule is then
used to determine the schema.

Schema locating files are designed to be useful for other
applications that need to locate a schema for a document. In fact,
there is nothing specific to locating schemas in the design; it could
equally well be used for locating a stylesheet.

@menu
* Schema locating file syntax basics::
* Using the document's URI to locate a schema::
* Using the document element to locate a schema::
* Using type identifiers in schema locating files::
* Using multiple schema locating files::
@end menu

@node Schema locating file syntax basics
@subsection Schema locating file syntax basics

There is a schema for schema locating files in the file
@samp{locate.rnc} in the schema directory. Schema locating
files must be valid with respect to this schema.

The document element of a schema locating file must be
@samp{locatingRules} and the namespace URI must be
@samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
children of the document element specify rules. The order of the
children is the same as the order of the rules. Here's a complete
example of a schema locating file:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <documentElement localName="book" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
This says to use the schema @samp{xhtml.rnc} for a document with
namespace @samp{http://www.w3.org/1999/xhtml}, and to use the
schema @samp{docbook.rnc} for a document whose document element has
the local name @samp{book}. If the document element had both a
namespace URI of @samp{http://www.w3.org/1999/xhtml} and a local name of
@samp{book}, then the matching rule that comes first would be
used and so the schema @samp{xhtml.rnc} would be used. There is
no precedence between different types of rule; the first matching rule
of any type is used.

As usual with XML-related technologies, resources are identified
by URIs. The @samp{uri} attribute identifies the schema by
specifying the URI@. The URI may be relative. If so, it is resolved
relative to the URI of the schema locating file that contains
the attribute. This means that if the value of the @samp{uri} attribute
does not contain a @samp{/}, then it will refer to a filename in
the same directory as the schema locating file.
  561. @node Using the document's URI to locate a schema
  562. @subsection Using the document's URI to locate a schema
  563. A @samp{uri} rule locates a schema based on the URI of the
  564. document. The @samp{uri} attribute specifies the URI of the
  565. schema. The @samp{resource} attribute can be used to specify
  566. the schema for a particular document. For example,
  567. @example
  568. <uri resource="spec.xml" uri="docbook.rnc"/>
  569. @end example
  570. @noindent
  571. specifies that the schema for @samp{spec.xml} is
  572. @samp{docbook.rnc}.
  573. The @samp{pattern} attribute can be used instead of the
  574. @samp{resource} attribute to specify the schema for any document
  575. whose URI matches a pattern. The pattern has the same syntax as an
  576. absolute or relative URI except that the path component of the URI can
  577. use a @samp{*} character to stand for zero or more characters
  578. within a path segment (i.e., any character other @samp{/}).
  579. Typically, the URI pattern looks like a relative URI, but, whereas a
  580. relative URI in the @samp{resource} attribute is resolved into a
  581. particular absolute URI using the base URI of the schema locating
  582. file, a relative URI pattern matches if it matches some number of
  583. complete path segments of the document's URI ending with the last path
  584. segment of the document's URI@. For example,
  585. @example
  586. <uri pattern="*.xsl" uri="xslt.rnc"/>
  587. @end example
  588. @noindent
  589. specifies that the schema for documents with a URI whose path ends
  590. with @samp{.xsl} is @samp{xslt.rnc}.

A @samp{transformURI} rule locates a schema by
transforming the URI of the document. The @samp{fromPattern}
attribute specifies a URI pattern with the same meaning as the
@samp{pattern} attribute of the @samp{uri} element. The
@samp{toPattern} attribute is a URI pattern that is used to
generate the URI of the schema. Each @samp{*} in the
@samp{toPattern} is replaced by the string that matched the
corresponding @samp{*} in the @samp{fromPattern}. The
resulting string is appended to the initial part of the document's URI
that was not explicitly matched by the @samp{fromPattern}. The
rule matches only if the transformed URI identifies an existing
resource. For example, the rule

@example
<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
@end example

@noindent
would transform the URI @samp{file:///home/jjc/docs/spec.xml}
into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
rule specifies that to locate a schema for a document
@samp{@var{foo}.xml}, Emacs should test whether a file
@samp{@var{foo}.rnc} exists in the same directory as
@samp{@var{foo}.xml}, and, if so, should use it as the
schema.
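
As a sketch of the same substitution with a less trivial
@samp{toPattern}, assume that schemas are kept in a hypothetical
@samp{schema} subdirectory next to the documents:

@example
<transformURI fromPattern="*.xml" toPattern="schema/*.rnc"/>
@end example

@noindent
Following the description above, @samp{*} matches @samp{spec} in
@samp{file:///home/jjc/docs/spec.xml}, so the rule should generate
@samp{file:///home/jjc/docs/schema/spec.rnc}, and it matches only
if that file exists.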

@node Using the document element to locate a schema
@subsection Using the document element to locate a schema

A @samp{documentElement} rule locates a schema based on
the local name and prefix of the document element. For example, a rule

@example
<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the name of the document element is
@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
as the schema. Either the @samp{prefix} or
@samp{localName} attribute may be omitted to allow any prefix or
local name.
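
For instance, leaving out @samp{prefix} gives a rule keyed on the
local name alone. The element name @samp{book} is an assumption
chosen purely for illustration:

@example
<documentElement localName="book" uri="docbook.rnc"/>
@end example

@noindent
Such a rule would select @samp{docbook.rnc} for a document element
with the local name @samp{book}, whatever prefix it carries.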

A @samp{namespace} rule locates a schema based on the
namespace URI of the document element. For example, a rule

@example
<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the namespace URI of the document element is
@samp{http://www.w3.org/1999/XSL/Transform}, then
@samp{xslt.rnc} should be used as the schema.
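
Because a prefix is only a local abbreviation for a namespace URI, a
@samp{namespace} rule keeps working when a document binds the
namespace to a different prefix. As a sketch, consider a stylesheet
whose start-tag uses the made-up prefix @samp{t}:

@example
<t:transform xmlns:t="http://www.w3.org/1999/XSL/Transform" version="1.0">
@end example

@noindent
The @samp{documentElement} rule shown earlier, which expects the
prefix @samp{xsl} and the local name @samp{stylesheet}, would not
match this document, whereas the @samp{namespace} rule above would.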

@node Using type identifiers in schema locating files
@subsection Using type identifiers in schema locating files

Type identifiers allow a level of indirection in locating the
schema for a document. Instead of associating the document directly
with a schema URI, the document is associated with a type identifier,
which is in turn associated with a schema URI@. nXML mode does not
constrain the format of type identifiers. They can simply be strings
without any formal structure or they can be public identifiers or
URIs. Note that these type identifiers have nothing to do with the
DOCTYPE declaration. When comparing type identifiers, whitespace is
normalized in the same way as with the @samp{xsd:token}
datatype: leading and trailing whitespace is stripped; other sequences
of whitespace are normalized to a single space character.

Each of the rules described in previous sections that uses a
@samp{uri} attribute to specify a schema can instead use a
@samp{typeId} attribute to specify a type identifier. The type
identifier can be associated with a URI using a @samp{typeId}
element. For example,

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
  <typeId id="XHTML" typeId="XHTML Strict"/>
  <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
  <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
</locatingRules>
@end example

@noindent
declares three type identifiers: @samp{XHTML} (representing the
default variant of XHTML to be used), @samp{XHTML Strict} and
@samp{XHTML Transitional}. Such a schema locating file would
use @samp{xhtml-strict.rnc} for a document whose namespace is
@samp{http://www.w3.org/1999/xhtml}. But it is considerably
more flexible than a schema locating file that simply specified

@example
<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
@end example

@noindent
A user can easily use @kbd{C-c C-s C-t} to select between XHTML
Strict and XHTML Transitional. Also, a user can easily add a schema
locating file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <typeId id="XHTML" typeId="XHTML Transitional"/>
</locatingRules>
@end example

@noindent
that makes XHTML Transitional the default variant of XHTML.
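
The same indirection can be combined with the other rule types. As a
sketch, using a made-up type identifier @samp{DocBook} and a
hypothetical @samp{.docbook} file suffix:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <documentElement localName="book" typeId="DocBook"/>
  <uri pattern="*.docbook" typeId="DocBook"/>
  <typeId id="DocBook" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
Here two different rules lead to the same type identifier, so the
schema URI is written only once and can be changed in one place.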

@node Using multiple schema locating files
@subsection Using multiple schema locating files

The @samp{include} element includes rules from another
schema locating file. The behavior is exactly as if the rules from
that file were included in place of the @samp{include} element.
Relative URIs are resolved into absolute URIs before the inclusion is
performed. For example,

@example
<include rules="../rules.xml"/>
@end example

@noindent
includes the rules from @samp{rules.xml}.
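
As a sketch, a project-specific schema locating file might state one
rule of its own and then pull in a shared file. The file names here
are hypothetical:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="spec.xml" uri="docbook.rnc"/>
  <include rules="../rules.xml"/>
</locatingRules>
@end example

@noindent
Since the included rules take the place of the @samp{include}
element, the @samp{uri} rule for @samp{spec.xml} comes before
every rule from @samp{../rules.xml} in the resulting list.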

The process of locating a schema takes as input a list of schema
locating files. The rules in all these files and in the files they
include are resolved into a single list of rules, which are applied
strictly in order. Sometimes this order is not what is needed.
For example, suppose you have two schema locating files, a private
file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example

@noindent
followed by a public file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example

@noindent
The effect of these two files is that the XHTML @samp{namespace}
rule takes precedence over the @samp{transformURI} rule, which
is almost certainly not what is needed. This can be solved by adding
an @samp{applyFollowingRules} element to the private file, so that
the @samp{transformURI} rules that follow are applied before the
XHTML @samp{namespace} rule:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <applyFollowingRules ruleType="transformURI"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example

@node DTDs
@chapter DTDs

nXML mode is designed to support the creation of standalone XML
documents that do not depend on a DTD@. Although it is common practice
to insert a DOCTYPE declaration referencing an external DTD, this has
undesirable side-effects. It means that the document is no longer
self-contained. It also means that different XML parsers may interpret
the document in different ways, since the XML Recommendation does not
require XML parsers to read the DTD@. With DTDs, it was impractical to
get validation without using an external DTD or a reference to a
parameter entity. With RELAX NG and other schema languages, you can
simultaneously get the benefits of validation and standalone XML
documents. Therefore, I recommend that you do not reference an
external DTD in your XML documents.

One problem is entities for characters. Typically, as well as
providing validation, DTDs also provide a set of character entities
for documents to use. Schemas cannot provide this functionality,
because schema validation happens after XML parsing. The recommended
solution is to either use the Unicode characters directly, or, if this
is impractical, use character references. nXML mode supports this by
providing commands for entering characters and character references
using the Unicode names, and can display the glyph corresponding to a
character reference.
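
As a small illustration, where a DTD-based document might have
written @samp{caf&eacute;&nbsp;au lait}, a standalone document can
use character references instead. The element name @samp{p} is just
a placeholder:

@example
<p>caf&#xE9;&#xA0;au lait</p>
@end example

@noindent
Here @samp{&#xE9;} refers to U+00E9 (LATIN SMALL LETTER E WITH ACUTE)
and @samp{&#xA0;} to U+00A0 (NO-BREAK SPACE).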

@node Limitations
@chapter Limitations

nXML mode has some limitations:

@itemize @bullet
@item
DTD support is limited. Internal parsed general entities declared
in the internal subset are supported provided they do not contain
elements. Other usage of DTDs is ignored.
@item
The restrictions on RELAX NG schemas in section 7 of the RELAX NG
specification are not enforced.
@end itemize
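
As a sketch of the kind of DTD usage that the first item describes as
supported, the following document declares an internal entity whose
replacement text contains no elements. The names @samp{doc} and
@samp{product} are made up for illustration:

@example
<!DOCTYPE doc [
  <!ENTITY product "nXML mode">
]>
<doc>Edited with &product;.</doc>
@end example

@noindent
By contrast, an entity whose replacement text included markup such as
@samp{<em>nXML</em>} would contain elements, and so would fall outside
what the first item describes as supported.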

@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@bye