  1. \input texinfo
  2. @include gnus-overrides.texi
  3. @setfilename ../../info/emacs-mime.info
  4. @settitle Emacs MIME Manual
  5. @include docstyle.texi
  6. @synindex fn cp
  7. @synindex vr cp
  8. @synindex pg cp
  9. @copying
  10. This file documents the Emacs MIME interface functionality.
  11. Copyright @copyright{} 1998--2017 Free Software Foundation, Inc.
  12. @quotation
  13. Permission is granted to copy, distribute and/or modify this document
  14. under the terms of the GNU Free Documentation License, Version 1.3 or
  15. any later version published by the Free Software Foundation; with no
  16. Invariant Sections, with the Front-Cover Texts being ``A GNU Manual'',
  17. and with the Back-Cover Texts as in (a) below. A copy of the license
  18. is included in the section entitled ``GNU Free Documentation License''.
  19. (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
  20. modify this GNU manual.''
  21. @end quotation
  22. @end copying
  23. @c Node ``Interface Functions'' uses non-ASCII characters
  24. @dircategory Emacs lisp libraries
  25. @direntry
  26. * Emacs MIME: (emacs-mime). Emacs MIME de/composition library.
  27. @end direntry
  28. @iftex
  29. @finalout
  30. @end iftex
  31. @setchapternewpage odd
  32. @titlepage
  33. @ifset WEBHACKDEVEL
  34. @title Emacs MIME Manual (DEVELOPMENT VERSION)
  35. @end ifset
  36. @ifclear WEBHACKDEVEL
  37. @title Emacs MIME Manual
  38. @end ifclear
  39. @author by Lars Magne Ingebrigtsen
  40. @page
  41. @vskip 0pt plus 1filll
  42. @insertcopying
  43. @end titlepage
  44. @contents
  45. @node Top
  46. @top Emacs MIME
This manual documents the libraries used to compose and display
@acronym{MIME} messages.

This manual is directed at users who want to modify the behavior of
the @acronym{MIME} encoding/decoding process or want a more detailed
picture of how the Emacs @acronym{MIME} library works, and at people who
want to write functions and commands that manipulate @acronym{MIME} elements.

@acronym{MIME} is short for @dfn{Multipurpose Internet Mail Extensions}.
This standard is documented in a number of RFCs, mainly RFC2045 (Format
of Internet Message Bodies), RFC2046 (Media Types), RFC2047 (Message
Header Extensions for Non-@acronym{ASCII} Text), RFC2048 (Registration
Procedures), and RFC2049 (Conformance Criteria and Examples). It is highly
recommended that anyone who intends to write @acronym{MIME}-compliant software
read at least RFC2045 and RFC2047.
  60. @ifnottex
  61. @insertcopying
  62. @end ifnottex
  63. @menu
  64. * Decoding and Viewing:: A framework for decoding and viewing.
  65. * Composing:: @acronym{MML}; a language for describing @acronym{MIME} parts.
  66. * Interface Functions:: An abstraction over the basic functions.
  67. * Basic Functions:: Utility and basic parsing functions.
  68. * Standards:: A summary of RFCs and working documents used.
  69. * GNU Free Documentation License:: The license for this documentation.
  70. * Index:: Function and variable index.
  71. @end menu
  72. @node Decoding and Viewing
  73. @chapter Decoding and Viewing
  74. This chapter deals with decoding and viewing @acronym{MIME} messages on a
  75. higher level.
  76. The main idea is to first analyze a @acronym{MIME} article, and then allow
  77. other programs to do things based on the list of @dfn{handles} that are
  78. returned as a result of this analysis.
  79. @menu
  80. * Dissection:: Analyzing a @acronym{MIME} message.
  81. * Non-MIME:: Analyzing a non-@acronym{MIME} message.
  82. * Handles:: Handle manipulations.
  83. * Display:: Displaying handles.
  84. * Display Customization:: Variables that affect display.
  85. * Files and Directories:: Saving and naming attachments.
  86. * New Viewers:: How to write your own viewers.
  87. @end menu
  88. @node Dissection
  89. @section Dissection
@code{mm-dissect-buffer} is the function responsible for dissecting
a @acronym{MIME} article. If given a multipart message, it will recursively
descend the message, following the structure, and return a tree of
@acronym{MIME} handles that describes the structure of the message.
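
As a minimal sketch of the above (the sample message text is made up
for illustration), the following dissects a single-part message in a
temporary buffer, reads the media type from the resulting handle, and
then frees the handle:

@lisp
(require 'mm-decode)

(with-temp-buffer
  ;; A tiny hand-written message; real callers would use an article buffer.
  (insert "MIME-Version: 1.0\n"
          "Content-Type: text/plain; charset=us-ascii\n"
          "\n"
          "Hello, world.\n")
  (let ((handle (mm-dissect-buffer)))
    (prog1 (mm-handle-media-type handle)
      ;; Release the buffer that the handle refers to.
      (mm-destroy-parts handle))))
@end lisp

For a multipart message the return value is a tree of handles instead,
so code that walks it should be prepared to recurse.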
  94. @node Non-MIME
  95. @section Non-MIME
  96. @vindex mm-uu-configure-list
Gnus also understands some non-@acronym{MIME} attachments, such as
postscript, uuencode, binhex, yenc, shar, forward, gnatsweb, pgp, and
diff. Each of these features can be disabled by adding an item to
@code{mm-uu-configure-list}. For example,
  101. @lisp
  102. (require 'mm-uu)
  103. (add-to-list 'mm-uu-configure-list '(pgp-signed . disabled))
  104. @end lisp
  105. @table @code
  106. @item postscript
  107. @findex postscript
  108. PostScript file.
  109. @item uu
  110. @findex uu
  111. Uuencoded file.
  112. @item binhex
  113. @findex binhex
  114. Binhex encoded file.
  115. @item yenc
  116. @findex yenc
  117. Yenc encoded file.
  118. @item shar
  119. @findex shar
  120. Shar archive file.
  121. @item forward
  122. @findex forward
  123. Non-@acronym{MIME} forwarded message.
  124. @item gnatsweb
  125. @findex gnatsweb
  126. Gnatsweb attachment.
  127. @item pgp-signed
  128. @findex pgp-signed
  129. @acronym{PGP} signed clear text.
  130. @item pgp-encrypted
  131. @findex pgp-encrypted
  132. @acronym{PGP} encrypted clear text.
  133. @item pgp-key
  134. @findex pgp-key
  135. @acronym{PGP} public keys.
  136. @item emacs-sources
  137. @findex emacs-sources
  138. @vindex mm-uu-emacs-sources-regexp
  139. Emacs source code. This item works only in the groups matching
  140. @code{mm-uu-emacs-sources-regexp}.
  141. @item diff
  142. @vindex diff
  143. @vindex mm-uu-diff-groups-regexp
  144. Patches. This is intended for groups where diffs of committed files
  145. are automatically sent to. It only works in groups matching
  146. @code{mm-uu-diff-groups-regexp}.
  147. @item verbatim-marks
  148. @cindex verbatim-marks
  149. Slrn-style verbatim marks.
  150. @item LaTeX
  151. @cindex LaTeX
  152. LaTeX documents. It only works in groups matching
  153. @code{mm-uu-tex-groups-regexp}.
  154. @end table
  155. @cindex text/x-verbatim
  156. @c Is @vindex suitable for a face?
  157. @vindex mm-uu-extract
Some inlined non-@acronym{MIME} attachments are displayed using the face
@code{mm-uu-extract}. By default, no @acronym{MIME} button for these
parts is displayed. You can force displaying a button using @kbd{K b}
(@code{gnus-summary-display-buttonized}) or by adding @code{text/x-verbatim}
to @code{gnus-buttonized-mime-types}. @xref{MIME Commands, ,MIME
Commands, gnus, Gnus Manual}.
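
For instance, showing a button for these parts by way of
@code{gnus-buttonized-mime-types} might look like this in an init file.
This is only a sketch, and it assumes that loading @file{gnus-art} at
that point is acceptable:

@lisp
(require 'gnus-art)
;; Always show a button for inlined non-MIME attachments.
(add-to-list 'gnus-buttonized-mime-types "text/x-verbatim")
@end lisp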
  164. @node Handles
  165. @section Handles
  166. A @acronym{MIME} handle is a list that fully describes a @acronym{MIME}
  167. component.
  168. The following macros can be used to access elements in a handle:
  169. @table @code
  170. @item mm-handle-buffer
  171. @findex mm-handle-buffer
  172. Return the buffer that holds the contents of the undecoded @acronym{MIME}
  173. part.
  174. @item mm-handle-type
  175. @findex mm-handle-type
  176. Return the parsed @code{Content-Type} of the part.
  177. @item mm-handle-encoding
  178. @findex mm-handle-encoding
  179. Return the @code{Content-Transfer-Encoding} of the part.
  180. @item mm-handle-undisplayer
  181. @findex mm-handle-undisplayer
  182. Return the object that can be used to remove the displayed part (if it
  183. has been displayed).
  184. @item mm-handle-set-undisplayer
  185. @findex mm-handle-set-undisplayer
  186. Set the undisplayer object.
  187. @item mm-handle-disposition
  188. @findex mm-handle-disposition
  189. Return the parsed @code{Content-Disposition} of the part.
@item mm-get-content-id
@findex mm-get-content-id
Return the handle(s) referred to by @code{Content-ID}.
  192. @end table
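
The accessors above can be combined into a small helper. The function
name @code{my-mm-describe-handle} is hypothetical and only serves as an
illustration; the sketch assumes its argument is a single-part handle,
not a multipart list:

@lisp
(require 'mm-decode)

(defun my-mm-describe-handle (handle)
  "Return a plist summarizing the single-part MIME HANDLE."
  (list :type (mm-handle-type handle)           ; parsed Content-Type
        :encoding (mm-handle-encoding handle)   ; Content-Transfer-Encoding
        :disposition (mm-handle-disposition handle)
        ;; Size of the undecoded contents held in the handle's buffer.
        :size (buffer-size (mm-handle-buffer handle))))
@end lisp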
  193. @node Display
  194. @section Display
  195. Functions for displaying, removing and saving.
  196. @table @code
  197. @item mm-display-part
  198. @findex mm-display-part
  199. Display the part.
  200. @item mm-remove-part
  201. @findex mm-remove-part
  202. Remove the part (if it has been displayed).
  203. @item mm-inlinable-p
  204. @findex mm-inlinable-p
  205. Say whether a @acronym{MIME} type can be displayed inline.
  206. @item mm-automatic-display-p
  207. @findex mm-automatic-display-p
  208. Say whether a @acronym{MIME} type should be displayed automatically.
  209. @item mm-destroy-part
  210. @findex mm-destroy-part
  211. Free all resources occupied by a part.
  212. @item mm-save-part
  213. @findex mm-save-part
  214. Offer to save the part in a file.
  215. @item mm-pipe-part
  216. @findex mm-pipe-part
  217. Offer to pipe the part to some process.
  218. @item mm-interactively-view-part
  219. @findex mm-interactively-view-part
  220. Prompt for a mailcap method to use to view the part.
  221. @end table
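
A short sketch of how these functions might be combined follows. The
name @code{my-mm-show-or-save} is hypothetical; the function displays a
part at point in the current buffer when the library can show it
inline, and otherwise offers to save it:

@lisp
(require 'mm-decode)

(defun my-mm-show-or-save (handle)
  "Display HANDLE inline if possible; otherwise offer to save it."
  (if (mm-inlinable-p handle)
      (mm-display-part handle)
    (mm-save-part handle)))
@end lisp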
  222. @node Display Customization
  223. @section Display Customization
  224. @table @code
  225. @item mm-inline-media-tests
  226. @vindex mm-inline-media-tests
  227. This is an alist where the key is a @acronym{MIME} type, the second element
  228. is a function to display the part @dfn{inline} (i.e., inside Emacs), and
  229. the third element is a form to be @code{eval}ed to say whether the part
  230. can be displayed inline.
  231. This variable specifies whether a part @emph{can} be displayed inline,
  232. and, if so, how to do it. It does not say whether parts are
  233. @emph{actually} displayed inline.
  234. @item mm-inlined-types
  235. @vindex mm-inlined-types
  236. This, on the other hand, says what types are to be displayed inline, if
  237. they satisfy the conditions set by the variable above. It's a list of
  238. @acronym{MIME} media types.
  239. @item mm-automatic-display
  240. @vindex mm-automatic-display
  241. This is a list of types that are to be displayed ``automatically'', but
  242. only if the above variable allows it. That is, only inlinable parts can
  243. be displayed automatically.
  244. @item mm-automatic-external-display
  245. @vindex mm-automatic-external-display
  246. This is a list of types that will be displayed automatically in an
  247. external viewer.
  248. @item mm-keep-viewer-alive-types
  249. @vindex mm-keep-viewer-alive-types
  250. This is a list of media types for which the external viewer will not
  251. be killed when selecting a different article.
  252. @item mm-attachment-override-types
  253. @vindex mm-attachment-override-types
  254. Some @acronym{MIME} agents create parts that have a content-disposition of
  255. @samp{attachment}. This variable allows overriding that disposition and
  256. displaying the part inline. (Note that the disposition is only
  257. overridden if we are able to, and want to, display the part inline.)
  258. @item mm-discouraged-alternatives
  259. @vindex mm-discouraged-alternatives
  260. List of @acronym{MIME} types that are discouraged when viewing
  261. @samp{multipart/alternative}. Viewing agents are supposed to view the
  262. last possible part of a message, as that is supposed to be the richest.
  263. However, users may prefer other types instead, and this list says what
  264. types are most unwanted. If, for instance, @samp{text/html} parts are
  265. very unwanted, and @samp{text/richtext} parts are somewhat unwanted,
  266. you could say something like:
  267. @lisp
(setq mm-discouraged-alternatives
      '("text/html" "text/richtext")
      mm-automatic-display
      (remove "text/html" mm-automatic-display))
  272. @end lisp
  273. Adding @code{"image/.*"} might also be useful. Spammers use images as
  274. the preferred part of @samp{multipart/alternative} messages, so you might
  275. not notice there are other parts. See also
  276. @code{gnus-buttonized-mime-types}, @ref{MIME Commands, ,MIME Commands,
  277. gnus, Gnus Manual}. After adding @code{"multipart/alternative"} to
  278. @code{gnus-buttonized-mime-types} you can choose manually which
  279. alternative you'd like to view. For example, you can set those
  280. variables like:
  281. @lisp
(setq gnus-buttonized-mime-types
      '("multipart/alternative" "multipart/signed")
      mm-discouraged-alternatives
      '("text/html" "image/.*"))
  286. @end lisp
  287. In this case, Gnus will display radio buttons for such a kind of spam
  288. message as follows:
  289. @example
  290. 1. (*) multipart/alternative ( ) image/gif
  291. 2. (*) text/plain ( ) text/html
  292. @end example
  293. @item mm-inline-large-images
  294. @vindex mm-inline-large-images
  295. When displaying inline images that are larger than the window, Emacs
  296. does not enable scrolling, which means that you cannot see the whole
  297. image. To prevent this, the library tries to determine the image size
  298. before displaying it inline, and if it doesn't fit the window, the
  299. library will display it externally (e.g., with @samp{ImageMagick} or
  300. @samp{xv}). Setting this variable to @code{t} disables this check and
  301. makes the library display all inline images as inline, regardless of
  302. their size. If you set this variable to @code{resize}, the image will
  303. be displayed resized to fit in the window, if Emacs has the ability to
  304. resize images.
  305. @item mm-inline-large-images-proportion
@vindex mm-inline-large-images-proportion
  307. The proportion used when resizing large images.
  308. @item mm-inline-override-types
  309. @vindex mm-inline-override-types
  310. @code{mm-inlined-types} may include regular expressions, for example to
  311. specify that all @samp{text/.*} parts be displayed inline. If a user
  312. prefers to have a type that matches such a regular expression be treated
  313. as an attachment, that can be accomplished by setting this variable to a
  314. list containing that type. For example assuming @code{mm-inlined-types}
  315. includes @samp{text/.*}, then including @samp{text/html} in this
  316. variable will cause @samp{text/html} parts to be treated as attachments.
  317. @item mm-text-html-renderer
  318. @vindex mm-text-html-renderer
  319. This selects the function used to render @acronym{HTML}. The predefined
  320. renderers are selected by the symbols @code{shr}, @code{gnus-w3m},
  321. @code{w3m}@footnote{See @uref{http://emacs-w3m.namazu.org/} for more
  322. information about emacs-w3m}, @code{links}, @code{lynx},
  323. @code{w3m-standalone} or @code{html2text}. If @code{nil} use an
  324. external viewer. You can also specify a function, which will be
  325. called with a @acronym{MIME} handle as the argument.
  326. @item mm-html-inhibit-images
  327. @vindex mm-html-inhibit-images
  328. @vindex mm-inline-text-html-with-images
If this is non-@code{nil}, inhibit displaying of images inline in the
article body. It affects images in @acronym{HTML} articles
rendered when @code{mm-text-html-renderer} (@pxref{Display
Customization}) is @code{shr} or @code{w3m}. In Gnus, this is
overridden by the value of @code{gnus-inhibit-images} (@pxref{Misc
Article, ,Misc Article, gnus, Gnus manual}). The default is @code{nil}.
  335. @item mm-html-blocked-images
  336. @vindex mm-html-blocked-images
  337. External images that have @acronym{URL}s that match this regexp won't
  338. be fetched and displayed. For instance, to block all @acronym{URL}s
  339. that have the string ``ads'' in them, do the following:
  340. @lisp
  341. (setq mm-html-blocked-images "ads")
  342. @end lisp
This option takes effect when @code{mm-text-html-renderer} (@pxref{Display
Customization}) is @code{shr}. In Gnus, this is overridden by the value
of @code{gnus-blocked-images} or the return value of the function that
@code{gnus-blocked-images} is set to (@pxref{HTML, ,HTML, gnus, Gnus
manual}).

Spammers sometimes embed @samp{<img>} tags in @acronym{HTML} mail,
most likely to verify whether you have read the message. You can prevent
this leak of personal information by setting this option to @code{""}
(which is the default); the empty regexp matches every @acronym{URL}.
  352. @item mm-w3m-safe-url-regexp
  353. @vindex mm-w3m-safe-url-regexp
A regular expression that matches safe URL names, i.e., URLs that are
unlikely to leak personal information when rendering @acronym{HTML}
email (the default value is @samp{\\`cid:}). If @code{nil}, consider
all URLs safe. In Gnus, this will be overridden according to the value
of the variable @code{gnus-safe-html-newsgroups}. @xref{Various
Various, ,Various Various, gnus, Gnus Manual}.
  360. @item mm-inline-text-html-with-w3m-keymap
  361. @vindex mm-inline-text-html-with-w3m-keymap
  362. You can use emacs-w3m command keys in the inlined text/html part by
  363. setting this option to non-@code{nil}. The default value is @code{t}.
  364. @item mm-external-terminal-program
  365. @vindex mm-external-terminal-program
  366. The program used to start an external terminal.
  367. @item mm-enable-external
  368. @vindex mm-enable-external
  369. Indicate whether external @acronym{MIME} handlers should be used.
  370. If @code{t}, all defined external @acronym{MIME} handlers are used. If
  371. @code{nil}, files are saved to disk (@code{mailcap-save-binary-file}).
  372. If it is the symbol @code{ask}, you are prompted before the external
  373. @acronym{MIME} handler is invoked.
  374. When you launch an attachment through mailcap (@pxref{mailcap}) an
  375. attempt is made to use a safe viewer with the safest options---this isn't
  376. the case if you save it to disk and launch it in a different way
  377. (command line or double-clicking). Anyhow, if you want to be sure not
  378. to launch any external programs, set this variable to @code{nil} or
  379. @code{ask}.
  380. @end table
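
Several of the variables in this section are typically set together in
an init file. The following fragment is only a sketch; the particular
values, and the choice of @samp{text/x-patch} as the overridden type,
are assumptions made for illustration:

@lisp
(require 'mm-decode)

(setq mm-text-html-renderer 'shr   ; render HTML with shr
      mm-html-inhibit-images t     ; do not show images in HTML parts
      mm-enable-external 'ask)     ; confirm before running a viewer

;; Treat text/x-patch parts as attachments even if `mm-inlined-types'
;; contains a regexp such as "text/.*".
(add-to-list 'mm-inline-override-types "text/x-patch")
@end lisp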
  381. @node Files and Directories
  382. @section Files and Directories
  383. @table @code
  384. @item mm-default-directory
  385. @vindex mm-default-directory
  386. The default directory for saving attachments. If @code{nil} use
  387. @code{default-directory}.
  388. @item mm-tmp-directory
  389. @vindex mm-tmp-directory
  390. Directory for storing temporary files.
  391. @item mm-file-name-rewrite-functions
  392. @vindex mm-file-name-rewrite-functions
  393. A list of functions used for rewriting file names of @acronym{MIME}
  394. parts. Each function is applied successively to the file name.
  395. Ready-made functions include
  396. @table @code
  397. @item mm-file-name-delete-control
  398. @findex mm-file-name-delete-control
  399. Delete all control characters.
  400. @item mm-file-name-delete-gotchas
  401. @findex mm-file-name-delete-gotchas
  402. Delete characters that could have unintended consequences when used
  403. with flawed shell scripts, i.e., @samp{|}, @samp{>} and @samp{<}; and
  404. @samp{-}, @samp{.} as the first character.
  405. @item mm-file-name-delete-whitespace
  406. @findex mm-file-name-delete-whitespace
  407. Remove all whitespace.
  408. @item mm-file-name-trim-whitespace
  409. @findex mm-file-name-trim-whitespace
  410. Remove leading and trailing whitespace.
  411. @item mm-file-name-collapse-whitespace
  412. @findex mm-file-name-collapse-whitespace
  413. Collapse multiple whitespace characters.
  414. @item mm-file-name-replace-whitespace
  415. @findex mm-file-name-replace-whitespace
  416. @vindex mm-file-name-replace-whitespace
  417. Replace whitespace with underscores. Set the variable
  418. @code{mm-file-name-replace-whitespace} to any other string if you do
  419. not like underscores.
  420. @end table
  421. The standard Emacs functions @code{capitalize}, @code{downcase},
  422. @code{upcase} and @code{upcase-initials} might also prove useful.
  423. @item mm-path-name-rewrite-functions
  424. @vindex mm-path-name-rewrite-functions
  425. List of functions used for rewriting the full file names of @acronym{MIME}
  426. parts. This is used when viewing parts externally, and is meant for
  427. transforming the absolute name so that non-compliant programs can find
  428. the file where it's saved.
  429. @end table
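
As a sketch of how the file name rewriting described above composes,
the following sets @code{mm-file-name-rewrite-functions} and then
applies the same functions by hand, in list order, to a made-up file
name:

@lisp
(require 'mm-decode)

(setq mm-file-name-rewrite-functions
      '(mm-file-name-trim-whitespace
        mm-file-name-collapse-whitespace
        mm-file-name-replace-whitespace))

;; Apply each function in turn to the result of the previous one.
(let ((name "  my   report.txt "))
  (dolist (fn mm-file-name-rewrite-functions name)
    (setq name (funcall fn name))))
@end lisp

Assuming @code{mm-file-name-replace-whitespace} has its default value,
the last form should evaluate to @code{"my_report.txt"}. The order
matters: trimming and collapsing first avoids leading, trailing or
repeated underscores.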
  430. @node New Viewers
  431. @section New Viewers
  432. Here's an example viewer for displaying @code{text/enriched} inline:
  433. @lisp
(defun mm-display-enriched-inline (handle)
  (let (text)
    (with-temp-buffer
      (mm-insert-part handle)
      (save-window-excursion
        (enriched-decode (point-min) (point-max))
        (setq text (buffer-string))))
    (mm-insert-inline handle text)))
  442. @end lisp
  443. We see that the function takes a @acronym{MIME} handle as its parameter. It
  444. then goes to a temporary buffer, inserts the text of the part, does some
  445. work on the text, stores the result, goes back to the buffer it was
  446. called from and inserts the result.
  447. The two important helper functions here are @code{mm-insert-part} and
  448. @code{mm-insert-inline}. The first function inserts the text of the
  449. handle in the current buffer. It handles charset and/or content
  450. transfer decoding. The second function just inserts whatever text you
  451. tell it to insert, but it also sets things up so that the text can be
  452. ``undisplayed'' in a convenient manner.
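
To have the library actually call such a viewer, one option is to
register it in @code{mm-inline-media-tests} (@pxref{Display
Customization}). The sketch below assumes that the third element of
an entry may be a function called with the handle, which is how the
stock entries in @file{mm-decode.el} appear to be written; check the
default value in your Emacs before relying on it:

@lisp
(require 'mm-decode)

;; Entries are matched in order, so a new entry at the front takes
;; precedence over the stock handler for the same type.
(add-to-list 'mm-inline-media-tests
             '("text/enriched" mm-display-enriched-inline identity))
@end lisp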
  453. @node Composing
  454. @chapter Composing
  455. @cindex Composing
  456. @cindex MIME Composing
  457. @cindex MML
  458. @cindex MIME Meta Language
  459. Creating a @acronym{MIME} message is boring and non-trivial. Therefore,
  460. a library called @code{mml} has been defined that parses a language
  461. called @acronym{MML} (@acronym{MIME} Meta Language) and generates
  462. @acronym{MIME} messages.
  463. @findex mml-generate-mime
  464. The main interface function is @code{mml-generate-mime}. It will
  465. examine the contents of the current (narrowed-to) buffer and return a
  466. string containing the @acronym{MIME} message.
  467. @menu
  468. * Simple MML Example:: An example @acronym{MML} document.
  469. * MML Definition:: All valid @acronym{MML} elements.
  470. * Advanced MML Example:: Another example @acronym{MML} document.
  471. * Encoding Customization:: Variables that affect encoding.
  472. * Charset Translation:: How charsets are mapped from @sc{mule} to @acronym{MIME}.
  473. * Conversion:: Going from @acronym{MIME} to @acronym{MML} and vice versa.
  474. * Flowed text:: Soft and hard newlines.
  475. @end menu
  476. @node Simple MML Example
  477. @section Simple MML Example
  478. Here's a simple @samp{multipart/alternative}:
  479. @example
  480. <#multipart type=alternative>
  481. This is a plain text part.
  482. <#part type=text/enriched>
  483. <center>This is a centered enriched part</center>
  484. <#/multipart>
  485. @end example
  486. After running this through @code{mml-generate-mime}, we get this:
  487. @example
Content-Type: multipart/alternative; boundary="=-=-="

--=-=-=

This is a plain text part.

--=-=-=
Content-Type: text/enriched

<center>This is a centered enriched part</center>

--=-=-=--
  495. @end example
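
One way to try the conversion above from Lisp is sketched here. It
inserts the @acronym{MML} text into a temporary buffer and calls
@code{mml-generate-mime} there; the exact boundary strings and blank
lines in the result may differ between Emacs versions:

@lisp
(require 'mml)

(with-temp-buffer
  (insert "<#multipart type=alternative>\n"
          "This is a plain text part.\n"
          "<#part type=text/enriched>\n"
          "<center>This is a centered enriched part</center>\n"
          "<#/multipart>\n")
  ;; Returns the generated MIME message as a string.
  (mml-generate-mime))
@end lisp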
  496. @node MML Definition
  497. @section MML Definition
The @acronym{MML} language is very simple. It looks a bit like an SGML
application, but it's not.

The main concept of @acronym{MML} is the @dfn{part}. Each part can be of a
different type or use a different charset. The way to delineate a part
is with a @samp{<#part ...>} tag. Multipart parts can be introduced
with the @samp{<#multipart ...>} tag. Parts are ended by the
@samp{<#/part>} or @samp{<#/multipart>} tags. Parts started with the
@samp{<#part ...>} tag are also closed by the next open tag.

There's also the @samp{<#external ...>} tag. It introduces
@samp{message/external-body} parts.

Each tag can contain zero or more parameters of the form
@samp{parameter=value}. The values may be enclosed in quotation marks,
but that's not necessary unless the value contains white space. So
@samp{filename=/home/user/#hello$^yes} is perfectly valid.

If you want to talk about @acronym{MML} in a message, you need a way to
``quote'' these tags. The way to do that is to include an exclamation
point after the opening two characters; i.e., @samp{<#!part ...>}.
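
The @code{mml} library provides a command, @code{mml-quote-region},
that applies this quoting to a region. The following sketch, with a
made-up sample sentence, runs it over a temporary buffer; the tag in
the returned string should then start with @samp{<#!part}:

@lisp
(require 'mml)

(with-temp-buffer
  (insert "Use <#part type=text/plain> to start a part.\n")
  ;; Insert `!' after the `<#' of every MML tag in the region.
  (mml-quote-region (point-min) (point-max))
  (buffer-string))
@end lisp
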
  515. The following parameters have meaning in @acronym{MML}; parameters that have no
  516. meaning are ignored. The @acronym{MML} parameter names are the same as the
  517. @acronym{MIME} parameter names; the things in the parentheses say which
  518. header it will be used in.
  519. @table @samp
  520. @item type
  521. The @acronym{MIME} type of the part (@code{Content-Type}).
  522. @item filename
  523. Use the contents of the file in the body of the part
  524. (@code{Content-Disposition}).
  525. @item recipient-filename
  526. Use this as the file name in the generated @acronym{MIME} message for
  527. the recipient. That is, even if the file is called @file{foo.txt}
  528. locally, use this name instead in the @code{Content-Disposition} in
  529. the sent message.
  530. @item charset
  531. The contents of the body of the part are to be encoded in the character
  532. set specified (@code{Content-Type}). @xref{Charset Translation}.
  533. @item name
  534. Might be used to suggest a file name if the part is to be saved
  535. to a file (@code{Content-Type}).
  536. @item disposition
  537. Valid values are @samp{inline} and @samp{attachment}
  538. (@code{Content-Disposition}).
  539. @item encoding
  540. Valid values are @samp{7bit}, @samp{8bit}, @samp{quoted-printable} and
  541. @samp{base64} (@code{Content-Transfer-Encoding}). @xref{Charset
  542. Translation}.
  543. @item description
  544. A description of the part (@code{Content-Description}).
  545. @item creation-date
  546. RFC822 date when the part was created (@code{Content-Disposition}).
  547. @item modification-date
  548. RFC822 date when the part was modified (@code{Content-Disposition}).
  549. @item read-date
  550. RFC822 date when the part was read (@code{Content-Disposition}).
  551. @item recipients
  552. Who to encrypt/sign the part to. This field is used to override any
  553. auto-detection based on the To/CC headers.
  554. @item sender
  555. Identity used to sign the part. This field is used to override the
  556. default key used.
  557. @item size
  558. The size (in octets) of the part (@code{Content-Disposition}).
@item sign
What technology to sign this @acronym{MML} part with (@code{smime}, @code{pgp}
or @code{pgpmime}).
@item encrypt
What technology to encrypt this @acronym{MML} part with (@code{smime},
@code{pgp} or @code{pgpmime}).
  565. @end table
  566. Parameters for @samp{text/plain}:
  567. @table @samp
  568. @item format
  569. Formatting parameter for the text, valid values include @samp{fixed}
  570. (the default) and @samp{flowed}. Normally you do not specify this
  571. manually, since it requires the textual body to be formatted in a
  572. special way described in RFC 2646. @xref{Flowed text}.
  573. @end table
  574. Parameters for @samp{application/octet-stream}:
  575. @table @samp
  576. @item type
  577. Type of the part; informal---meant for human readers
  578. (@code{Content-Type}).
  579. @end table
  580. Parameters for @samp{message/external-body}:
  581. @table @samp
  582. @item access-type
  583. A word indicating the supported access mechanism by which the file may
  584. be obtained. Values include @samp{ftp}, @samp{anon-ftp}, @samp{tftp},
  585. @samp{localfile}, and @samp{mailserver}. (@code{Content-Type}.)
  586. @item expiration
  587. The RFC822 date after which the file may no longer be fetched.
  588. (@code{Content-Type}.)
  589. @item size
  590. The size (in octets) of the file. (@code{Content-Type}.)
  591. @item permission
  592. Valid values are @samp{read} and @samp{read-write}
  593. (@code{Content-Type}).
  594. @end table
  595. Parameters for @samp{sign=smime}:
  596. @table @samp
  597. @item keyfile
  598. File containing key and certificate for signer.
  599. @end table
  600. Parameters for @samp{encrypt=smime}:
  601. @table @samp
  602. @item certfile
  603. File containing certificate for recipient.
  604. @end table
  605. @node Advanced MML Example
  606. @section Advanced MML Example
  607. Here's a complex multipart message. It's a @samp{multipart/mixed} that
  608. contains many parts, one of which is a @samp{multipart/alternative}.
  609. @example
  610. <#multipart type=mixed>
  611. <#part type=image/jpeg filename=~/rms.jpg disposition=inline>
  612. <#multipart type=alternative>
  613. This is a plain text part.
  614. <#part type=text/enriched name=enriched.txt>
  615. <center>This is a centered enriched part</center>
  616. <#/multipart>
  617. This is a new plain text part.
  618. <#part disposition=attachment>
  619. This plain text part is an attachment.
  620. <#/multipart>
  621. @end example
  622. And this is the resulting @acronym{MIME} message:
  623. @example
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=

--=-=-=
Content-Type: image/jpeg;
 filename="~/rms.jpg"
Content-Disposition: inline;
 filename="~/rms.jpg"
Content-Transfer-Encoding: base64

  632. /9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRof
  633. Hh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/wAALCAAwADABAREA/8QAHwAA
  634. AQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQR
  635. BRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RF
  636. RkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ip
  637. qrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/9oACAEB
  638. AAA/AO/rifFHjldNuGsrDa0qcSSHkA+gHrXKw+LtWLrMb+RgTyhbr+HSug07xNqV9fQtZrNI
  639. AyiaE/NuBPOOOP0rvRNE880KOC8TbXXGCv1FPqjrF4LDR7u5L7SkTFT/ALWOP1xXgTuXfc7E
  640. sx6nua6rwp4IvvEM8chCxWxOdzn7wz6V9AaB4S07w9p5itow0rDLSY5Pt9K43xO66P4xs71m
  641. 2QXiGCbA4yOVJ9+1aYORkdK434lyNH4ahCnG66VT9Nj15JFbPdX0MS43M4VQf5/yr2vSpLnw
  642. 5ZW8dlCZ8KFXjOPX0/mK6rSPEGt3Angu44fNEReHYNvIH3TzXDeKNO8RX+kSX2ouZkicTIOc
  643. L+g7E810ulFjpVtv3bwgB3HJyK5L4quY/C9sVxk3ij/xx6850u7t1mtp/wDlpEw3An3Jr3Dw
  644. 34gsbWza4nBlhC5LDsaW6+IFgupQyCF3iHH7gA7c9R9ay7zx6t7aX9jHC4smhfBkGCvHGfrm
  645. tLQ7hbnRrV1GPkAP1x1/Hr+Ncr8Vzjwrbf8AX6v/AKA9eQRyYlQk8Yx9K6XTNbkgia2ciSIn
  646. 7p5Ga9Atte0LTLKO6it4i7dVRFJDcZ4PvXN+JvEMF9bILVGXJLSZ4zkjivRPDaeX4b08HOTC
  647. pOffmua+KkbS+GLVUGT9tT/0B68eeIpIFYjB70+OOVXyoOM9+M1eaWeCLzHPyHGO/NVWvJJm
  648. jQ8KGH1NfQWhXSXmh2c8eArRLwO3HSv/2Q==

--=-=-=
Content-Type: multipart/alternative; boundary="==-=-="

--==-=-=

This is a plain text part.

--==-=-=
Content-Type: text/enriched;
 name="enriched.txt"

<center>This is a centered enriched part</center>

--==-=-=--

--=-=-=

This is a new plain text part.

--=-=-=
Content-Disposition: attachment

This plain text part is an attachment.

--=-=-=--
  664. @end example
  665. @node Encoding Customization
  666. @section Encoding Customization
  667. @table @code
  668. @item mm-body-charset-encoding-alist
  669. @vindex mm-body-charset-encoding-alist
  670. Mapping from @acronym{MIME} charset to encoding to use. This variable is
  671. usually used except, e.g., when other requirements force a specific
  672. encoding (digitally signed messages require 7bit encodings). The
  673. default is
  674. @lisp
((iso-2022-jp . 7bit)
 (iso-2022-jp-2 . 7bit)
 (utf-16 . base64)
 (utf-16be . base64)
 (utf-16le . base64))
  680. @end lisp
  681. As an example, if you do not want to have ISO-8859-1 characters
  682. quoted-printable encoded, you may add @code{(iso-8859-1 . 8bit)} to
  683. this variable. You can override this setting on a per-message basis
  684. by using the @code{encoding} @acronym{MML} tag (@pxref{MML Definition}).
  685. @item mm-coding-system-priorities
  686. @vindex mm-coding-system-priorities
  687. Prioritize coding systems to use for outgoing messages. The default
  688. is @code{nil}, which means to use the defaults in Emacs, but is
  689. @code{(iso-8859-1 iso-2022-jp utf-8)} when running Emacs in the Japanese
  690. language environment. It is a list of coding system symbols (aliases of
  691. coding systems are also allowed; use @kbd{M-x describe-coding-system} to
  692. make sure you are specifying correct coding system names). For example,
  693. if you have configured Emacs to prefer UTF-8, but want outgoing
  694. messages to be sent in ISO-8859-1 if possible, you can set this
  695. variable to @code{(iso-8859-1)}. You can override this setting on a
  696. per-message basis by using the @code{charset} @acronym{MML} tag
  697. (@pxref{MML Definition}).
  698. As different hierarchies prefer different charsets, you may want to set
  699. @code{mm-coding-system-priorities} according to the hierarchy in Gnus.
  700. Here's an example:
  701. @c Corrections about preferred charsets are welcome. de, fr and fj
  702. @c should be correct, I don't know about the rest (so these are only
  703. @c examples):
  704. @lisp
  705. (add-to-list 'gnus-newsgroup-variables 'mm-coding-system-priorities)
  706. (setq gnus-parameters
  707.       (nconc
  708.        ;; Some charsets are just examples!
  709.        '(("^cn\\." ;; Chinese
  710.           (mm-coding-system-priorities
  711.            '(iso-8859-1 cn-big5 chinese-iso-7bit utf-8)))
  712.          ("^cz\\.\\|^pl\\." ;; Central and Eastern European
  713.           (mm-coding-system-priorities '(iso-8859-2 utf-8)))
  714.          ("^de\\." ;; German language
  715.           (mm-coding-system-priorities '(iso-8859-1 iso-8859-15 utf-8)))
  716.          ("^fr\\." ;; French
  717.           (mm-coding-system-priorities '(iso-8859-15 iso-8859-1 utf-8)))
  718.          ("^fj\\." ;; Japanese
  719.           (mm-coding-system-priorities
  720.            '(iso-8859-1 iso-2022-jp utf-8)))
  721.          ("^ru\\." ;; Cyrillic
  722.           (mm-coding-system-priorities
  723.            '(koi8-r iso-8859-5 iso-8859-1 utf-8))))
  724.        gnus-parameters))
  725. @end lisp
  726. @item mm-content-transfer-encoding-defaults
  727. @vindex mm-content-transfer-encoding-defaults
  728. Mapping from @acronym{MIME} types to encoding to use. This mapping is normally
  729. honored, except when other requirements force a safer encoding
  730. (e.g., digitally signed messages require 7bit encoding). Besides the normal
  731. @acronym{MIME} encodings, @code{qp-or-base64} may be used to indicate that for
  732. each case the most efficient of quoted-printable and base64 should be
  733. used.
  734. @code{qp-or-base64} has another effect: it folds long lines so that an
  735. MTA does not break @acronym{MIME} parts. So do @code{quoted-printable} and
  736. @code{base64}.
  737. Note that it affects body encoding only when a part is a raw forwarded
  738. message (which will be made by @code{gnus-summary-mail-forward} with the
  739. arg 2 for example) or is neither the @samp{text/*} type nor the
  740. @samp{message/*} type. Even in those cases, you can override
  741. this setting on a per-message basis by using the @code{encoding}
  742. @acronym{MML} tag (@pxref{MML Definition}).
  743. @item mm-use-ultra-safe-encoding
  744. @vindex mm-use-ultra-safe-encoding
  745. When this is non-@code{nil}, it means that textual parts are encoded as
  746. quoted-printable if they contain lines longer than 76 characters or
  747. starting with "From " in the body. Non-7bit encodings (8bit, binary)
  748. are generally disallowed. This reduces the probability that a
  749. non-8bit-clean MTA or MDA changes the message. This should never be set
  750. directly, but bound by other functions when necessary (e.g., when
  751. encoding messages that are to be digitally signed).
  752. @end table
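As a minimal sketch of the first two settings above, the following
init-file fragment sends ISO-8859-1 bodies as 8bit and prefers
ISO-8859-1 for outgoing messages. It assumes that
@code{mm-body-charset-encoding-alist} is defined in @file{mm-bodies.el},
which is why the change is deferred with @code{with-eval-after-load};
the charsets are only examples.
@lisp
;; Assumption: `mm-body-charset-encoding-alist' lives in mm-bodies.el,
;; so modify it only after that library has been loaded.
(with-eval-after-load 'mm-bodies
  (add-to-list 'mm-body-charset-encoding-alist '(iso-8859-1 . 8bit)))
;; Try ISO-8859-1 first when choosing a charset for outgoing messages.
(setq mm-coding-system-priorities '(iso-8859-1))
@end lisp
Both settings can still be overridden per message with the
@code{encoding} and @code{charset} @acronym{MML} tags.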
  753. @node Charset Translation
  754. @section Charset Translation
  755. @cindex charsets
  756. During translation from @acronym{MML} to @acronym{MIME}, for each
  757. @acronym{MIME} part which has been composed inside Emacs, an appropriate
  758. charset has to be chosen.
  759. @vindex mail-parse-charset
  760. If you are running a non-@sc{mule} Emacs, this process is simple: If the
  761. part contains any non-@acronym{ASCII} (8-bit) characters, the @acronym{MIME} charset
  762. given by @code{mail-parse-charset} (a symbol) is used. (Never set this
  763. variable directly, though. If you want to change the default charset,
  764. please consult the documentation of the package which you use to process
  765. @acronym{MIME} messages.
  766. @xref{Various Message Variables, , Various Message Variables, message,
  767. Message Manual}, for example.)
  768. If there are only @acronym{ASCII} characters, the @acronym{MIME} charset US-ASCII is
  769. used, of course.
  770. @cindex MULE
  771. @cindex UTF-8
  772. @cindex Unicode
  773. @vindex mm-mime-mule-charset-alist
  774. Things are slightly more complicated when running Emacs with @sc{mule}
  775. support. In this case, a list of the @sc{mule} charsets used in the
  776. part is obtained, and the @sc{mule} charsets are translated to
  777. @acronym{MIME} charsets by consulting the table provided by Emacs itself
  778. or the variable @code{mm-mime-mule-charset-alist} for XEmacs.
  779. If this results in a single @acronym{MIME} charset, this is used to encode
  780. the part. But if the resulting list of @acronym{MIME} charsets contains more
  781. than one element, two things can happen: If it is possible to encode the
  782. part via UTF-8, this charset is used. (For this, Emacs must support
  783. the @code{utf-8} coding system, and the part must consist entirely of
  784. characters which have Unicode counterparts.) If UTF-8 is not available
  785. for some reason, the part is split into several ones, so that each one
  786. can be encoded with a single @acronym{MIME} charset. The part can only be
  787. split at line boundaries, though---if more than one @acronym{MIME} charset is
  788. required to encode a single line, it is not possible to encode the part.
  789. When running Emacs with @sc{mule} support, the preferences for which
  790. coding system to use are inherited from Emacs itself. This means that
  791. if Emacs is set up to prefer UTF-8, it will be used when encoding
  792. messages. You can modify this by altering the
  793. @code{mm-coding-system-priorities} variable though (@pxref{Encoding
  794. Customization}).
  795. The charset to be used can be overridden by setting the @code{charset}
  796. @acronym{MML} tag (@pxref{MML Definition}) when composing the message.
  797. The encoding of characters (quoted-printable, 8bit, etc.)@: is orthogonal
  798. to the discussion here, and is controlled by the variables
  799. @code{mm-body-charset-encoding-alist} and
  800. @code{mm-content-transfer-encoding-defaults} (@pxref{Encoding
  801. Customization}).
  802. @node Conversion
  803. @section Conversion
  804. @findex mime-to-mml
  805. A (multipart) @acronym{MIME} message can be converted to @acronym{MML}
  806. with the @code{mime-to-mml} function. It works on the message in the
  807. current buffer, and substitutes @acronym{MML} markup for @acronym{MIME}
  808. boundaries. Non-textual parts do not have their contents in the buffer,
  809. but instead have the contents in separate buffers that are referred to
  810. from the @acronym{MML} tags.
  811. @findex mml-to-mime
  812. An @acronym{MML} message can be converted back to @acronym{MIME} by the
  813. @code{mml-to-mime} function.
  814. These functions are in certain senses ``lossy''---you will not get back
  815. an identical message if you run @code{mime-to-mml} and then
  816. @code{mml-to-mime}. Not only will trivial things like the order of the
  817. headers differ, but the contents of the headers may also be different.
  818. For instance, the original message may use base64 encoding on text,
  819. while @code{mml-to-mime} may decide to use quoted-printable encoding, and
  820. so on.
  821. In essence, however, these two functions should be the inverse of each
  822. other. The resulting contents of the message should remain equivalent,
  823. if not identical.
  824. @node Flowed text
  825. @section Flowed text
  826. @cindex format=flowed
  827. The Emacs @acronym{MIME} library will respect the @code{use-hard-newlines}
  828. variable (@pxref{Hard and Soft Newlines, ,Hard and Soft Newlines,
  829. emacs, Emacs Manual}) when encoding a message, and the
  830. ``format=flowed'' Content-Type parameter when decoding a message.
  831. On encoding text, regardless of @code{use-hard-newlines}, lines
  832. terminated by soft newline characters are filled together and wrapped
  833. after the column decided by @code{fill-flowed-encode-column}.
  834. Quotation marks (matching @samp{^>* ?}) are respected. The variable
  835. controls how the text will look in a client that does not support
  836. flowed text; the default is to wrap after 66 characters. If hard
  837. newline characters are not present in the buffer, no flow encoding
  838. occurs.
  839. You can customize the value of the @code{mml-enable-flowed} variable
  840. to enable or disable the flowed encoding usage when hard newline
  841. characters are present in the buffer.
  842. On decoding flowed text, lines with soft newline characters are filled
  843. together and wrapped after the column decided by
  844. @code{fill-flowed-display-column}. The default is to wrap after
  845. @code{fill-column}.
  846. @table @code
  847. @item mm-fill-flowed
  848. @vindex mm-fill-flowed
  849. If non-@code{nil}, a format=flowed article will be displayed flowed.
  850. @end table
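The variables mentioned in this section can be set together in an init
file. The fragment below is only a sketch; the values are illustrative
(66 is the wrap column documented above), not recommendations.
@lisp
;; Allow format=flowed encoding when the buffer has hard newlines.
(setq mml-enable-flowed t)
;; Column after which encoded flowed text is wrapped.
(setq fill-flowed-encode-column 66)
;; Display format=flowed articles refilled.
(setq mm-fill-flowed t)
@end lisp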
  851. @node Interface Functions
  852. @chapter Interface Functions
  853. @cindex interface functions
  854. @cindex mail-parse
  855. The @code{mail-parse} library is an abstraction over the actual
  856. low-level libraries that are described in the next chapter.
  857. Standards change, and so programs have to change to fit in the new
  858. mold. For instance, RFC2045 describes a syntax for the
  859. @code{Content-Type} header that only allows @acronym{ASCII} characters in the
  860. parameter list. RFC2231 expands on RFC2045 syntax to provide a scheme
  861. for continuation headers and non-@acronym{ASCII} characters.
  862. The traditional way to deal with this is just to update the library
  863. functions to parse the new syntax. However, this is sometimes the wrong
  864. thing to do. In some instances it may be vital to be able to understand
  865. both the old syntax as well as the new syntax, and if there is only one
  866. library, one must choose between the old version of the library and the
  867. new version of the library.
  868. The Emacs @acronym{MIME} library takes a different tack. It defines a
  869. series of low-level libraries (@file{rfc2047.el}, @file{rfc2231.el}
  870. and so on) that parse strictly according to the corresponding
  871. standard. However, normal programs would not use the functions
  872. provided by these libraries directly, but instead use the functions
  873. provided by the @code{mail-parse} library. The functions in this
  874. library are just aliases to the corresponding functions in the latest
  875. low-level libraries. Using this scheme, programs get a consistent
  876. interface they can use, and library developers are free to
  877. write code that handles new standards.
  878. The following functions are defined by this library:
  879. @table @code
  880. @item mail-header-parse-content-type
  881. @findex mail-header-parse-content-type
  882. Parse a @code{Content-Type} header and return a list in the following
  883. format:
  884. @lisp
  885. ("type/subtype"
  886. (attribute1 . value1)
  887. (attribute2 . value2)
  888. ...)
  889. @end lisp
  890. Here's an example:
  891. @example
  892. (mail-header-parse-content-type
  893. "image/gif; name=\"b980912.gif\"")
  894. @result{} ("image/gif" (name . "b980912.gif"))
  895. @end example
  896. @item mail-header-parse-content-disposition
  897. @findex mail-header-parse-content-disposition
  898. Parse a @code{Content-Disposition} header and return a list in the same
  899. format as the function above.
  900. @item mail-content-type-get
  901. @findex mail-content-type-get
  902. Takes two parameters---a list in the format above, and an attribute.
  903. Returns the value of the attribute.
  904. @example
  905. (mail-content-type-get
  906. '("image/gif" (name . "b980912.gif")) 'name)
  907. @result{} "b980912.gif"
  908. @end example
  909. @item mail-header-encode-parameter
  910. @findex mail-header-encode-parameter
  911. Takes a parameter string and returns an encoded version of the string.
  912. This is used for parameters in headers like @code{Content-Type} and
  913. @code{Content-Disposition}.
  914. @item mail-header-remove-comments
  915. @findex mail-header-remove-comments
  916. Return a comment-free version of a header.
  917. @example
  918. (mail-header-remove-comments
  919. "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
  920. @result{} "Gnus/5.070027 "
  921. @end example
  922. @item mail-header-remove-whitespace
  923. @findex mail-header-remove-whitespace
  924. Remove linear white space from a header. Space inside quoted strings
  925. and comments is preserved.
  926. @example
  927. (mail-header-remove-whitespace
  928. "image/gif; name=\"Name with spaces\"")
  929. @result{} "image/gif;name=\"Name with spaces\""
  930. @end example
  931. @item mail-header-get-comment
  932. @findex mail-header-get-comment
  933. Return the last comment in a header.
  934. @example
  935. (mail-header-get-comment
  936. "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
  937. @result{} "Finnish Landrace"
  938. @end example
  939. @item mail-header-parse-address
  940. @findex mail-header-parse-address
  941. Parse an address and return a list containing the mailbox and the
  942. plaintext name.
  943. @example
  944. (mail-header-parse-address
  945. "Hrvoje Niksic <hniksic@@srce.hr>")
  946. @result{} ("hniksic@@srce.hr" . "Hrvoje Niksic")
  947. @end example
  948. @item mail-header-parse-addresses
  949. @findex mail-header-parse-addresses
  950. Parse a string with a list of addresses and return a list of elements like
  951. the one described above.
  952. @example
  953. (mail-header-parse-addresses
  954. "Hrvoje Niksic <hniksic@@srce.hr>, Steinar Bang <sb@@metis.no>")
  955. @result{} (("hniksic@@srce.hr" . "Hrvoje Niksic")
  956. ("sb@@metis.no" . "Steinar Bang"))
  957. @end example
  958. @item mail-header-parse-date
  959. @findex mail-header-parse-date
  960. Parse a date string and return an Emacs time structure.
  961. @item mail-narrow-to-head
  962. @findex mail-narrow-to-head
  963. Narrow the buffer to the header section of the buffer. Point is placed
  964. at the beginning of the narrowed buffer.
  965. @item mail-header-narrow-to-field
  966. @findex mail-header-narrow-to-field
  967. Narrow the buffer to the header under point. Understands continuation
  968. headers.
  969. @item mail-header-fold-field
  970. @findex mail-header-fold-field
  971. Fold the header under point.
  972. @item mail-header-unfold-field
  973. @findex mail-header-unfold-field
  974. Unfold the header under point.
  975. @item mail-header-field-value
  976. @findex mail-header-field-value
  977. Return the value of the field under point.
  978. @item mail-encode-encoded-word-region
  979. @findex mail-encode-encoded-word-region
  980. Encode the non-@acronym{ASCII} words in the region. For instance,
  981. @samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}.
  982. @item mail-encode-encoded-word-buffer
  983. @findex mail-encode-encoded-word-buffer
  984. Encode the non-@acronym{ASCII} words in the current buffer. This function is
  985. meant to be called narrowed to the headers of a message.
  986. @item mail-encode-encoded-word-string
  987. @findex mail-encode-encoded-word-string
  988. Encode the words that need encoding in a string, and return the result.
  989. @example
  990. (mail-encode-encoded-word-string
  991. "This is naïve, baby")
  992. @result{} "This is =?iso-8859-1?q?na=EFve,?= baby"
  993. @end example
  994. @item mail-decode-encoded-word-region
  995. @findex mail-decode-encoded-word-region
  996. Decode the encoded words in the region.
  997. @item mail-decode-encoded-word-string
  998. @findex mail-decode-encoded-word-string
  999. Decode the encoded words in the string and return the result.
  1000. @example
  1001. (mail-decode-encoded-word-string
  1002. "This is =?iso-8859-1?q?na=EFve,?= baby")
  1003. @result{} "This is naïve, baby"
  1004. @end example
  1005. @end table
  1006. Currently, @code{mail-parse} is an abstraction over @code{ietf-drums},
  1007. @code{rfc2047}, @code{rfc2045} and @code{rfc2231}. These are documented
  1008. in the subsequent sections.
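The aliasing scheme can be observed directly. The sketch below assumes
that @file{mail-parse.el} defines the interface names with
@code{defalias}, as described above, and reuses the
@code{Content-Type} header from the earlier examples.
@example
(require 'mail-parse)
;; An interface name is an alias for a low-level function.
(symbol-function 'mail-header-remove-comments)
@result{} ietf-drums-remove-comments
;; Parse a header, then look up one of its attributes.
(let ((ct (mail-header-parse-content-type
           "image/gif; name=\"b980912.gif\"")))
  (mail-content-type-get ct 'name))
@result{} "b980912.gif"
@end example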
  1009. @node Basic Functions
  1010. @chapter Basic Functions
  1011. This chapter describes the basic, ground-level functions for parsing and
  1012. handling. Covered here are parsing @code{From} lines, removing comments
  1013. from header lines, decoding encoded words, parsing date headers and so
  1014. on. High-level functionality is dealt with in the first chapter
  1015. (@pxref{Decoding and Viewing}).
  1016. @menu
  1017. * rfc2045:: Encoding @code{Content-Type} headers.
  1018. * rfc2231:: Parsing @code{Content-Type} headers.
  1019. * ietf-drums:: Handling mail headers defined by RFC822bis.
  1020. * rfc2047:: En/decoding encoded words in headers.
  1021. * time-date:: Functions for parsing dates and manipulating time.
  1022. * qp:: Quoted-Printable en/decoding.
  1023. * base64:: Base64 en/decoding.
  1024. * binhex:: Binhex decoding.
  1025. * uudecode:: Uuencode decoding.
  1026. * yenc:: Yenc decoding.
  1027. * rfc1843:: Decoding HZ-encoded text.
  1028. * mailcap:: How parts are displayed is specified by the @file{.mailcap} file
  1029. @end menu
  1030. @node rfc2045
  1031. @section rfc2045
  1032. RFC2045 is the ``main'' @acronym{MIME} document, and as such, one would
  1033. imagine that there would be a lot to implement. But there isn't, since
  1034. most of the implementation details are delegated to the subsequent
  1035. RFCs.
  1036. So @file{rfc2045.el} has only a single function:
  1037. @table @code
  1038. @item rfc2045-encode-string
  1039. @findex rfc2045-encode-string
  1040. Takes a parameter and a value and returns a @samp{PARAM=VALUE} string.
  1041. @var{value} will be quoted if there are non-safe characters in it.
  1042. @end table
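As an illustration of the quoting rule, here is a small sketch; the file
names are made up, and the second value is quoted because it contains a
space.
@example
(require 'rfc2045)
(rfc2045-encode-string "name" "report.txt")
@result{} "name=report.txt"
(rfc2045-encode-string "name" "my report.txt")
@result{} "name=\"my report.txt\""
@end example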
  1043. @node rfc2231
  1044. @section rfc2231
  1045. RFC2231 defines a syntax for the @code{Content-Type} and
  1046. @code{Content-Disposition} headers. Its snappy name is @dfn{MIME
  1047. Parameter Value and Encoded Word Extensions: Character Sets, Languages,
  1048. and Continuations}.
  1049. In short, these headers look something like this:
  1050. @example
  1051. Content-Type: application/x-stuff;
  1052. title*0*=us-ascii'en'This%20is%20even%20more%20;
  1053. title*1*=%2A%2A%2Afun%2A%2A%2A%20;
  1054. title*2="isn't it!"
  1055. @end example
  1056. They usually aren't this bad, though.
  1057. The following functions are defined by this library:
  1058. @table @code
  1059. @item rfc2231-parse-string
  1060. @findex rfc2231-parse-string
  1061. Parse a @code{Content-Type} header and return a list describing its
  1062. elements.
  1063. @example
  1064. (rfc2231-parse-string
  1065. "application/x-stuff;
  1066. title*0*=us-ascii'en'This%20is%20even%20more%20;
  1067. title*1*=%2A%2A%2Afun%2A%2A%2A%20;
  1068. title*2=\"isn't it!\"")
  1069. @result{} ("application/x-stuff"
  1070. (title . "This is even more ***fun*** isn't it!"))
  1071. @end example
  1072. @item rfc2231-get-value
  1073. @findex rfc2231-get-value
  1074. Takes one of the lists in the format above and returns
  1075. the value of the specified attribute.
  1076. @item rfc2231-encode-string
  1077. @findex rfc2231-encode-string
  1078. Encode a parameter in headers like @code{Content-Type} and
  1079. @code{Content-Disposition}.
  1080. @end table
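The parsing and lookup functions are typically used together. The
following sketch parses a simple header value (chosen here purely for
illustration) and extracts two of its attributes.
@example
(require 'rfc2231)
(let ((ct (rfc2231-parse-string
           "text/plain; charset=utf-8; format=flowed")))
  (list (car ct)
        (rfc2231-get-value ct 'charset)
        (rfc2231-get-value ct 'format)))
@result{} ("text/plain" "utf-8" "flowed")
@end example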
  1081. @node ietf-drums
  1082. @section ietf-drums
  1083. @dfn{drums} is an IETF working group that is working on the replacement
  1084. for RFC822.
  1085. The functions provided by this library include:
  1086. @table @code
  1087. @item ietf-drums-remove-comments
  1088. @findex ietf-drums-remove-comments
  1089. Remove the comments from the argument and return the results.
  1090. @item ietf-drums-remove-whitespace
  1091. @findex ietf-drums-remove-whitespace
  1092. Remove linear white space from the string and return the results.
  1093. Spaces inside quoted strings and comments are left untouched.
  1094. @item ietf-drums-get-comment
  1095. @findex ietf-drums-get-comment
  1096. Return the last comment from the string.
  1097. @item ietf-drums-parse-address
  1098. @findex ietf-drums-parse-address
  1099. Parse an address string and return a list that contains the mailbox and
  1100. the plain text name.
  1101. @item ietf-drums-parse-addresses
  1102. @findex ietf-drums-parse-addresses
  1103. Parse a string that contains any number of comma-separated addresses and
  1104. return a list that contains mailbox/plain text pairs.
  1105. @item ietf-drums-parse-date
  1106. @findex ietf-drums-parse-date
  1107. Parse a date string and return an Emacs time structure.
  1108. @item ietf-drums-narrow-to-header
  1109. @findex ietf-drums-narrow-to-header
  1110. Narrow the buffer to the header section of the current buffer.
  1111. @end table
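Since the @code{mail-parse} functions are aliases for the functions in
this library, the examples given for the interface functions apply here
as well. A short sketch, reusing the same inputs:
@example
(require 'ietf-drums)
(ietf-drums-remove-comments
 "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
@result{} "Gnus/5.070027 "
(ietf-drums-get-comment
 "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
@result{} "Finnish Landrace"
(ietf-drums-parse-address
 "Hrvoje Niksic <hniksic@@srce.hr>")
@result{} ("hniksic@@srce.hr" . "Hrvoje Niksic")
@end example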
  1112. @node rfc2047
  1113. @section rfc2047
  1114. RFC2047 (Message Header Extensions for Non-@acronym{ASCII} Text) specifies how
  1115. non-@acronym{ASCII} text in headers is to be encoded. This is actually rather
  1116. complicated, so a number of variables are necessary to tweak what this
  1117. library does.
  1118. The following variables are tweakable:
  1119. @table @code
  1120. @item rfc2047-header-encoding-alist
  1121. @vindex rfc2047-header-encoding-alist
  1122. This is an alist of header / encoding-type pairs. Its main purpose is
  1123. to prevent encoding of certain headers.
  1124. The keys can either be header regexps, or @code{t}.
  1125. The values can be @code{nil}, in which case the header(s) in question
  1126. won't be encoded, @code{mime}, which means that they will be encoded, or
  1127. @code{address-mime}, which means the header(s) will be encoded carefully
  1128. assuming they contain addresses.
  1129. @item rfc2047-charset-encoding-alist
  1130. @vindex rfc2047-charset-encoding-alist
  1131. RFC2047 specifies two forms of encoding---@code{Q} (a
  1132. Quoted-Printable-like encoding) and @code{B} (base64). This alist
  1133. specifies which charset should use which encoding.
  1134. @item rfc2047-encode-function-alist
  1135. @vindex rfc2047-encode-function-alist
  1136. This is an alist of encoding / function pairs. The encodings are
  1137. @code{Q}, @code{B} and @code{nil}.
  1138. @item rfc2047-encoded-word-regexp
  1139. @vindex rfc2047-encoded-word-regexp
  1140. When decoding words, this library looks for matches to this regexp.
  1141. @item rfc2047-encoded-word-regexp-loose
  1142. @vindex rfc2047-encoded-word-regexp-loose
  1143. This is a variant of @code{rfc2047-encoded-word-regexp} in which the
  1144. pattern for the Q encoding is loosened.
  1145. @item rfc2047-encode-encoded-words
  1146. @vindex rfc2047-encode-encoded-words
  1147. This boolean variable specifies whether encoded words
  1148. (e.g., @samp{=?us-ascii?q?hello?=}) should be encoded again.
  1149. @code{rfc2047-encoded-word-regexp} is used to look for such words.
  1150. @item rfc2047-allow-irregular-q-encoded-words
  1151. @vindex rfc2047-allow-irregular-q-encoded-words
  1152. This boolean variable specifies whether irregular Q encoded words
  1153. (e.g., @samp{=?us-ascii?q?hello??=}) should be decoded. If it is
  1154. non-@code{nil}, @code{rfc2047-encoded-word-regexp-loose} is used instead
  1155. of @code{rfc2047-encoded-word-regexp} to look for encoded words.
  1156. @end table
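For example, to keep one header from being encoded, an entry with a
@code{nil} value can be added to @code{rfc2047-header-encoding-alist}.
This is a sketch only: the header name @samp{X-Debug-Info} is
hypothetical, and the entry is put at the front of the list on the
assumption that entries are tried in order, so that it is seen before
any catch-all @code{t} key.
@lisp
(with-eval-after-load 'rfc2047
  ;; "X-Debug-Info" is a made-up header name used for illustration.
  (add-to-list 'rfc2047-header-encoding-alist
               '("X-Debug-Info" . nil)))
@end lisp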
  1157. Those were the variables, and these are the functions:
  1158. @table @code
  1159. @item rfc2047-narrow-to-field
  1160. @findex rfc2047-narrow-to-field
  1161. Narrow the buffer to the header on the current line.
  1162. @item rfc2047-encode-message-header
  1163. @findex rfc2047-encode-message-header
  1164. Should be called narrowed to the header of a message. Encodes according
  1165. to @code{rfc2047-header-encoding-alist}.
  1166. @item rfc2047-encode-region
  1167. @findex rfc2047-encode-region
  1168. Encodes all encodable words in the region specified.
  1169. @item rfc2047-encode-string
  1170. @findex rfc2047-encode-string
  1171. Encode a string and return the results.
  1172. @item rfc2047-decode-region
  1173. @findex rfc2047-decode-region
  1174. Decode the encoded words in the region.
  1175. @item rfc2047-decode-string
  1176. @findex rfc2047-decode-string
  1177. Decode a string and return the results.
  1178. @item rfc2047-encode-parameter
  1179. @findex rfc2047-encode-parameter
  1180. Encode a parameter in the RFC2047-like style. This is a substitute
  1181. for the @code{rfc2231-encode-string} function, which implements the
  1182. standard but is not supported by many mailers. @xref{rfc2231}.
  1183. @end table
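A brief sketch of decoding, using @acronym{ASCII}-only encoded words so
that the result does not depend on the charset setup. In the @code{Q}
encoding an underscore stands for a space; the last word uses the
@code{B} (base64) encoding.
@example
(require 'rfc2047)
(rfc2047-decode-string "=?us-ascii?q?hello?=")
@result{} "hello"
(rfc2047-decode-string "=?us-ascii?q?hello_world?=")
@result{} "hello world"
(rfc2047-decode-string "=?us-ascii?b?aGVsbG8=?=")
@result{} "hello"
@end example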
  1184. @node time-date
  1185. @section time-date
  1186. While not really a part of the @acronym{MIME} library, it is convenient to
  1187. document this library here. It deals with parsing @code{Date} headers
  1188. and manipulating time. (Not by using tesseracts, though, I'm sorry to
  1189. say.)
  1190. These functions convert between five formats: A date string, an Emacs
  1191. time structure, a decoded time list, a number of seconds, and a day number.
  1192. Here's a bunch of time/date/second/day examples:
  1193. @example
  1194. (parse-time-string "Sat Sep 12 12:21:54 1998 +0200")
  1195. @result{} (54 21 12 12 9 1998 6 nil 7200)
  1196. (date-to-time "Sat Sep 12 12:21:54 1998 +0200")
  1197. @result{} (13818 19266)
  1198. (parse-iso8601-time-string "1998-09-12T12:21:54+0200")
  1199. @result{} (13818 19266)
  1200. (float-time '(13818 19266))
  1201. @result{} 905595714.0
  1202. (seconds-to-time 905595714.0)
  1203. @result{} (13818 19266 0 0)
  1204. (time-to-days '(13818 19266))
  1205. @result{} 729644
  1206. (days-to-time 729644)
  1207. @result{} (961933 512)
  1208. (time-since '(13818 19266))
  1209. @result{} (6797 9607 984839 247000)
  1210. (time-less-p '(13818 19266) '(13818 19145))
  1211. @result{} nil
  1212. (time-subtract '(13818 19266) '(13818 19145))
  1213. @result{} (0 121)
  1214. (days-between "Sat Sep 12 12:21:54 1998 +0200"
  1215. "Sat Sep 07 12:21:54 1998 +0200")
  1216. @result{} 5
  1217. (date-leap-year-p 2000)
  1218. @result{} t
  1219. (time-to-day-in-year '(13818 19266))
  1220. @result{} 255
  1221. (time-to-number-of-days
  1222. (time-since
  1223. (date-to-time "Mon, 01 Jan 2001 02:22:26 GMT")))
  1224. @result{} 4314.095589286675
  1225. @end example
  1226. And finally, we have @code{safe-date-to-time}, which does the same as
  1227. @code{date-to-time}, but returns a zero time if the date is
  1228. syntactically malformed.
  1229. The five data representations used are the following:
  1230. @table @var
  1231. @item date
  1232. An RFC822 (or similar) date string. For instance: @code{"Sat Sep 12
  1233. 12:21:54 1998 +0200"}.
  1234. @item time
  1235. An internal Emacs time. For instance: @code{(13818 19266 0 0)}.
  1236. @item seconds
  1237. A floating point representation of the internal Emacs time. For
  1238. instance: @code{905595714.0}.
  1239. @item days
  1240. An integer number representing the number of days since 00000101. For
  1241. instance: @code{729644}.
  1242. @item decoded time
  1243. A list of decoded time. For instance: @code{(54 21 12 12 9 1998 6 t
  1244. 7200)}.
  1245. @end table
  1246. All the examples above represent the same moment.
  1247. These are the functions available:
  1248. @table @code
  1249. @item date-to-time
  1250. Take a date and return a time.
  1251. @item float-time
  1252. Take a time and return seconds. (This is a built-in function.)
  1253. @item seconds-to-time
  1254. Take seconds and return a time.
  1255. @item time-to-days
  1256. Take a time and return days.
  1257. @item days-to-time
  1258. Take days and return a time.
  1259. @item date-to-day
  1260. Take a date and return days.
  1261. @item time-to-number-of-days
  1262. Take a time and return the number of days that it represents.
  1263. @item safe-date-to-time
  1264. Take a date and return a time. If the date is not syntactically valid,
  1265. return a ``zero'' time.
  1266. @item time-less-p
  1267. Take two times and say whether the first time is less (i.e., earlier)
  1268. than the second time. (This is a built-in function.)
  1269. @item time-since
  1270. Take a time and return a time saying how long it was since that time.
  1271. @item time-subtract
  1272. Take two times and subtract the second from the first. I.e., return
  1273. the time between the two times. (This is a built-in function.)
  1274. @item days-between
  1275. Take two dates and return the number of days between those two dates.
  1276. @item date-leap-year-p
  1277. Take a year number and say whether it's a leap year.
  1278. @item time-to-day-in-year
  1279. Take a time and return the day number within the year that the time is
  1280. in.
  1281. @end table
  1282. @node qp
  1283. @section qp
  1284. This library deals with decoding and encoding Quoted-Printable text.
  1285. Very briefly explained, qp encoding means translating all 8-bit
  1286. characters (and lots of control characters) into things that look like
  1287. @samp{=EF}; that is, an equal sign followed by the byte encoded as a hex
  1288. string.
  1289. The following functions are defined by the library:
  1290. @table @code
  1291. @item quoted-printable-decode-region
  1292. @findex quoted-printable-decode-region
  1293. QP-decode all the encoded text in the specified region.
  1294. @item quoted-printable-decode-string
  1295. @findex quoted-printable-decode-string
  1296. Decode the QP-encoded text in a string and return the results.
  1297. @item quoted-printable-encode-region
  1298. @findex quoted-printable-encode-region
  1299. QP-encode all the encodable characters in the specified region. The third
  1300. optional parameter @var{fold} specifies whether to fold long lines.
  1301. (Long here means 72.)
  1302. @item quoted-printable-encode-string
  1303. @findex quoted-printable-encode-string
  1304. QP-encode all the encodable characters in a string and return the
  1305. results.
  1306. @end table
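As a small illustration, the equal sign itself must be encoded, since it
introduces every escape sequence. The sketch below encodes and then
decodes a short @acronym{ASCII} string.
@example
(require 'qp)
(quoted-printable-encode-string "a=b")
@result{} "a=3Db"
(quoted-printable-decode-string "a=3Db")
@result{} "a=b"
@end example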
  1307. @node base64
  1308. @section base64
  1309. @cindex base64
  1310. Base64 is an encoding that encodes three bytes into four characters,
  1311. thereby increasing the size by about 33%. The alphabet used for
  1312. encoding is very resistant to mangling during transit.
  1313. The following functions are defined by this library:
  1314. @table @code
  1315. @item base64-encode-region
  1316. @findex base64-encode-region
  1317. base64 encode the selected region. Return the length of the encoded
  1318. text. Optional third argument @var{no-line-break} means do not break
  1319. long lines into shorter lines.
  1320. @item base64-encode-string
  1321. @findex base64-encode-string
  1322. base64 encode a string and return the result.
  1323. @item base64-decode-region
  1324. @findex base64-decode-region
base64 decode the selected region. Return the length of the decoded
text. If the region can't be decoded, signal an error and don't
modify the buffer.
  1328. @item base64-decode-string
  1329. @findex base64-decode-string
base64 decode a string and return the result. If the string can't be
decoded, an error is signaled.
  1332. @end table
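
The three-to-four expansion can be seen in a short sketch (the sample
strings are arbitrary). Input whose length is not a multiple of three is
padded with @samp{=}. The invalid-input case is wrapped in
@code{ignore-errors}, on the assumption that a failed decode either
returns @code{nil} or signals an error depending on the Emacs version,
so the form yields @code{nil} in both cases:

@lisp
;; Three bytes become four characters.
(base64-encode-string "Man")
;; @result{} "TWFu"

;; Two bytes are padded to four characters.
(base64-encode-string "Ma")
;; @result{} "TWE="

(base64-decode-string "TWFu")
;; @result{} "Man"

;; The region variant returns the length of the encoded text.
(with-temp-buffer
  (insert "Man")
  (base64-encode-region (point-min) (point-max)))
;; @result{} 4

;; `!' is not in the base64 alphabet, so this cannot be decoded.
(ignore-errors (base64-decode-string "!!!!"))
;; @result{} nil
@end lisp
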
  1333. @node binhex
  1334. @section binhex
  1335. @cindex binhex
  1336. @cindex Apple
  1337. @cindex Macintosh
@code{binhex} is an encoding that originated in Macintosh environments.
The following function is supplied to decode it:
  1340. @table @code
  1341. @item binhex-decode-region
  1342. @findex binhex-decode-region
  1343. Decode the encoded text in the region. If given a third parameter, only
  1344. decode the @code{binhex} header and return the filename.
  1345. @end table
  1346. @node uudecode
  1347. @section uudecode
  1348. @cindex uuencode
  1349. @cindex uudecode
  1350. @code{uuencode} is probably still the most popular encoding of binaries
  1351. used on Usenet, although @code{base64} rules the mail world.
  1352. The following function is supplied by this package:
  1353. @table @code
  1354. @item uudecode-decode-region
  1355. @findex uudecode-decode-region
  1356. Decode the text in the region.
  1357. @end table
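
Region-based decoders like this one are typically called on a temporary
buffer. A minimal sketch, assuming the library is loaded with
@code{(require 'uudecode)} and binding @code{uudecode-use-external} to
@code{nil} so that no external program is needed; the helper name
@code{my-uudecode-string} is hypothetical:

@lisp
(require 'uudecode)

(defun my-uudecode-string (text)
  "Return the data uuencoded in TEXT (hypothetical helper)."
  (let ((uudecode-use-external nil))
    (with-temp-buffer
      ;; Work on raw bytes, since the decoded data is binary.
      (set-buffer-multibyte nil)
      (insert text)
      ;; The decoded data replaces the encoded text in the buffer.
      (uudecode-decode-region (point-min) (point-max))
      (buffer-string))))

;; "Cat" uuencoded, with the usual begin/end framing.
(my-uudecode-string "begin 644 cat.txt\n#0V%T\n`\nend\n")
;; @result{} "Cat"
@end lisp
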
  1358. @node yenc
  1359. @section yenc
  1360. @cindex yenc
  1361. @code{yenc} is used for encoding binaries on Usenet. The following
  1362. function is supplied by this package:
  1363. @table @code
  1364. @item yenc-decode-region
  1365. @findex yenc-decode-region
  1366. Decode the encoded text in the region.
  1367. @end table
  1368. @node rfc1843
  1369. @section rfc1843
  1370. @cindex rfc1843
  1371. @cindex HZ
  1372. @cindex Chinese
  1373. RFC1843 deals with mixing Chinese and @acronym{ASCII} characters in messages. In
  1374. essence, RFC1843 switches between @acronym{ASCII} and Chinese by doing this:
  1375. @example
  1376. This sentence is in @acronym{ASCII}.
  1377. The next sentence is in GB.~@{<:Ky2;S@{#,NpJ)l6HK!#~@}Bye.
  1378. @end example
The @samp{~@{} sequence switches to GB text and @samp{~@}} switches back
to @acronym{ASCII}. Simple enough, and widely used in China.
  1380. The following functions are available to handle this encoding:
  1381. @table @code
@item rfc1843-decode-region
@findex rfc1843-decode-region
Decode HZ-encoded text in the region.
@item rfc1843-decode-string
@findex rfc1843-decode-string
Decode an HZ-encoded string and return the result.
  1386. @end table
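
As a brief sketch, the GB sample from the example above can be decoded
from Lisp (assuming the library is loaded with
@code{(require 'rfc1843)}). Text outside the
@samp{~@{}...@samp{~@}} delimiters is left untouched:

@lisp
(require 'rfc1843)

;; Returns a string in which the GB text is replaced by the
;; corresponding Chinese characters, followed by "Bye.".
(rfc1843-decode-string "~@{<:Ky2;S@{#,NpJ)l6HK!#~@}Bye.")
@end lisp
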
  1387. @node mailcap
  1388. @section mailcap
  1389. The @file{~/.mailcap} file is parsed by most @acronym{MIME}-aware message
  1390. handlers and describes how elements are supposed to be displayed.
  1391. Here's an example file:
  1392. @example
  1393. image/*; gimp -8 %s
  1394. audio/wav; wavplayer %s
  1395. application/msword; catdoc %s ; copiousoutput ; nametemplate=%s.doc
  1396. @end example
This says that all image files should be displayed with @code{gimp},
that WAVE audio files should be played by @code{wavplayer}, and that
MS Word files should be converted by @code{catdoc}, whose output is
displayed inline.
  1400. The @code{mailcap} library parses this file, and provides functions for
  1401. matching types.
  1402. @table @code
  1403. @item mailcap-mime-data
  1404. @vindex mailcap-mime-data
  1405. This variable is an alist of alists containing backup viewing rules.
  1406. @item mailcap-user-mime-data
  1407. @vindex mailcap-user-mime-data
A customizable list of viewers that take precedence over
@code{mailcap-mime-data}.
  1410. @end table
  1411. Interface functions:
  1412. @table @code
  1413. @item mailcap-parse-mailcaps
  1414. @findex mailcap-parse-mailcaps
  1415. Parse the @file{~/.mailcap} file.
@item mailcap-mime-info
@findex mailcap-mime-info
Take a @acronym{MIME} type as the argument and return the matching viewer.
  1418. @end table
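
A minimal sketch of these functions, assuming the library is loaded with
@code{(require 'mailcap)}. The result depends on the mailcap files and
programs found on the system, so no return value is shown:

@lisp
(require 'mailcap)

;; Read the mailcap files; later calls are no-ops unless forced.
(mailcap-parse-mailcaps)

;; Look up the viewer for a type; nil means no viewer was found.
(mailcap-mime-info "image/png")

;; With `all' as the second argument, return every matching
;; viewer entry rather than just the preferred one.
(mailcap-mime-info "image/png" 'all)
@end lisp
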
  1419. @node Standards
  1420. @chapter Standards
  1421. The Emacs @acronym{MIME} library implements handling of various elements
  1422. according to a (somewhat) large number of RFCs, drafts and standards
  1423. documents. This chapter lists the relevant ones. They can all be
  1424. fetched from @uref{http://quimby.gnus.org/notes/}.
  1425. @table @dfn
  1426. @item RFC822
  1427. @itemx STD11
Standard for the Format of ARPA Internet Text Messages
  1429. @item RFC1036
  1430. Standard for Interchange of USENET Messages
  1431. @item RFC2045
  1432. Format of Internet Message Bodies
  1433. @item RFC2046
  1434. Media Types
  1435. @item RFC2047
  1436. Message Header Extensions for Non-@acronym{ASCII} Text
  1437. @item RFC2048
  1438. Registration Procedures
  1439. @item RFC2049
  1440. Conformance Criteria and Examples
  1441. @item RFC2231
  1442. @acronym{MIME} Parameter Value and Encoded Word Extensions: Character Sets,
  1443. Languages, and Continuations
  1444. @item RFC1843
  1445. HZ---A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and
  1446. @acronym{ASCII} characters
  1447. @item draft-ietf-drums-msg-fmt-05.txt
  1448. Draft for the successor of RFC822
  1449. @item RFC2112
  1450. The @acronym{MIME} Multipart/Related Content-type
  1451. @item RFC1892
  1452. The Multipart/Report Content Type for the Reporting of Mail System
  1453. Administrative Messages
  1454. @item RFC2183
  1455. Communicating Presentation Information in Internet Messages: The
  1456. Content-Disposition Header Field
  1457. @item RFC2646
Documentation of the text/plain format parameter for flowed text
  1459. @end table
  1460. @node GNU Free Documentation License
  1461. @chapter GNU Free Documentation License
  1462. @include doclicense.texi
  1463. @node Index
  1464. @chapter Index
  1465. @printindex cp
  1466. @bye
  1467. @c Local Variables:
  1468. @c mode: texinfo
  1469. @c coding: utf-8
  1470. @c End: