  1. \input texinfo
  2. @include gnus-overrides.texi
  3. @setfilename ../../info/emacs-mime.info
  4. @settitle Emacs MIME Manual
  5. @include docstyle.texi
  6. @synindex fn cp
  7. @synindex vr cp
  8. @synindex pg cp
  9. @copying
  10. This file documents the Emacs MIME interface functionality.
  11. Copyright @copyright{} 1998--2016 Free Software Foundation, Inc.
  12. @quotation
  13. Permission is granted to copy, distribute and/or modify this document
  14. under the terms of the GNU Free Documentation License, Version 1.3 or
  15. any later version published by the Free Software Foundation; with no
  16. Invariant Sections, with the Front-Cover Texts being ``A GNU Manual'',
  17. and with the Back-Cover Texts as in (a) below. A copy of the license
  18. is included in the section entitled ``GNU Free Documentation License''.
  19. (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
  20. modify this GNU manual.''
  21. @end quotation
  22. @end copying
  23. @c Node ``Interface Functions'' uses non-ASCII characters
  24. @dircategory Emacs lisp libraries
  25. @direntry
  26. * Emacs MIME: (emacs-mime). Emacs MIME de/composition library.
  27. @end direntry
  28. @iftex
  29. @finalout
  30. @end iftex
  31. @setchapternewpage odd
  32. @titlepage
  33. @ifset WEBHACKDEVEL
  34. @title Emacs MIME Manual (DEVELOPMENT VERSION)
  35. @end ifset
  36. @ifclear WEBHACKDEVEL
  37. @title Emacs MIME Manual
  38. @end ifclear
  39. @author by Lars Magne Ingebrigtsen
  40. @page
  41. @vskip 0pt plus 1filll
  42. @insertcopying
  43. @end titlepage
  44. @contents
  45. @node Top
  46. @top Emacs MIME
  47. This manual documents the libraries used to compose and display
  48. @acronym{MIME} messages.
  49. This manual is directed at users who want to modify the behavior of
  50. the @acronym{MIME} encoding/decoding process or want a more detailed
  51. picture of how the Emacs @acronym{MIME} library works, and people who want
  52. to write functions and commands that manipulate @acronym{MIME} elements.
  53. @acronym{MIME} is short for @dfn{Multipurpose Internet Mail Extensions}.
  54. This standard is documented in a number of RFCs; mainly RFC2045 (Format
  55. of Internet Message Bodies), RFC2046 (Media Types), RFC2047 (Message
  56. Header Extensions for Non-@acronym{ASCII} Text), RFC2048 (Registration
  57. Procedures), RFC2049 (Conformance Criteria and Examples). It is highly
  58. recommended that anyone who intends to write @acronym{MIME}-compliant
  59. software read at least RFC2045 and RFC2047.
  60. @ifnottex
  61. @insertcopying
  62. @end ifnottex
  63. @menu
  64. * Decoding and Viewing:: A framework for decoding and viewing.
  65. * Composing:: @acronym{MML}; a language for describing @acronym{MIME} parts.
  66. * Interface Functions:: An abstraction over the basic functions.
  67. * Basic Functions:: Utility and basic parsing functions.
  68. * Standards:: A summary of RFCs and working documents used.
  69. * GNU Free Documentation License:: The license for this documentation.
  70. * Index:: Function and variable index.
  71. @end menu
  72. @node Decoding and Viewing
  73. @chapter Decoding and Viewing
  74. This chapter deals with decoding and viewing @acronym{MIME} messages on a
  75. higher level.
  76. The main idea is to first analyze a @acronym{MIME} article, and then allow
  77. other programs to do things based on the list of @dfn{handles} that are
  78. returned as a result of this analysis.
  79. @menu
  80. * Dissection:: Analyzing a @acronym{MIME} message.
  81. * Non-MIME:: Analyzing a non-@acronym{MIME} message.
  82. * Handles:: Handle manipulations.
  83. * Display:: Displaying handles.
  84. * Display Customization:: Variables that affect display.
  85. * Files and Directories:: Saving and naming attachments.
  86. * New Viewers:: How to write your own viewers.
  87. @end menu
  88. @node Dissection
  89. @section Dissection
  90. The function @code{mm-dissect-buffer} is responsible for dissecting
  91. a @acronym{MIME} article. If given a multipart message, it will recursively
  92. descend the message, following the structure, and return a tree of
  93. @acronym{MIME} handles that describes the structure of the message.
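As a minimal sketch of the call (the message text below is made up for
illustration), the following dissects a single-part message held in a
temporary buffer, asks for the media type of the resulting handle, and
then releases the handle:

@lisp
(require 'mm-decode)

(with-temp-buffer
  ;; A tiny hand-written message; any MIME article would do.
  (insert "MIME-Version: 1.0\n"
          "Content-Type: text/html; charset=us-ascii\n"
          "\n"
          "<p>Hello, world.</p>\n")
  (let ((handle (mm-dissect-buffer)))
    (unwind-protect
        ;; Ask for the media type of the top-level part.
        (mm-handle-media-type handle)
      ;; Free the buffers allocated by the dissection.
      (mm-destroy-parts handle))))
@end lisp

Wrapping the work in @code{unwind-protect} is a design choice of this
sketch, so that the buffers behind the handle are released even if the
body signals an error.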
  94. @node Non-MIME
  95. @section Non-MIME
  96. @vindex mm-uu-configure-list
  97. Gnus also understands some non-@acronym{MIME} attachments, such as
  98. postscript, uuencode, binhex, yenc, shar, forward, gnatsweb, pgp, and
  99. diff. Each of these features can be disabled by adding an item to
  100. @code{mm-uu-configure-list}. For example,
  101. @lisp
  102. (require 'mm-uu)
  103. (add-to-list 'mm-uu-configure-list '(pgp-signed . disabled))
  104. @end lisp
  105. @table @code
  106. @item postscript
  107. @findex postscript
  108. PostScript file.
  109. @item uu
  110. @findex uu
  111. Uuencoded file.
  112. @item binhex
  113. @findex binhex
  114. Binhex encoded file.
  115. @item yenc
  116. @findex yenc
  117. Yenc encoded file.
  118. @item shar
  119. @findex shar
  120. Shar archive file.
  121. @item forward
  122. @findex forward
  123. Non-@acronym{MIME} forwarded message.
  124. @item gnatsweb
  125. @findex gnatsweb
  126. Gnatsweb attachment.
  127. @item pgp-signed
  128. @findex pgp-signed
  129. @acronym{PGP} signed clear text.
  130. @item pgp-encrypted
  131. @findex pgp-encrypted
  132. @acronym{PGP} encrypted clear text.
  133. @item pgp-key
  134. @findex pgp-key
  135. @acronym{PGP} public keys.
  136. @item emacs-sources
  137. @findex emacs-sources
  138. @vindex mm-uu-emacs-sources-regexp
  139. Emacs source code. This item works only in the groups matching
  140. @code{mm-uu-emacs-sources-regexp}.
  141. @item diff
  142. @vindex diff
  143. @vindex mm-uu-diff-groups-regexp
  144. Patches. This is intended for groups to which diffs of committed
  145. files are automatically sent. It only works in groups matching
  146. @code{mm-uu-diff-groups-regexp}.
  147. @item verbatim-marks
  148. @cindex verbatim-marks
  149. Slrn-style verbatim marks.
  150. @item LaTeX
  151. @cindex LaTeX
  152. LaTeX documents. It only works in groups matching
  153. @code{mm-uu-tex-groups-regexp}.
  154. @end table
  155. @cindex text/x-verbatim
  156. @c Is @vindex suitable for a face?
  157. @vindex mm-uu-extract
  158. Some inlined non-@acronym{MIME} attachments are displayed using the face
  159. @code{mm-uu-extract}. By default, no @acronym{MIME} button for these
  160. parts is displayed. You can force displaying a button using @kbd{K b}
  161. (@code{gnus-summary-display-buttonized}) or by adding
  162. @code{text/x-verbatim} to @code{gnus-buttonized-mime-types}
  163. (@pxref{MIME Commands, ,MIME Commands, gnus, Gnus Manual}).
  164. @node Handles
  165. @section Handles
  166. A @acronym{MIME} handle is a list that fully describes a @acronym{MIME}
  167. component.
  168. The following macros can be used to access elements in a handle:
  169. @table @code
  170. @item mm-handle-buffer
  171. @findex mm-handle-buffer
  172. Return the buffer that holds the contents of the undecoded @acronym{MIME}
  173. part.
  174. @item mm-handle-type
  175. @findex mm-handle-type
  176. Return the parsed @code{Content-Type} of the part.
  177. @item mm-handle-encoding
  178. @findex mm-handle-encoding
  179. Return the @code{Content-Transfer-Encoding} of the part.
  180. @item mm-handle-undisplayer
  181. @findex mm-handle-undisplayer
  182. Return the object that can be used to remove the displayed part (if it
  183. has been displayed).
  184. @item mm-handle-set-undisplayer
  185. @findex mm-handle-set-undisplayer
  186. Set the undisplayer object.
  187. @item mm-handle-disposition
  188. @findex mm-handle-disposition
  189. Return the parsed @code{Content-Disposition} of the part.
  190. @item mm-get-content-id
  191. Return the handle(s) referred to by @code{Content-ID}.
  192. @end table
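The accessors above can be combined as in the following sketch.  The
function name @code{my-describe-handle} is hypothetical and exists only
for illustration; the sketch also assumes that it is given the handle of
a single part rather than a multipart handle.

@lisp
(require 'mm-decode)

(defun my-describe-handle (handle)
  "Return a property list summarizing HANDLE (illustration only)."
  (list :type (mm-handle-type handle)
        :encoding (mm-handle-encoding handle)
        :disposition (mm-handle-disposition handle)
        :buffer (mm-handle-buffer handle)))
@end lisp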
  193. @node Display
  194. @section Display
  195. Functions for displaying, removing and saving.
  196. @table @code
  197. @item mm-display-part
  198. @findex mm-display-part
  199. Display the part.
  200. @item mm-remove-part
  201. @findex mm-remove-part
  202. Remove the part (if it has been displayed).
  203. @item mm-inlinable-p
  204. @findex mm-inlinable-p
  205. Say whether a @acronym{MIME} type can be displayed inline.
  206. @item mm-automatic-display-p
  207. @findex mm-automatic-display-p
  208. Say whether a @acronym{MIME} type should be displayed automatically.
  209. @item mm-destroy-part
  210. @findex mm-destroy-part
  211. Free all resources occupied by a part.
  212. @item mm-save-part
  213. @findex mm-save-part
  214. Offer to save the part in a file.
  215. @item mm-pipe-part
  216. @findex mm-pipe-part
  217. Offer to pipe the part to some process.
  218. @item mm-interactively-view-part
  219. @findex mm-interactively-view-part
  220. Prompt for a mailcap method to use to view the part.
  221. @end table
  222. @node Display Customization
  223. @section Display Customization
  224. @table @code
  225. @item mm-inline-media-tests
  226. @vindex mm-inline-media-tests
  227. This is an alist where the key is a @acronym{MIME} type, the second element
  228. is a function to display the part @dfn{inline} (i.e., inside Emacs), and
  229. the third element is a form to be @code{eval}ed to say whether the part
  230. can be displayed inline.
  231. This variable specifies whether a part @emph{can} be displayed inline,
  232. and, if so, how to do it. It does not say whether parts are
  233. @emph{actually} displayed inline.
  234. @item mm-inlined-types
  235. @vindex mm-inlined-types
  236. This, on the other hand, says what types are to be displayed inline, if
  237. they satisfy the conditions set by the variable above. It's a list of
  238. @acronym{MIME} media types.
  239. @item mm-automatic-display
  240. @vindex mm-automatic-display
  241. This is a list of types that are to be displayed ``automatically'', but
  242. only if the above variable allows it. That is, only inlinable parts can
  243. be displayed automatically.
  244. @item mm-automatic-external-display
  245. @vindex mm-automatic-external-display
  246. This is a list of types that will be displayed automatically in an
  247. external viewer.
  248. @item mm-keep-viewer-alive-types
  249. @vindex mm-keep-viewer-alive-types
  250. This is a list of media types for which the external viewer will not
  251. be killed when selecting a different article.
  252. @item mm-attachment-override-types
  253. @vindex mm-attachment-override-types
  254. Some @acronym{MIME} agents create parts that have a content-disposition of
  255. @samp{attachment}. This variable allows overriding that disposition and
  256. displaying the part inline. (Note that the disposition is only
  257. overridden if we are able to, and want to, display the part inline.)
  258. @item mm-discouraged-alternatives
  259. @vindex mm-discouraged-alternatives
  260. List of @acronym{MIME} types that are discouraged when viewing
  261. @samp{multipart/alternative}. Viewing agents are supposed to view the
  262. last possible part of a message, as that is supposed to be the richest.
  263. However, users may prefer other types instead, and this list says what
  264. types are most unwanted. If, for instance, @samp{text/html} parts are
  265. very unwanted, and @samp{text/richtext} parts are somewhat unwanted,
  266. you could say something like:
  267. @lisp
  268. (setq mm-discouraged-alternatives
  269.       '("text/html" "text/richtext")
  270.       mm-automatic-display
  271.       (remove "text/html" mm-automatic-display))
  272. @end lisp
  273. Adding @code{"image/.*"} might also be useful. Spammers use images as
  274. the preferred part of @samp{multipart/alternative} messages, so you might
  275. not notice there are other parts. See also
  276. @code{gnus-buttonized-mime-types}, @ref{MIME Commands, ,MIME Commands,
  277. gnus, Gnus Manual}. After adding @code{"multipart/alternative"} to
  278. @code{gnus-buttonized-mime-types} you can choose manually which
  279. alternative you'd like to view. For example, you can set those
  280. variables like:
  281. @lisp
  282. (setq gnus-buttonized-mime-types
  283.       '("multipart/alternative" "multipart/signed")
  284.       mm-discouraged-alternatives
  285.       '("text/html" "image/.*"))
  286. @end lisp
  287. In this case, Gnus will display radio buttons for such a kind of spam
  288. message as follows:
  289. @example
  290. 1. (*) multipart/alternative ( ) image/gif
  291. 2. (*) text/plain ( ) text/html
  292. @end example
  293. @item mm-inline-large-images
  294. @vindex mm-inline-large-images
  295. When displaying inline images that are larger than the window, Emacs
  296. does not enable scrolling, which means that you cannot see the whole
  297. image. To prevent this, the library tries to determine the image size
  298. before displaying it inline, and if it doesn't fit the window, the
  299. library will display it externally (e.g., with @samp{ImageMagick} or
  300. @samp{xv}). Setting this variable to @code{t} disables this check and
  301. makes the library display all inline images as inline, regardless of
  302. their size. If you set this variable to @code{resize}, the image will
  303. be displayed resized to fit in the window, if Emacs has the ability to
  304. resize images.
  305. @item mm-inline-large-images-proportion
  306. @vindex mm-inline-large-images-proportion
  307. The proportion used when resizing large images.
  308. @item mm-inline-override-types
  309. @vindex mm-inline-override-types
  310. @code{mm-inlined-types} may include regular expressions, for example to
  311. specify that all @samp{text/.*} parts be displayed inline. If a user
  312. prefers to have a type that matches such a regular expression be treated
  313. as an attachment, that can be accomplished by setting this variable to a
  314. list containing that type. For example assuming @code{mm-inlined-types}
  315. includes @samp{text/.*}, then including @samp{text/html} in this
  316. variable will cause @samp{text/html} parts to be treated as attachments.
  317. @item mm-text-html-renderer
  318. @vindex mm-text-html-renderer
  319. This selects the function used to render @acronym{HTML}. The predefined
  320. renderers are selected by the symbols @code{gnus-article-html},
  321. @code{w3m}@footnote{See @uref{http://emacs-w3m.namazu.org/} for more
  322. information about emacs-w3m}, @code{links}, @code{lynx},
  323. @code{w3m-standalone} or @code{html2text}. If @code{nil} use an
  324. external viewer. You can also specify a function, which will be
  325. called with a @acronym{MIME} handle as the argument.
  326. @item mm-html-inhibit-images
  327. @vindex mm-html-inhibit-images
  328. @vindex mm-inline-text-html-with-images
  329. If this is non-@code{nil}, inhibit displaying of images inline in the
  330. article body. This applies to images in @acronym{HTML} articles
  331. rendered when @code{mm-text-html-renderer} (@pxref{Display
  332. Customization}) is @code{shr} or @code{w3m}. In Gnus, this is
  333. overridden by the value of @code{gnus-inhibit-images} (@pxref{Misc
  334. Article, ,Misc Article, gnus, Gnus manual}). The default is @code{nil}.
  335. @item mm-html-blocked-images
  336. @vindex mm-html-blocked-images
  337. External images that have @acronym{URL}s that match this regexp won't
  338. be fetched and displayed. For instance, to block all @acronym{URL}s
  339. that have the string ``ads'' in them, do the following:
  340. @lisp
  341. (setq mm-html-blocked-images "ads")
  342. @end lisp
  343. This takes effect when @code{mm-text-html-renderer} (@pxref{Display
  344. Customization}) is @code{shr}. In Gnus, this is overridden by the value
  345. of @code{gnus-blocked-images} or the return value of the function that
  346. @code{gnus-blocked-images} is set to (@pxref{HTML, ,HTML, gnus, Gnus
  347. manual}).
  348. Spammers sometimes embed @samp{<img>} tags in @acronym{HTML} mail,
  349. most likely to verify whether you have read the message. You can
  350. prevent your personal information from leaking by setting this option
  351. to @code{""} (which is the default).
  352. @item mm-w3m-safe-url-regexp
  353. @vindex mm-w3m-safe-url-regexp
  354. A regular expression that matches safe URL names, i.e., URLs that are
  355. unlikely to leak personal information when rendering @acronym{HTML}
  356. email (the default value is @samp{\\`cid:}). If @code{nil} consider
  357. all URLs safe. In Gnus, this will be overridden according to the value
  358. of the variable @code{gnus-safe-html-newsgroups} (@pxref{Various
  359. Various, ,Various Various, gnus, Gnus Manual}).
  360. @item mm-inline-text-html-with-w3m-keymap
  361. @vindex mm-inline-text-html-with-w3m-keymap
  362. You can use emacs-w3m command keys in the inlined text/html part by
  363. setting this option to non-@code{nil}. The default value is @code{t}.
  364. @item mm-external-terminal-program
  365. @vindex mm-external-terminal-program
  366. The program used to start an external terminal.
  367. @item mm-enable-external
  368. @vindex mm-enable-external
  369. Indicate whether external @acronym{MIME} handlers should be used.
  370. If @code{t}, all defined external @acronym{MIME} handlers are used. If
  371. @code{nil}, files are saved to disk (@code{mailcap-save-binary-file}).
  372. If it is the symbol @code{ask}, you are prompted before the external
  373. @acronym{MIME} handler is invoked.
  374. When you launch an attachment through mailcap (@pxref{mailcap}) an
  375. attempt is made to use a safe viewer with the safest options---this isn't
  376. the case if you save it to disk and launch it in a different way
  377. (command line or double-clicking). Anyhow, if you want to be sure not
  378. to launch any external programs, set this variable to @code{nil} or
  379. @code{ask}.
  380. @end table
  381. @node Files and Directories
  382. @section Files and Directories
  383. @table @code
  384. @item mm-default-directory
  385. @vindex mm-default-directory
  386. The default directory for saving attachments. If @code{nil} use
  387. @code{default-directory}.
  388. @item mm-tmp-directory
  389. @vindex mm-tmp-directory
  390. Directory for storing temporary files.
  391. @item mm-file-name-rewrite-functions
  392. @vindex mm-file-name-rewrite-functions
  393. A list of functions used for rewriting file names of @acronym{MIME}
  394. parts. Each function is applied successively to the file name.
  395. Ready-made functions include
  396. @table @code
  397. @item mm-file-name-delete-control
  398. @findex mm-file-name-delete-control
  399. Delete all control characters.
  400. @item mm-file-name-delete-gotchas
  401. @findex mm-file-name-delete-gotchas
  402. Delete characters that could have unintended consequences when used
  403. with flawed shell scripts, i.e., @samp{|}, @samp{>} and @samp{<}; and
  404. @samp{-}, @samp{.} as the first character.
  405. @item mm-file-name-delete-whitespace
  406. @findex mm-file-name-delete-whitespace
  407. Remove all whitespace.
  408. @item mm-file-name-trim-whitespace
  409. @findex mm-file-name-trim-whitespace
  410. Remove leading and trailing whitespace.
  411. @item mm-file-name-collapse-whitespace
  412. @findex mm-file-name-collapse-whitespace
  413. Collapse multiple whitespace characters.
  414. @item mm-file-name-replace-whitespace
  415. @findex mm-file-name-replace-whitespace
  416. @vindex mm-file-name-replace-whitespace
  417. Replace whitespace with underscores. Set the variable
  418. @code{mm-file-name-replace-whitespace} to any other string if you do
  419. not like underscores.
  420. @end table
  421. The standard Emacs functions @code{capitalize}, @code{downcase},
  422. @code{upcase} and @code{upcase-initials} might also prove useful.
  423. @item mm-path-name-rewrite-functions
  424. @vindex mm-path-name-rewrite-functions
  425. List of functions used for rewriting the full file names of @acronym{MIME}
  426. parts. This is used when viewing parts externally, and is meant for
  427. transforming the absolute name so that non-compliant programs can find
  428. the file where it's saved.
  429. @end table
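As a sketch of how the ready-made rewrite functions can be chained, the
following setting trims surrounding whitespace, collapses runs of
whitespace, and then replaces what is left with underscores.  The
choice and order of functions here is only one plausible configuration:

@lisp
(setq mm-file-name-rewrite-functions
      '(mm-file-name-trim-whitespace
        mm-file-name-collapse-whitespace
        mm-file-name-replace-whitespace))
@end lisp

Because each function is applied to the result of the previous one,
trimming and collapsing first keeps the final name from containing
leading, trailing or repeated underscores.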
  430. @node New Viewers
  431. @section New Viewers
  432. Here's an example viewer for displaying @code{text/enriched} inline:
  433. @lisp
  434. (defun mm-display-enriched-inline (handle)
  435.   (let (text)
  436.     (with-temp-buffer
  437.       (mm-insert-part handle)
  438.       (save-window-excursion
  439.         (enriched-decode (point-min) (point-max))
  440.         (setq text (buffer-string))))
  441.     (mm-insert-inline handle text)))
  442. @end lisp
  443. We see that the function takes a @acronym{MIME} handle as its parameter. It
  444. then goes to a temporary buffer, inserts the text of the part, does some
  445. work on the text, stores the result, goes back to the buffer it was
  446. called from and inserts the result.
  447. The two important helper functions here are @code{mm-insert-part} and
  448. @code{mm-insert-inline}. The first function inserts the text of the
  449. handle in the current buffer. It handles charset and/or content
  450. transfer decoding. The second function just inserts whatever text you
  451. tell it to insert, but it also sets things up so that the text can be
  452. ``undisplayed'' in a convenient manner.
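A viewer like this one only takes effect once it is listed in
@code{mm-inline-media-tests} (@pxref{Display Customization}).  The
following is a sketch of such a registration.  It assumes that using
@code{identity} as the test element is acceptable, meaning that the part
is always considered displayable inline; check the default value of the
variable in your Emacs before relying on this.

@lisp
(require 'mm-decode)

;; Put the new entry in front of any existing entry for the same type.
(add-to-list 'mm-inline-media-tests
             '("text/enriched" mm-display-enriched-inline identity))
@end lisp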
  453. @node Composing
  454. @chapter Composing
  455. @cindex Composing
  456. @cindex MIME Composing
  457. @cindex MML
  458. @cindex MIME Meta Language
  459. Creating a @acronym{MIME} message is boring and non-trivial. Therefore,
  460. a library called @code{mml} has been defined that parses a language
  461. called @acronym{MML} (@acronym{MIME} Meta Language) and generates
  462. @acronym{MIME} messages.
  463. @findex mml-generate-mime
  464. The main interface function is @code{mml-generate-mime}. It will
  465. examine the contents of the current (narrowed-to) buffer and return a
  466. string containing the @acronym{MIME} message.
  467. @menu
  468. * Simple MML Example:: An example @acronym{MML} document.
  469. * MML Definition:: All valid @acronym{MML} elements.
  470. * Advanced MML Example:: Another example @acronym{MML} document.
  471. * Encoding Customization:: Variables that affect encoding.
  472. * Charset Translation:: How charsets are mapped from @sc{mule} to @acronym{MIME}.
  473. * Conversion:: Going from @acronym{MIME} to @acronym{MML} and vice versa.
  474. * Flowed text:: Soft and hard newlines.
  475. @end menu
  476. @node Simple MML Example
  477. @section Simple MML Example
  478. Here's a simple @samp{multipart/alternative}:
  479. @example
  480. <#multipart type=alternative>
  481. This is a plain text part.
  482. <#part type=text/enriched>
  483. <center>This is a centered enriched part</center>
  484. <#/multipart>
  485. @end example
  486. After running this through @code{mml-generate-mime}, we get this:
  487. @example
  488. Content-Type: multipart/alternative; boundary="=-=-="
  489. --=-=-=
  490. This is a plain text part.
  491. --=-=-=
  492. Content-Type: text/enriched
  493. <center>This is a centered enriched part</center>
  494. --=-=-=--
  495. @end example
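To try this out from Lisp, one possible approach is to put the
@acronym{MML} text into a temporary buffer and call
@code{mml-generate-mime} there.  This is only a sketch; the exact
headers and boundary strings in the result may differ from the output
shown above.

@lisp
(require 'mml)

(with-temp-buffer
  (insert "<#multipart type=alternative>\n"
          "This is a plain text part.\n"
          "<#part type=text/enriched>\n"
          "<center>This is a centered enriched part</center>\n"
          "<#/multipart>\n")
  ;; Returns the generated MIME message as a string.
  (mml-generate-mime))
@end lisp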
  496. @node MML Definition
  497. @section MML Definition
  498. The @acronym{MML} language is very simple. It looks a bit like an SGML
  499. application, but it's not.
  500. The main concept of @acronym{MML} is the @dfn{part}. Each part can be of a
  501. different type or use a different charset. The way to delineate a part
  502. is with a @samp{<#part ...>} tag. Multipart parts can be introduced
  503. with the @samp{<#multipart ...>} tag. Parts are ended by the
  504. @samp{<#/part>} or @samp{<#/multipart>} tags. Parts started with the
  505. @samp{<#part ...>} tags are also closed by the next open tag.
  506. There's also the @samp{<#external ...>} tag. It introduces
  507. @samp{message/external-body} parts.
  508. Each tag can contain zero or more parameters of the form
  509. @samp{parameter=value}. The values may be enclosed in quotation marks,
  510. but that's not necessary unless the value contains white space. So
  511. @samp{filename=/home/user/#hello$^yes} is perfectly valid.
  512. The following parameters have meaning in @acronym{MML}; parameters that have no
  513. meaning are ignored. The @acronym{MML} parameter names are the same as the
  514. @acronym{MIME} parameter names; the things in the parentheses say which
  515. header it will be used in.
  516. @table @samp
  517. @item type
  518. The @acronym{MIME} type of the part (@code{Content-Type}).
  519. @item filename
  520. Use the contents of the file in the body of the part
  521. (@code{Content-Disposition}).
  522. @item recipient-filename
  523. Use this as the file name in the generated @acronym{MIME} message for
  524. the recipient. That is, even if the file is called @file{foo.txt}
  525. locally, use this name instead in the @code{Content-Disposition} in
  526. the sent message.
  527. @item charset
  528. The contents of the body of the part are to be encoded in the character
  529. set specified (@code{Content-Type}). @xref{Charset Translation}.
  530. @item name
  531. Might be used to suggest a file name if the part is to be saved
  532. to a file (@code{Content-Type}).
  533. @item disposition
  534. Valid values are @samp{inline} and @samp{attachment}
  535. (@code{Content-Disposition}).
  536. @item encoding
  537. Valid values are @samp{7bit}, @samp{8bit}, @samp{quoted-printable} and
  538. @samp{base64} (@code{Content-Transfer-Encoding}). @xref{Charset
  539. Translation}.
  540. @item description
  541. A description of the part (@code{Content-Description}).
  542. @item creation-date
  543. RFC822 date when the part was created (@code{Content-Disposition}).
  544. @item modification-date
  545. RFC822 date when the part was modified (@code{Content-Disposition}).
  546. @item read-date
  547. RFC822 date when the part was read (@code{Content-Disposition}).
  548. @item recipients
  549. Who to encrypt/sign the part to. This field is used to override any
  550. auto-detection based on the To/CC headers.
  551. @item sender
  552. Identity used to sign the part. This field is used to override the
  553. default key used.
  554. @item size
  555. The size (in octets) of the part (@code{Content-Disposition}).
  556. @item sign
  557. What technology to sign this @acronym{MML} part with (@code{smime}, @code{pgp}
  558. or @code{pgpmime}).
  559. @item encrypt
  560. What technology to encrypt this @acronym{MML} part with (@code{smime},
  561. @code{pgp} or @code{pgpmime}).
  562. @end table
  563. Parameters for @samp{text/plain}:
  564. @table @samp
  565. @item format
  566. Formatting parameter for the text, valid values include @samp{fixed}
  567. (the default) and @samp{flowed}. Normally you do not specify this
  568. manually, since it requires the textual body to be formatted in a
  569. special way described in RFC 2646. @xref{Flowed text}.
  570. @end table
  571. Parameters for @samp{application/octet-stream}:
  572. @table @samp
  573. @item type
  574. Type of the part; informal---meant for human readers
  575. (@code{Content-Type}).
  576. @end table
  577. Parameters for @samp{message/external-body}:
  578. @table @samp
  579. @item access-type
  580. A word indicating the supported access mechanism by which the file may
  581. be obtained. Values include @samp{ftp}, @samp{anon-ftp}, @samp{tftp},
  582. @samp{localfile}, and @samp{mailserver}. (@code{Content-Type}.)
  583. @item expiration
  584. The RFC822 date after which the file may no longer be fetched.
  585. (@code{Content-Type}.)
  586. @item size
  587. The size (in octets) of the file. (@code{Content-Type}.)
  588. @item permission
  589. Valid values are @samp{read} and @samp{read-write}
  590. (@code{Content-Type}).
  591. @end table
  592. Parameters for @samp{sign=smime}:
  593. @table @samp
  594. @item keyfile
  595. File containing key and certificate for signer.
  596. @end table
  597. Parameters for @samp{encrypt=smime}:
  598. @table @samp
  599. @item certfile
  600. File containing certificate for recipient.
  601. @end table
  602. @node Advanced MML Example
  603. @section Advanced MML Example
  604. Here's a complex multipart message. It's a @samp{multipart/mixed} that
  605. contains many parts, one of which is a @samp{multipart/alternative}.
  606. @example
  607. <#multipart type=mixed>
  608. <#part type=image/jpeg filename=~/rms.jpg disposition=inline>
  609. <#multipart type=alternative>
  610. This is a plain text part.
  611. <#part type=text/enriched name=enriched.txt>
  612. <center>This is a centered enriched part</center>
  613. <#/multipart>
  614. This is a new plain text part.
  615. <#part disposition=attachment>
  616. This plain text part is an attachment.
  617. <#/multipart>
  618. @end example
  619. And this is the resulting @acronym{MIME} message:
  620. @example
  621. Content-Type: multipart/mixed; boundary="=-=-="
  622. --=-=-=
  623. --=-=-=
  624. Content-Type: image/jpeg;
  625. filename="~/rms.jpg"
  626. Content-Disposition: inline;
  627. filename="~/rms.jpg"
  628. Content-Transfer-Encoding: base64
  629. /9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRof
  630. Hh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/wAALCAAwADABAREA/8QAHwAA
  631. AQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQR
  632. BRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RF
  633. RkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ip
  634. qrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/9oACAEB
  635. AAA/AO/rifFHjldNuGsrDa0qcSSHkA+gHrXKw+LtWLrMb+RgTyhbr+HSug07xNqV9fQtZrNI
  636. AyiaE/NuBPOOOP0rvRNE880KOC8TbXXGCv1FPqjrF4LDR7u5L7SkTFT/ALWOP1xXgTuXfc7E
  637. sx6nua6rwp4IvvEM8chCxWxOdzn7wz6V9AaB4S07w9p5itow0rDLSY5Pt9K43xO66P4xs71m
  638. 2QXiGCbA4yOVJ9+1aYORkdK434lyNH4ahCnG66VT9Nj15JFbPdX0MS43M4VQf5/yr2vSpLnw
  639. 5ZW8dlCZ8KFXjOPX0/mK6rSPEGt3Angu44fNEReHYNvIH3TzXDeKNO8RX+kSX2ouZkicTIOc
  640. L+g7E810ulFjpVtv3bwgB3HJyK5L4quY/C9sVxk3ij/xx6850u7t1mtp/wDlpEw3An3Jr3Dw
  641. 34gsbWza4nBlhC5LDsaW6+IFgupQyCF3iHH7gA7c9R9ay7zx6t7aX9jHC4smhfBkGCvHGfrm
  642. tLQ7hbnRrV1GPkAP1x1/Hr+Ncr8Vzjwrbf8AX6v/AKA9eQRyYlQk8Yx9K6XTNbkgia2ciSIn
  643. 7p5Ga9Atte0LTLKO6it4i7dVRFJDcZ4PvXN+JvEMF9bILVGXJLSZ4zkjivRPDaeX4b08HOTC
  644. pOffmua+KkbS+GLVUGT9tT/0B68eeIpIFYjB70+OOVXyoOM9+M1eaWeCLzHPyHGO/NVWvJJm
  645. jQ8KGH1NfQWhXSXmh2c8eArRLwO3HSv/2Q==
  646. --=-=-=
  647. Content-Type: multipart/alternative; boundary="==-=-="
  648. --==-=-=
  649. This is a plain text part.
  650. --==-=-=
  651. Content-Type: text/enriched;
  652. name="enriched.txt"
  653. <center>This is a centered enriched part</center>
  654. --==-=-=--
  655. --=-=-=
  656. This is a new plain text part.
  657. --=-=-=
  658. Content-Disposition: attachment
  659. This plain text part is an attachment.
  660. --=-=-=--
  661. @end example
  662. @node Encoding Customization
  663. @section Encoding Customization
  664. @table @code
  665. @item mm-body-charset-encoding-alist
  666. @vindex mm-body-charset-encoding-alist
  667. Mapping from @acronym{MIME} charset to encoding to use. This variable is
  668. usually used except, e.g., when other requirements force a specific
  669. encoding (digitally signed messages require 7bit encodings). The
  670. default is
  671. @lisp
  672. ((iso-2022-jp . 7bit)
  673.  (iso-2022-jp-2 . 7bit)
  674.  (utf-16 . base64)
  675.  (utf-16be . base64)
  676.  (utf-16le . base64))
  677. @end lisp
  678. As an example, if you do not want to have ISO-8859-1 characters
  679. quoted-printable encoded, you may add @code{(iso-8859-1 . 8bit)} to
  680. this variable. You can override this setting on a per-message basis
  681. by using the @code{encoding} @acronym{MML} tag (@pxref{MML Definition}).
  682. @item mm-coding-system-priorities
  683. @vindex mm-coding-system-priorities
  684. Prioritize coding systems to use for outgoing messages. The default
  685. is @code{nil}, which means to use the defaults in Emacs, but is
  686. @code{(iso-8859-1 iso-2022-jp utf-8)} when running Emacs in the Japanese
  687. language environment. It is a list of coding system symbols (aliases of
  688. coding systems are also allowed, use @kbd{M-x describe-coding-system} to
  689. make sure you are specifying correct coding system names). For example,
if you have configured Emacs to prefer UTF-8, but want outgoing
messages to be sent in ISO-8859-1 if possible, you can set this
  692. variable to @code{(iso-8859-1)}. You can override this setting on a
  693. per-message basis by using the @code{charset} @acronym{MML} tag
  694. (@pxref{MML Definition}).
  695. As different hierarchies prefer different charsets, you may want to set
  696. @code{mm-coding-system-priorities} according to the hierarchy in Gnus.
  697. Here's an example:
  698. @c Corrections about preferred charsets are welcome. de, fr and fj
  699. @c should be correct, I don't know about the rest (so these are only
  700. @c examples):
  701. @lisp
  702. (add-to-list 'gnus-newsgroup-variables 'mm-coding-system-priorities)
  703. (setq gnus-parameters
  704. (nconc
  705. ;; Some charsets are just examples!
  706. '(("^cn\\." ;; Chinese
  707. (mm-coding-system-priorities
  708. '(iso-8859-1 cn-big5 chinese-iso-7bit utf-8)))
  709. ("^cz\\.\\|^pl\\." ;; Central and Eastern European
  710. (mm-coding-system-priorities '(iso-8859-2 utf-8)))
  711. ("^de\\." ;; German language
  712. (mm-coding-system-priorities '(iso-8859-1 iso-8859-15 utf-8)))
  713. ("^fr\\." ;; French
  714. (mm-coding-system-priorities '(iso-8859-15 iso-8859-1 utf-8)))
  715. ("^fj\\." ;; Japanese
  716. (mm-coding-system-priorities
  717. '(iso-8859-1 iso-2022-jp utf-8)))
  718. ("^ru\\." ;; Cyrillic
  719. (mm-coding-system-priorities
  720. '(koi8-r iso-8859-5 iso-8859-1 utf-8))))
  721. gnus-parameters))
  722. @end lisp
  723. @item mm-content-transfer-encoding-defaults
  724. @vindex mm-content-transfer-encoding-defaults
  725. Mapping from @acronym{MIME} types to encoding to use. This variable is usually
  726. used except, e.g., when other requirements force a safer encoding
  727. (digitally signed messages require 7bit encoding). Besides the normal
  728. @acronym{MIME} encodings, @code{qp-or-base64} may be used to indicate that for
  729. each case the most efficient of quoted-printable and base64 should be
  730. used.
@code{qp-or-base64} has another effect: it folds long lines so that
@acronym{MIME} parts are not broken by an MTA@. So do @code{quoted-printable} and
@code{base64}.
  734. Note that it affects body encoding only when a part is a raw forwarded
  735. message (which will be made by @code{gnus-summary-mail-forward} with the
  736. arg 2 for example) or is neither the @samp{text/*} type nor the
@samp{message/*} type. Even in those cases, you can override
  738. this setting on a per-message basis by using the @code{encoding}
  739. @acronym{MML} tag (@pxref{MML Definition}).
  740. @item mm-use-ultra-safe-encoding
  741. @vindex mm-use-ultra-safe-encoding
When this is non-@code{nil}, textual parts are encoded as
quoted-printable if they contain lines longer than 76 characters or
lines starting with @samp{From } in the body. Non-7bit encodings (8bit, binary)
are generally disallowed. This reduces the probability that a
non-8bit-clean MTA or MDA changes the message. This should never be set
directly, but bound by other functions when necessary (e.g., when
encoding messages that are to be digitally signed).
  749. @end table
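As a small sketch of the two customizations described above, the
following adds the ISO-8859-1 entry to
@code{mm-body-charset-encoding-alist} and makes ISO-8859-1 the preferred
outgoing charset. The values are only the examples mentioned in the
text, and putting them in an init file is an assumption.
@lisp
(require 'mm-bodies)  ; defines mm-body-charset-encoding-alist
(require 'mm-util)    ; defines mm-coding-system-priorities

;; Send ISO-8859-1 text as 8bit instead of quoted-printable.
(add-to-list 'mm-body-charset-encoding-alist '(iso-8859-1 . 8bit))

;; Prefer ISO-8859-1 for outgoing messages whenever the text fits.
(setq mm-coding-system-priorities '(iso-8859-1))
@end lisp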
  750. @node Charset Translation
  751. @section Charset Translation
  752. @cindex charsets
  753. During translation from @acronym{MML} to @acronym{MIME}, for each
  754. @acronym{MIME} part which has been composed inside Emacs, an appropriate
  755. charset has to be chosen.
  756. @vindex mail-parse-charset
  757. If you are running a non-@sc{mule} Emacs, this process is simple: If the
  758. part contains any non-@acronym{ASCII} (8-bit) characters, the @acronym{MIME} charset
  759. given by @code{mail-parse-charset} (a symbol) is used. (Never set this
  760. variable directly, though. If you want to change the default charset,
  761. please consult the documentation of the package which you use to process
  762. @acronym{MIME} messages.
  763. @xref{Various Message Variables, , Various Message Variables, message,
  764. Message Manual}, for example.)
  765. If there are only @acronym{ASCII} characters, the @acronym{MIME} charset US-ASCII is
  766. used, of course.
  767. @cindex MULE
  768. @cindex UTF-8
  769. @cindex Unicode
  770. @vindex mm-mime-mule-charset-alist
  771. Things are slightly more complicated when running Emacs with @sc{mule}
  772. support. In this case, a list of the @sc{mule} charsets used in the
  773. part is obtained, and the @sc{mule} charsets are translated to
  774. @acronym{MIME} charsets by consulting the table provided by Emacs itself
  775. or the variable @code{mm-mime-mule-charset-alist} for XEmacs.
  776. If this results in a single @acronym{MIME} charset, this is used to encode
  777. the part. But if the resulting list of @acronym{MIME} charsets contains more
  778. than one element, two things can happen: If it is possible to encode the
  779. part via UTF-8, this charset is used. (For this, Emacs must support
  780. the @code{utf-8} coding system, and the part must consist entirely of
  781. characters which have Unicode counterparts.) If UTF-8 is not available
  782. for some reason, the part is split into several ones, so that each one
  783. can be encoded with a single @acronym{MIME} charset. The part can only be
  784. split at line boundaries, though---if more than one @acronym{MIME} charset is
  785. required to encode a single line, it is not possible to encode the part.
When running Emacs with @sc{mule} support, the preferences for which
coding system to use are inherited from Emacs itself. This means that
  788. if Emacs is set up to prefer UTF-8, it will be used when encoding
  789. messages. You can modify this by altering the
  790. @code{mm-coding-system-priorities} variable though (@pxref{Encoding
  791. Customization}).
  792. The charset to be used can be overridden by setting the @code{charset}
  793. @acronym{MML} tag (@pxref{MML Definition}) when composing the message.
  794. The encoding of characters (quoted-printable, 8bit, etc.)@: is orthogonal
  795. to the discussion here, and is controlled by the variables
  796. @code{mm-body-charset-encoding-alist} and
  797. @code{mm-content-transfer-encoding-defaults} (@pxref{Encoding
  798. Customization}).
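To see which @acronym{MIME} charsets a piece of text would need, one
possible check is the helper @code{mm-find-mime-charset-region} from
@file{mm-util.el}. It is not described in this section, so treat the
sketch below as an illustration only. Its documentation says that
@code{nil} means US-ASCII, a single-element list names a suitable
charset, and a longer list means that no single charset fits.
@lisp
(require 'mm-util)

;; Return the list of MIME charsets needed for TEXT.
;; `my-mime-charsets-for' is a hypothetical helper name.
(defun my-mime-charsets-for (text)
  (with-temp-buffer
    (insert text)
    (mm-find-mime-charset-region (point-min) (point-max))))

;; Pure ASCII text needs no charset beyond US-ASCII.
(my-mime-charsets-for "plain ASCII text")
@end lisp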
  799. @node Conversion
  800. @section Conversion
  801. @findex mime-to-mml
  802. A (multipart) @acronym{MIME} message can be converted to @acronym{MML}
  803. with the @code{mime-to-mml} function. It works on the message in the
  804. current buffer, and substitutes @acronym{MML} markup for @acronym{MIME}
  805. boundaries. Non-textual parts do not have their contents in the buffer,
  806. but instead have the contents in separate buffers that are referred to
  807. from the @acronym{MML} tags.
  808. @findex mml-to-mime
  809. An @acronym{MML} message can be converted back to @acronym{MIME} by the
  810. @code{mml-to-mime} function.
  811. These functions are in certain senses ``lossy''---you will not get back
  812. an identical message if you run @code{mime-to-mml} and then
  813. @code{mml-to-mime}. Not only will trivial things like the order of the
  814. headers differ, but the contents of the headers may also be different.
  815. For instance, the original message may use base64 encoding on text,
  816. while @code{mml-to-mime} may decide to use quoted-printable encoding, and
  817. so on.
  818. In essence, however, these two functions should be the inverse of each
  819. other. The resulting contents of the message should remain equivalent,
  820. if not identical.
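The round trip described above can be sketched as a small command.
The name @code{my-mime-mml-round-trip} is hypothetical, and the sketch
assumes the current buffer holds a complete message that both functions
can process.
@lisp
(require 'mml)

;; Convert the message in the current buffer to MML and back to MIME.
;; The result should be equivalent to the original message, but it is
;; not guaranteed to be byte-for-byte identical.
(defun my-mime-mml-round-trip ()
  (interactive)
  (mime-to-mml)
  (mml-to-mime))
@end lisp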
  821. @node Flowed text
  822. @section Flowed text
  823. @cindex format=flowed
  824. The Emacs @acronym{MIME} library will respect the @code{use-hard-newlines}
  825. variable (@pxref{Hard and Soft Newlines, ,Hard and Soft Newlines,
  826. emacs, Emacs Manual}) when encoding a message, and the
  827. ``format=flowed'' Content-Type parameter when decoding a message.
  828. On encoding text, regardless of @code{use-hard-newlines}, lines
  829. terminated by soft newline characters are filled together and wrapped
  830. after the column decided by @code{fill-flowed-encode-column}.
  831. Quotation marks (matching @samp{^>* ?}) are respected. The variable
  832. controls how the text will look in a client that does not support
flowed text; the default is to wrap after 66 characters. If hard
  834. newline characters are not present in the buffer, no flow encoding
  835. occurs.
  836. You can customize the value of the @code{mml-enable-flowed} variable
  837. to enable or disable the flowed encoding usage when newline
  838. characters are present in the buffer.
  839. On decoding flowed text, lines with soft newline characters are filled
  840. together and wrapped after the column decided by
  841. @code{fill-flowed-display-column}. The default is to wrap after
  842. @code{fill-column}.
  843. @table @code
  844. @item mm-fill-flowed
  845. @vindex mm-fill-flowed
If non-@code{nil}, a format=flowed article will be displayed flowed.
  847. @end table
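A possible configuration that touches each of the variables mentioned
in this section is sketched below. The column value 72 is an arbitrary
example, not a recommendation.
@lisp
(require 'flow-fill)  ; fill-flowed-encode-column and friends
(require 'mml)        ; mml-enable-flowed
(require 'mm-decode)  ; mm-fill-flowed

;; Allow flowed encoding when hard newlines are present.
(setq mml-enable-flowed t)

;; Wrap outgoing flowed text after column 66 (the documented default).
(setq fill-flowed-encode-column 66)

;; Wrap incoming flowed text after column 72 (example value).
(setq fill-flowed-display-column 72)

;; Display format=flowed articles flowed.
(setq mm-fill-flowed t)
@end lisp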
  848. @node Interface Functions
  849. @chapter Interface Functions
  850. @cindex interface functions
  851. @cindex mail-parse
  852. The @code{mail-parse} library is an abstraction over the actual
  853. low-level libraries that are described in the next chapter.
  854. Standards change, and so programs have to change to fit in the new
  855. mold. For instance, RFC2045 describes a syntax for the
  856. @code{Content-Type} header that only allows @acronym{ASCII} characters in the
  857. parameter list. RFC2231 expands on RFC2045 syntax to provide a scheme
  858. for continuation headers and non-@acronym{ASCII} characters.
  859. The traditional way to deal with this is just to update the library
  860. functions to parse the new syntax. However, this is sometimes the wrong
  861. thing to do. In some instances it may be vital to be able to understand
  862. both the old syntax as well as the new syntax, and if there is only one
  863. library, one must choose between the old version of the library and the
  864. new version of the library.
  865. The Emacs @acronym{MIME} library takes a different tack. It defines a
  866. series of low-level libraries (@file{rfc2047.el}, @file{rfc2231.el}
and so on) that parse strictly according to the corresponding
  868. standard. However, normal programs would not use the functions
  869. provided by these libraries directly, but instead use the functions
  870. provided by the @code{mail-parse} library. The functions in this
  871. library are just aliases to the corresponding functions in the latest
  872. low-level libraries. Using this scheme, programs get a consistent
interface they can use, and library developers are free to
write code that handles new standards.
  875. The following functions are defined by this library:
  876. @table @code
  877. @item mail-header-parse-content-type
  878. @findex mail-header-parse-content-type
Parse a @code{Content-Type} header and return a list in the following
format:
  881. @lisp
  882. ("type/subtype"
  883. (attribute1 . value1)
  884. (attribute2 . value2)
  885. ...)
  886. @end lisp
  887. Here's an example:
  888. @example
  889. (mail-header-parse-content-type
  890. "image/gif; name=\"b980912.gif\"")
  891. @result{} ("image/gif" (name . "b980912.gif"))
  892. @end example
  893. @item mail-header-parse-content-disposition
  894. @findex mail-header-parse-content-disposition
Parse a @code{Content-Disposition} header and return a list in the same
format as the function above.
  897. @item mail-content-type-get
  898. @findex mail-content-type-get
Takes two parameters---a list in the format above, and an attribute.
  900. Returns the value of the attribute.
  901. @example
  902. (mail-content-type-get
  903. '("image/gif" (name . "b980912.gif")) 'name)
  904. @result{} "b980912.gif"
  905. @end example
  906. @item mail-header-encode-parameter
  907. @findex mail-header-encode-parameter
  908. Takes a parameter string and returns an encoded version of the string.
  909. This is used for parameters in headers like @code{Content-Type} and
  910. @code{Content-Disposition}.
  911. @item mail-header-remove-comments
  912. @findex mail-header-remove-comments
  913. Return a comment-free version of a header.
  914. @example
  915. (mail-header-remove-comments
  916. "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
  917. @result{} "Gnus/5.070027 "
  918. @end example
  919. @item mail-header-remove-whitespace
  920. @findex mail-header-remove-whitespace
  921. Remove linear white space from a header. Space inside quoted strings
  922. and comments is preserved.
  923. @example
  924. (mail-header-remove-whitespace
  925. "image/gif; name=\"Name with spaces\"")
  926. @result{} "image/gif;name=\"Name with spaces\""
  927. @end example
  928. @item mail-header-get-comment
  929. @findex mail-header-get-comment
  930. Return the last comment in a header.
  931. @example
  932. (mail-header-get-comment
  933. "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
  934. @result{} "Finnish Landrace"
  935. @end example
  936. @item mail-header-parse-address
  937. @findex mail-header-parse-address
  938. Parse an address and return a list containing the mailbox and the
  939. plaintext name.
  940. @example
  941. (mail-header-parse-address
  942. "Hrvoje Niksic <hniksic@@srce.hr>")
  943. @result{} ("hniksic@@srce.hr" . "Hrvoje Niksic")
  944. @end example
  945. @item mail-header-parse-addresses
  946. @findex mail-header-parse-addresses
Parse a string with a list of addresses and return a list of elements like
  948. the one described above.
  949. @example
  950. (mail-header-parse-addresses
  951. "Hrvoje Niksic <hniksic@@srce.hr>, Steinar Bang <sb@@metis.no>")
  952. @result{} (("hniksic@@srce.hr" . "Hrvoje Niksic")
  953. ("sb@@metis.no" . "Steinar Bang"))
  954. @end example
  955. @item mail-header-parse-date
  956. @findex mail-header-parse-date
  957. Parse a date string and return an Emacs time structure.
  958. @item mail-narrow-to-head
  959. @findex mail-narrow-to-head
  960. Narrow the buffer to the header section of the buffer. Point is placed
  961. at the beginning of the narrowed buffer.
  962. @item mail-header-narrow-to-field
  963. @findex mail-header-narrow-to-field
  964. Narrow the buffer to the header under point. Understands continuation
  965. headers.
  966. @item mail-header-fold-field
  967. @findex mail-header-fold-field
  968. Fold the header under point.
  969. @item mail-header-unfold-field
  970. @findex mail-header-unfold-field
  971. Unfold the header under point.
  972. @item mail-header-field-value
  973. @findex mail-header-field-value
  974. Return the value of the field under point.
  975. @item mail-encode-encoded-word-region
  976. @findex mail-encode-encoded-word-region
  977. Encode the non-@acronym{ASCII} words in the region. For instance,
  978. @samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}.
  979. @item mail-encode-encoded-word-buffer
  980. @findex mail-encode-encoded-word-buffer
  981. Encode the non-@acronym{ASCII} words in the current buffer. This function is
  982. meant to be called narrowed to the headers of a message.
  983. @item mail-encode-encoded-word-string
  984. @findex mail-encode-encoded-word-string
  985. Encode the words that need encoding in a string, and return the result.
  986. @example
  987. (mail-encode-encoded-word-string
  988. "This is naïve, baby")
  989. @result{} "This is =?iso-8859-1?q?na=EFve,?= baby"
  990. @end example
  991. @item mail-decode-encoded-word-region
  992. @findex mail-decode-encoded-word-region
  993. Decode the encoded words in the region.
  994. @item mail-decode-encoded-word-string
  995. @findex mail-decode-encoded-word-string
  996. Decode the encoded words in the string and return the result.
  997. @example
  998. (mail-decode-encoded-word-string
  999. "This is =?iso-8859-1?q?na=EFve,?= baby")
  1000. @result{} "This is naïve, baby"
  1001. @end example
  1002. @end table
  1003. Currently, @code{mail-parse} is an abstraction over @code{ietf-drums},
  1004. @code{rfc2047}, @code{rfc2045} and @code{rfc2231}. These are documented
  1005. in the subsequent sections.
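As a brief sketch of how the functions in this chapter combine, the
following parses a @code{Content-Type} value and then looks up one of its
parameters. The header text is made up for illustration.
@lisp
(require 'mail-parse)

;; Parse the header once, then pick out the pieces of interest.
(let ((ct (mail-header-parse-content-type
           "text/plain; charset=utf-8")))
  (list (car ct)                              ; the type/subtype string
        (mail-content-type-get ct 'charset))) ; the charset parameter
@end lisp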
  1006. @node Basic Functions
  1007. @chapter Basic Functions
  1008. This chapter describes the basic, ground-level functions for parsing and
handling. Covered here are parsing @code{From} lines, removing comments
  1010. from header lines, decoding encoded words, parsing date headers and so
  1011. on. High-level functionality is dealt with in the first chapter
  1012. (@pxref{Decoding and Viewing}).
  1013. @menu
  1014. * rfc2045:: Encoding @code{Content-Type} headers.
  1015. * rfc2231:: Parsing @code{Content-Type} headers.
  1016. * ietf-drums:: Handling mail headers defined by RFC822bis.
  1017. * rfc2047:: En/decoding encoded words in headers.
  1018. * time-date:: Functions for parsing dates and manipulating time.
  1019. * qp:: Quoted-Printable en/decoding.
  1020. * base64:: Base64 en/decoding.
  1021. * binhex:: Binhex decoding.
  1022. * uudecode:: Uuencode decoding.
  1023. * yenc:: Yenc decoding.
  1024. * rfc1843:: Decoding HZ-encoded text.
  1025. * mailcap:: How parts are displayed is specified by the @file{.mailcap} file
  1026. @end menu
  1027. @node rfc2045
  1028. @section rfc2045
  1029. RFC2045 is the ``main'' @acronym{MIME} document, and as such, one would
  1030. imagine that there would be a lot to implement. But there isn't, since
  1031. most of the implementation details are delegated to the subsequent
  1032. RFCs.
  1033. So @file{rfc2045.el} has only a single function:
  1034. @table @code
  1035. @item rfc2045-encode-string
  1036. @findex rfc2045-encode-string
  1037. Takes a parameter and a value and returns a @samp{PARAM=VALUE} string.
  1038. @var{value} will be quoted if there are non-safe characters in it.
  1039. @end table
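A short sketch of the quoting behavior follows. The parameter name and
values are made up for illustration. A value containing a space is
expected to come back quoted, while a plain token is left as it is.
@lisp
(require 'rfc2045)

;; The space makes the value unsafe, so it should be quoted.
(rfc2045-encode-string "name" "my file.txt")

;; A plain token needs no quoting.
(rfc2045-encode-string "name" "file.txt")
@end lisp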
  1040. @node rfc2231
  1041. @section rfc2231
  1042. RFC2231 defines a syntax for the @code{Content-Type} and
  1043. @code{Content-Disposition} headers. Its snappy name is @dfn{MIME
  1044. Parameter Value and Encoded Word Extensions: Character Sets, Languages,
  1045. and Continuations}.
  1046. In short, these headers look something like this:
  1047. @example
  1048. Content-Type: application/x-stuff;
  1049. title*0*=us-ascii'en'This%20is%20even%20more%20;
  1050. title*1*=%2A%2A%2Afun%2A%2A%2A%20;
  1051. title*2="isn't it!"
  1052. @end example
  1053. They usually aren't this bad, though.
  1054. The following functions are defined by this library:
  1055. @table @code
  1056. @item rfc2231-parse-string
  1057. @findex rfc2231-parse-string
  1058. Parse a @code{Content-Type} header and return a list describing its
  1059. elements.
  1060. @example
  1061. (rfc2231-parse-string
  1062. "application/x-stuff;
  1063. title*0*=us-ascii'en'This%20is%20even%20more%20;
  1064. title*1*=%2A%2A%2Afun%2A%2A%2A%20;
  1065. title*2=\"isn't it!\"")
  1066. @result{} ("application/x-stuff"
  1067. (title . "This is even more ***fun*** isn't it!"))
  1068. @end example
  1069. @item rfc2231-get-value
  1070. @findex rfc2231-get-value
Takes one of the lists in the format above and returns
  1072. the value of the specified attribute.
  1073. @item rfc2231-encode-string
  1074. @findex rfc2231-encode-string
Encode a parameter in headers like @code{Content-Type} and
  1076. @code{Content-Disposition}.
  1077. @end table
  1078. @node ietf-drums
  1079. @section ietf-drums
  1080. @dfn{drums} is an IETF working group that is working on the replacement
  1081. for RFC822.
  1082. The functions provided by this library include:
  1083. @table @code
  1084. @item ietf-drums-remove-comments
  1085. @findex ietf-drums-remove-comments
  1086. Remove the comments from the argument and return the results.
  1087. @item ietf-drums-remove-whitespace
  1088. @findex ietf-drums-remove-whitespace
  1089. Remove linear white space from the string and return the results.
  1090. Spaces inside quoted strings and comments are left untouched.
  1091. @item ietf-drums-get-comment
  1092. @findex ietf-drums-get-comment
Return the last comment from the string.
  1094. @item ietf-drums-parse-address
  1095. @findex ietf-drums-parse-address
  1096. Parse an address string and return a list that contains the mailbox and
  1097. the plain text name.
  1098. @item ietf-drums-parse-addresses
  1099. @findex ietf-drums-parse-addresses
  1100. Parse a string that contains any number of comma-separated addresses and
  1101. return a list that contains mailbox/plain text pairs.
  1102. @item ietf-drums-parse-date
  1103. @findex ietf-drums-parse-date
  1104. Parse a date string and return an Emacs time structure.
  1105. @item ietf-drums-narrow-to-header
  1106. @findex ietf-drums-narrow-to-header
  1107. Narrow the buffer to the header section of the current buffer.
  1108. @end table
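The comment-handling functions can be tried on the same header value
that the @code{mail-parse} examples use. This is only a sketch; the
variable name @code{agent} is arbitrary.
@lisp
(require 'ietf-drums)

(let ((agent
       "Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)"))
  (list
   ;; The header value with both parenthesized comments removed.
   (ietf-drums-remove-comments agent)
   ;; The text of the last comment only.
   (ietf-drums-get-comment agent)))
@end lisp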
  1109. @node rfc2047
  1110. @section rfc2047
  1111. RFC2047 (Message Header Extensions for Non-@acronym{ASCII} Text) specifies how
non-@acronym{ASCII} text in headers is to be encoded. This is actually rather
  1113. complicated, so a number of variables are necessary to tweak what this
  1114. library does.
  1115. The following variables are tweakable:
  1116. @table @code
  1117. @item rfc2047-header-encoding-alist
  1118. @vindex rfc2047-header-encoding-alist
  1119. This is an alist of header / encoding-type pairs. Its main purpose is
  1120. to prevent encoding of certain headers.
  1121. The keys can either be header regexps, or @code{t}.
  1122. The values can be @code{nil}, in which case the header(s) in question
  1123. won't be encoded, @code{mime}, which means that they will be encoded, or
  1124. @code{address-mime}, which means the header(s) will be encoded carefully
  1125. assuming they contain addresses.
  1126. @item rfc2047-charset-encoding-alist
  1127. @vindex rfc2047-charset-encoding-alist
  1128. RFC2047 specifies two forms of encoding---@code{Q} (a
  1129. Quoted-Printable-like encoding) and @code{B} (base64). This alist
  1130. specifies which charset should use which encoding.
  1131. @item rfc2047-encode-function-alist
  1132. @vindex rfc2047-encode-function-alist
  1133. This is an alist of encoding / function pairs. The encodings are
  1134. @code{Q}, @code{B} and @code{nil}.
  1135. @item rfc2047-encoded-word-regexp
  1136. @vindex rfc2047-encoded-word-regexp
  1137. When decoding words, this library looks for matches to this regexp.
  1138. @item rfc2047-encoded-word-regexp-loose
  1139. @vindex rfc2047-encoded-word-regexp-loose
This is a variant of @code{rfc2047-encoded-word-regexp} in which the
pattern for the Q encoding is loosened.
  1142. @item rfc2047-encode-encoded-words
  1143. @vindex rfc2047-encode-encoded-words
This boolean variable specifies whether encoded words
  1145. (e.g., @samp{=?us-ascii?q?hello?=}) should be encoded again.
  1146. @code{rfc2047-encoded-word-regexp} is used to look for such words.
  1147. @item rfc2047-allow-irregular-q-encoded-words
  1148. @vindex rfc2047-allow-irregular-q-encoded-words
This boolean variable specifies whether irregular Q encoded words
  1150. (e.g., @samp{=?us-ascii?q?hello??=}) should be decoded. If it is
  1151. non-@code{nil}, @code{rfc2047-encoded-word-regexp-loose} is used instead
  1152. of @code{rfc2047-encoded-word-regexp} to look for encoded words.
  1153. @end table
Those were the variables, and these are the functions:
  1155. @table @code
  1156. @item rfc2047-narrow-to-field
  1157. @findex rfc2047-narrow-to-field
  1158. Narrow the buffer to the header on the current line.
  1159. @item rfc2047-encode-message-header
  1160. @findex rfc2047-encode-message-header
  1161. Should be called narrowed to the header of a message. Encodes according
  1162. to @code{rfc2047-header-encoding-alist}.
  1163. @item rfc2047-encode-region
  1164. @findex rfc2047-encode-region
  1165. Encodes all encodable words in the region specified.
  1166. @item rfc2047-encode-string
  1167. @findex rfc2047-encode-string
  1168. Encode a string and return the results.
  1169. @item rfc2047-decode-region
  1170. @findex rfc2047-decode-region
  1171. Decode the encoded words in the region.
  1172. @item rfc2047-decode-string
  1173. @findex rfc2047-decode-string
  1174. Decode a string and return the results.
  1175. @item rfc2047-encode-parameter
  1176. @findex rfc2047-encode-parameter
Encode a parameter in the RFC2047-like style. This is a substitute
for the @code{rfc2231-encode-string} function, which follows the standard
that many mailers don't support. @xref{rfc2231}.
  1180. @end table
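The string functions can be exercised directly, as in the sketch below.
The sample text is borrowed from the @code{mail-parse} examples. The
charset chosen when encoding depends on the Emacs configuration, so no
encoded result is shown here.
@lisp
(require 'rfc2047)

;; Decode a header value containing one Q-encoded word.
(rfc2047-decode-string "This is =?iso-8859-1?q?na=EFve,?= baby")

;; Encode a string and decode it again.  "\u00ef" is the
;; i-with-diaeresis character.
(let ((text "This is na\u00efve, baby"))
  (rfc2047-decode-string (rfc2047-encode-string text)))
@end lisp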
  1181. @node time-date
  1182. @section time-date
  1183. While not really a part of the @acronym{MIME} library, it is convenient to
  1184. document this library here. It deals with parsing @code{Date} headers
  1185. and manipulating time. (Not by using tesseracts, though, I'm sorry to
  1186. say.)
  1187. These functions convert between five formats: A date string, an Emacs
  1188. time structure, a decoded time list, a second number, and a day number.
  1189. Here's a bunch of time/date/second/day examples:
  1190. @example
  1191. (parse-time-string "Sat Sep 12 12:21:54 1998 +0200")
  1192. @result{} (54 21 12 12 9 1998 6 nil 7200)
  1193. (date-to-time "Sat Sep 12 12:21:54 1998 +0200")
  1194. @result{} (13818 19266)
  1195. (float-time '(13818 19266))
  1196. @result{} 905595714.0
  1197. (seconds-to-time 905595714.0)
  1198. @result{} (13818 19266 0 0)
  1199. (time-to-days '(13818 19266))
  1200. @result{} 729644
  1201. (days-to-time 729644)
  1202. @result{} (961933 512)
  1203. (time-since '(13818 19266))
  1204. @result{} (6797 9607 984839 247000)
  1205. (time-less-p '(13818 19266) '(13818 19145))
  1206. @result{} nil
  1207. (subtract-time '(13818 19266) '(13818 19145))
  1208. @result{} (0 121)
  1209. (days-between "Sat Sep 12 12:21:54 1998 +0200"
"Mon Sep 07 12:21:54 1998 +0200")
  1211. @result{} 5
  1212. (date-leap-year-p 2000)
  1213. @result{} t
  1214. (time-to-day-in-year '(13818 19266))
  1215. @result{} 255
  1216. (time-to-number-of-days
  1217. (time-since
  1218. (date-to-time "Mon, 01 Jan 2001 02:22:26 GMT")))
  1219. @result{} 4314.095589286675
  1220. @end example
  1221. And finally, we have @code{safe-date-to-time}, which does the same as
  1222. @code{date-to-time}, but returns a zero time if the date is
  1223. syntactically malformed.
  1224. The five data representations used are the following:
  1225. @table @var
  1226. @item date
  1227. An RFC822 (or similar) date string. For instance: @code{"Sat Sep 12
  1228. 12:21:54 1998 +0200"}.
  1229. @item time
An internal Emacs time. For instance: @code{(13818 19266 0 0)}.
  1231. @item seconds
  1232. A floating point representation of the internal Emacs time. For
  1233. instance: @code{905595714.0}.
  1234. @item days
  1235. An integer number representing the number of days since 00000101. For
  1236. instance: @code{729644}.
  1237. @item decoded time
A decoded time list. For instance: @code{(54 21 12 12 9 1998 6 nil
7200)}.
  1240. @end table
  1241. All the examples above represent the same moment.
  1242. These are the functions available:
  1243. @table @code
  1244. @item date-to-time
  1245. Take a date and return a time.
  1246. @item float-time
  1247. Take a time and return seconds. (This is a built-in function.)
  1248. @item seconds-to-time
  1249. Take seconds and return a time.
  1250. @item time-to-days
  1251. Take a time and return days.
  1252. @item days-to-time
  1253. Take days and return a time.
  1254. @item date-to-day
  1255. Take a date and return days.
  1256. @item time-to-number-of-days
Take a time and return the number of days that it represents.
  1258. @item safe-date-to-time
  1259. Take a date and return a time. If the date is not syntactically valid,
  1260. return a ``zero'' time.
  1261. @item time-less-p
  1262. Take two times and say whether the first time is less (i.e., earlier)
  1263. than the second time.
  1264. @item time-since
  1265. Take a time and return a time saying how long it was since that time.
  1266. @item subtract-time
  1267. Take two times and subtract the second from the first. I.e., return
  1268. the time between the two times.
  1269. @item days-between
Take two dates and return the number of days between them.
  1271. @item date-leap-year-p
  1272. Take a year number and say whether it's a leap year.
  1273. @item time-to-day-in-year
  1274. Take a time and return the day number within the year that the time is
  1275. in.
  1276. @end table
  1277. @node qp
  1278. @section qp
  1279. This library deals with decoding and encoding Quoted-Printable text.
  1280. Very briefly explained, qp encoding means translating all 8-bit
  1281. characters (and lots of control characters) into things that look like
  1282. @samp{=EF}; that is, an equal sign followed by the byte encoded as a hex
  1283. string.
  1284. The following functions are defined by the library:
  1285. @table @code
  1286. @item quoted-printable-decode-region
  1287. @findex quoted-printable-decode-region
  1288. QP-decode all the encoded text in the specified region.
  1289. @item quoted-printable-decode-string
  1290. @findex quoted-printable-decode-string
  1291. Decode the QP-encoded text in a string and return the results.
  1292. @item quoted-printable-encode-region
  1293. @findex quoted-printable-encode-region
  1294. QP-encode all the encodable characters in the specified region. The third
  1295. optional parameter @var{fold} specifies whether to fold long lines.
  1296. (Long here means 72.)
  1297. @item quoted-printable-encode-string
  1298. @findex quoted-printable-encode-string
  1299. QP-encode all the encodable characters in a string and return the
  1300. results.
  1301. @end table
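A minimal sketch of the string functions follows. Decoding yields raw
bytes, so the example converts them to characters in a separate step
with @code{decode-coding-string}; the choice of ISO-8859-1 is an
assumption about the sample data.
@lisp
(require 'qp)

;; The equal sign itself must be encoded; this gives "a=3Db".
(quoted-printable-encode-string "a=b")

;; Decode to raw bytes, then interpret the bytes as ISO-8859-1 text.
(decode-coding-string
 (quoted-printable-decode-string "na=EFve")
 'iso-8859-1)
@end lisp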
  1302. @node base64
  1303. @section base64
  1304. @cindex base64
  1305. Base64 is an encoding that encodes three bytes into four characters,
  1306. thereby increasing the size by about 33%. The alphabet used for
  1307. encoding is very resistant to mangling during transit.
  1308. The following functions are defined by this library:
  1309. @table @code
  1310. @item base64-encode-region
  1311. @findex base64-encode-region
  1312. base64 encode the selected region. Return the length of the encoded
  1313. text. Optional third argument @var{no-line-break} means do not break
  1314. long lines into shorter lines.
  1315. @item base64-encode-string
  1316. @findex base64-encode-string
  1317. base64 encode a string and return the result.
  1318. @item base64-decode-region
  1319. @findex base64-decode-region
  1320. base64 decode the selected region. Return the length of the decoded
  1321. text. If the region can't be decoded, return @code{nil} and don't
  1322. modify the buffer.
  1323. @item base64-decode-string
  1324. @findex base64-decode-string
  1325. base64 decode a string and return the result. If the string can't be
  1326. decoded, @code{nil} is returned.
  1327. @end table
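
The three-bytes-to-four-characters mapping can be seen in a short sketch
like the one below. The input strings are arbitrary examples chosen for
this illustration. The encoding functions expect bytes, so text
containing non-@acronym{ASCII} characters would first have to be
converted with something like @code{encode-coding-string}.

@example
;; Three bytes become exactly four characters.
(base64-encode-string "abc")
     @result{} "YWJj"

;; Input whose length is not a multiple of three is padded with "=".
(base64-encode-string "Hello, world!")
     @result{} "SGVsbG8sIHdvcmxkIQ=="

(base64-decode-string "SGVsbG8sIHdvcmxkIQ==")
     @result{} "Hello, world!"

;; The region variants work in place on buffer text.
(with-temp-buffer
  (insert "abc")
  (base64-encode-region (point-min) (point-max))
  (buffer-string))
     @result{} "YWJj"
@end example
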
  1328. @node binhex
  1329. @section binhex
  1330. @cindex binhex
  1331. @cindex Apple
  1332. @cindex Macintosh
  1333. @code{binhex} is an encoding that originated in Macintosh environments.
  1334. The following function is supplied to decode it:
  1335. @table @code
  1336. @item binhex-decode-region
  1337. @findex binhex-decode-region
  1338. Decode the encoded text in the region. If given a third parameter, only
  1339. decode the @code{binhex} header and return the filename.
  1340. @end table
  1341. @node uudecode
  1342. @section uudecode
  1343. @cindex uuencode
  1344. @cindex uudecode
  1345. @code{uuencode} is probably still the most popular encoding of binaries
  1346. used on Usenet, although @code{base64} rules the mail world.
  1347. The following function is supplied by this package:
  1348. @table @code
  1349. @item uudecode-decode-region
  1350. @findex uudecode-decode-region
  1351. Decode the text in the region.
  1352. @end table
  1353. @node yenc
  1354. @section yenc
  1355. @cindex yenc
  1356. @code{yenc} is used for encoding binaries on Usenet. The following
  1357. function is supplied by this package:
  1358. @table @code
  1359. @item yenc-decode-region
  1360. @findex yenc-decode-region
  1361. Decode the encoded text in the region.
  1362. @end table
  1363. @node rfc1843
  1364. @section rfc1843
  1365. @cindex rfc1843
  1366. @cindex HZ
  1367. @cindex Chinese
  1368. RFC1843 deals with mixing Chinese and @acronym{ASCII} characters in messages. In
  1369. essence, RFC1843 switches between @acronym{ASCII} and Chinese by doing this:
  1370. @example
  1371. This sentence is in @acronym{ASCII}.
  1372. The next sentence is in GB.~@{<:Ky2;S@{#,NpJ)l6HK!#~@}Bye.
  1373. @end example
  1374. Simple enough, and widely used in China.
  1375. The following functions are available to handle this encoding:
  1376. @table @code
  1377. @item rfc1843-decode-region
  1378. Decode HZ-encoded text in the region.
  1379. @item rfc1843-decode-string
  1380. Decode an HZ-encoded string and return the result.
  1381. @end table
  1382. @node mailcap
  1383. @section mailcap
  1384. The @file{~/.mailcap} file is parsed by most @acronym{MIME}-aware message
  1385. handlers and describes how elements are supposed to be displayed.
  1386. Here's an example file:
  1387. @example
  1388. image/*; gimp -8 %s
  1389. audio/wav; wavplayer %s
  1390. application/msword; catdoc %s ; copiousoutput ; nametemplate=%s.doc
  1391. @end example
  1392. This says that all image files should be displayed with @code{gimp},
  1393. that WAVE audio files should be played by @code{wavplayer}, and that
  1394. MS-WORD files should be inlined by @code{catdoc}.
  1395. The @code{mailcap} library parses this file, and provides functions for
  1396. matching types.
  1397. @table @code
  1398. @item mailcap-mime-data
  1399. @vindex mailcap-mime-data
  1400. This variable is an alist of alists containing backup viewing rules.
  1401. @item mailcap-user-mime-data
  1402. @vindex mailcap-user-mime-data
  1403. A customizable list of viewers that take precedence over
  1404. @code{mailcap-mime-data}.
  1405. @end table
  1406. Interface functions:
  1407. @table @code
  1408. @item mailcap-parse-mailcaps
  1409. @findex mailcap-parse-mailcaps
  1410. Parse the @file{~/.mailcap} file.
  1411. @item mailcap-mime-info
  1412. Takes a @acronym{MIME} type as its argument and returns the matching viewer.
  1413. @end table
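
A minimal sketch of how these two interface functions might be combined
is shown below. It assumes the @code{mailcap} feature can be loaded with
@code{require}, and @samp{image/png} is only an example type. No result
is shown because it depends on the mailcap entries and viewers available
on the local system.

@example
(require 'mailcap)

;; Parse the mailcap file(s) before asking for viewers.
(mailcap-parse-mailcaps)

;; Look up a viewer for a MIME type.  The value depends on the local
;; setup; expect nil when no viewer matches.
(mailcap-mime-info "image/png")
@end example
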
  1414. @node Standards
  1415. @chapter Standards
  1416. The Emacs @acronym{MIME} library implements handling of various elements
  1417. according to a (somewhat) large number of RFCs, drafts and standards
  1418. documents. This chapter lists the relevant ones. They can all be
  1419. fetched from @uref{http://quimby.gnus.org/notes/}.
  1420. @table @dfn
  1421. @item RFC822
  1422. @itemx STD11
  1423. Standard for the Format of ARPA Internet Text Messages.
  1424. @item RFC1036
  1425. Standard for Interchange of USENET Messages
  1426. @item RFC2045
  1427. Format of Internet Message Bodies
  1428. @item RFC2046
  1429. Media Types
  1430. @item RFC2047
  1431. Message Header Extensions for Non-@acronym{ASCII} Text
  1432. @item RFC2048
  1433. Registration Procedures
  1434. @item RFC2049
  1435. Conformance Criteria and Examples
  1436. @item RFC2231
  1437. @acronym{MIME} Parameter Value and Encoded Word Extensions: Character Sets,
  1438. Languages, and Continuations
  1439. @item RFC1843
  1440. HZ---A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and
  1441. @acronym{ASCII} characters
  1442. @item draft-ietf-drums-msg-fmt-05.txt
  1443. Draft for the successor of RFC822
  1444. @item RFC2112
  1445. The @acronym{MIME} Multipart/Related Content-type
  1446. @item RFC1892
  1447. The Multipart/Report Content Type for the Reporting of Mail System
  1448. Administrative Messages
  1449. @item RFC2183
  1450. Communicating Presentation Information in Internet Messages: The
  1451. Content-Disposition Header Field
  1452. @item RFC2646
  1453. Documentation of the text/plain format parameter for flowed text.
  1454. @end table
  1455. @node GNU Free Documentation License
  1456. @chapter GNU Free Documentation License
  1457. @include doclicense.texi
  1458. @node Index
  1459. @chapter Index
  1460. @printindex cp
  1461. @bye
  1462. @c Local Variables:
  1463. @c mode: texinfo
  1464. @c coding: utf-8
  1465. @c End: