|
- \input texinfo
- @setfilename ../../info/url.info
- @settitle URL Programmer's Manual
- @include docstyle.texi
- @iftex
- @c @finalout
- @end iftex
- @c @setchapternewpage odd
- @c @smallbook
- @tex
- \overfullrule=0pt
- %\global\baselineskip 30pt % for printing in double space
- @end tex
- @dircategory Emacs lisp libraries
- @direntry
- * URL: (url). URL loading package.
- @end direntry
- @copying
- This is the manual for the @code{url} Emacs Lisp library.
- Copyright @copyright{} 1993--1999, 2002, 2004--2017 Free Software
- Foundation, Inc.
- @quotation
- Permission is granted to copy, distribute and/or modify this document
- under the terms of the GNU Free Documentation License, Version 1.3 or
- any later version published by the Free Software Foundation; with no
- Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
- and with the Back-Cover Texts as in (a) below. A copy of the license
- is included in the section entitled ``GNU Free Documentation License''.
- (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
- modify this GNU manual.''
- @end quotation
- @end copying
- @c
- @titlepage
- @title URL Programmer's Manual
- @subtitle First Edition, URL Version 2.0
- @author William M. Perry @email{wmperry@@gnu.org}
- @author David Love @email{fx@@gnu.org}
- @page
- @vskip 0pt plus 1filll
- @insertcopying
- @end titlepage
- @contents
- @node Top
- @top URL
- @ifnottex
- @insertcopying
- @end ifnottex
- @menu
- * Introduction:: About the @code{url} library.
- * URI Parsing:: Parsing (and unparsing) URIs.
- * Retrieving URLs:: How to use this package to retrieve a URL.
- * Supported URL Types:: Descriptions of URL types currently supported.
- * General Facilities:: URLs can be cached, accessed via a gateway
- and tracked in a history list.
- * Customization:: Variables you can alter.
- * GNU Free Documentation License:: The license for this documentation.
- * Function Index::
- * Variable Index::
- * Concept Index::
- @end menu
- @node Introduction
- @chapter Introduction
- @cindex URL
- @cindex URI
- @cindex uniform resource identifier
- @cindex uniform resource locator
- A @dfn{Uniform Resource Identifier} (URI) is a specially-formatted
- name, such as an Internet address, that identifies an abstract or physical
- resource. The format of URIs is described in RFC 3986, which updates
- and replaces the earlier RFCs 2732, 2396, 1808, and 1738. A
- @dfn{Uniform Resource Locator} (URL) is an older but still-common
- term, which basically refers to a URI corresponding to a resource that
- can be accessed (usually over a network) in a specific way.
- Here are some examples of URIs (taken from RFC 3986):
- @example
- ftp://ftp.is.co.za/rfc/rfc1808.txt
- http://www.ietf.org/rfc/rfc2396.txt
- ldap://[2001:db8::7]/c=GB?objectClass?one
- mailto:John.Doe@@example.com
- news:comp.infosystems.www.servers.unix
- tel:+1-816-555-1212
- telnet://192.0.2.16:80/
- urn:oasis:names:specification:docbook:dtd:xml:4.1.2
- @end example
- This manual describes the @code{url} library, an Emacs Lisp library
- for parsing URIs and retrieving the resources to which they refer.
- (The library is so-named for historical reasons; nowadays, the ``URI''
- terminology is regarded as the more general one, and ``URL'' is
- technically obsolete despite its widespread vernacular usage.)
- @node URI Parsing
- @chapter URI Parsing
- A URI consists of several @dfn{components}, each having a different
- meaning. For example, the URI
- @example
- http://www.gnu.org/software/emacs/
- @end example
- @noindent
- specifies the scheme component @samp{http}, the hostname component
- @samp{www.gnu.org}, and the path component @samp{/software/emacs/}.
- @cindex parsed URIs
- The format of URIs is specified by RFC 3986. The @code{url} library
- provides the Lisp function @code{url-generic-parse-url}, a (mostly)
- standard-compliant URI parser, as well as function
- @code{url-recreate-url}, which converts a parsed URI back into a URI
- string.
- @defun url-generic-parse-url uri-string
- This function returns a parsed version of the string @var{uri-string}.
- @end defun
- @defun url-recreate-url uri-obj
- @cindex unparsing URLs
- Given a parsed URI, this function returns the corresponding URI string.
- @end defun
- @cindex parsed URI
- The return value of @code{url-generic-parse-url}, and the argument
- expected by @code{url-recreate-url}, is a @dfn{parsed URI}: a CL
- structure whose slots hold the various components of the URI@.
- @xref{Top,the CL Manual,,cl,GNU Emacs Common Lisp Emulation}, for
- details about CL structures. Most of the other functions in the
- @code{url} library act on parsed URIs.
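- As a brief illustration of the two functions above, the following
- sketch parses the example URI from the start of this chapter, reads a
- few components, and converts the structure back into a string. The
- helper name @code{my-url-summary} is hypothetical and is not part of
- the library.
- @example
- (require 'url-parse)
-
- (defun my-url-summary (uri-string)
-   "Return the scheme, host, path and round-tripped form of URI-STRING.
- `my-url-summary' is a hypothetical helper, not part of the library."
-   (let ((parsed (url-generic-parse-url uri-string)))
-     (list (url-type parsed)
-           (url-host parsed)
-           (url-filename parsed)
-           (url-recreate-url parsed))))
-
- (my-url-summary "http://www.gnu.org/software/emacs/")
- ;; => ("http" "www.gnu.org" "/software/emacs/"
- ;;     "http://www.gnu.org/software/emacs/")
- @end example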
- @menu
- * Parsed URIs:: Format of parsed URI structures.
- * URI Encoding:: Non-@acronym{ASCII} characters in URIs.
- @end menu
- @node Parsed URIs
- @section Parsed URI structures
- Each parsed URI structure contains the following slots:
- @table @code
- @item type
- The URI scheme (a string, e.g., @code{http}). @xref{Supported URL
- Types}, for a list of schemes that the @code{url} library knows how to
- process. This slot can also be @code{nil}, if the URI is not fully
- specified.
- @item user
- The user name (a string), or @code{nil}.
- @item password
- The user password (a string), or @code{nil}. The use of this URI
- component is strongly discouraged; nowadays, passwords are transmitted
- by other means, not as part of a URI.
- @item host
- The host name (a string), or @code{nil}. If present, this is
- typically a domain name or IP address.
- @item port
- The port number (an integer), or @code{nil}. Omitting this component
- usually means to use the ``standard'' port associated with the URI
- scheme.
- @item filename
- The combination of the ``path'' and ``query'' components of the URI (a
- string), or @code{nil}. If the query component is present, it is the
- substring following the first @samp{?} character, and the path
- component is the substring before the @samp{?}. The meaning of these
- components is scheme-dependent; they do not necessarily refer to a
- file on a disk.
- @item target
- The fragment component (a string), or @code{nil}. The fragment
- component specifies a ``secondary resource'', such as a section of a
- webpage.
- @item fullness
- This is @code{t} if the URI is fully specified, i.e., the
- hierarchical components of the URI (the hostname and/or username
- and/or password) are preceded by @samp{//}.
- @end table
- @findex url-type
- @findex url-user
- @findex url-password
- @findex url-host
- @findex url-port
- @findex url-filename
- @findex url-target
- @findex url-attributes
- @findex url-fullness
- These slots have accessors named @code{url-@var{part}}, where
- @var{part} is the slot name. For example, the accessor for the
- @code{host} slot is the function @code{url-host}. The @code{url-port}
- accessor returns the default port for the URI scheme if the parsed
- URI's @code{port} slot is @code{nil}.
- The slots can be set using @code{setf}. For example:
- @example
- (setf (url-port url) 80)
- @end example
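- The following sketch shows the accessors and @code{setf} working
- together on one parsed URI. The host @samp{example.org} and the
- function name @code{my-url-port-demo} are made up for illustration.
- @example
- (require 'url-parse)
-
- (defun my-url-port-demo ()
-   "Hypothetical demo: read several slots, then change the port."
-   (let ((url (url-generic-parse-url
-               "http://example.org/docs?lang=en#intro")))
-     (list (url-port url)      ; 80, the default port for http
-           (url-filename url)  ; "/docs?lang=en", path and query together
-           (url-target url)    ; "intro", the fragment
-           (progn (setf (url-port url) 8080)
-                  (url-port url)))))  ; 8080, the value just stored
-
- (my-url-port-demo)
- ;; => (80 "/docs?lang=en" "intro" 8080)
- @end example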
- @node URI Encoding
- @section URI Encoding
- @cindex percent encoding
- The @code{url-generic-parse-url} parser does not obey RFC 3986 in
- one respect: it allows non-@acronym{ASCII} characters in URI strings.
- Strictly speaking, RFC 3986 compatible URIs may only consist of
- @acronym{ASCII} characters; non-@acronym{ASCII} characters are
- represented by converting them to UTF-8 byte sequences, and performing
- @dfn{percent encoding} on the bytes. For example, the o-umlaut
- character is converted to the UTF-8 byte sequence @samp{\xC3\xB6},
- then percent encoded to @samp{%C3%B6}. (Certain ``reserved''
- @acronym{ASCII} characters must also be percent encoded when they
- appear in URI components.)
- The function @code{url-encode-url} can be used to convert a URI
- string containing arbitrary characters to one that is properly
- percent-encoded in accordance with RFC 3986.
- @defun url-encode-url url-string
- This function returns a properly URI-encoded version of
- @var{url-string}. It also performs @dfn{URI normalization},
- e.g., converting the scheme component to lowercase if it was
- previously uppercase.
- @end defun
- To convert between a string containing arbitrary characters and a
- percent-encoded all-@acronym{ASCII} string, use the functions
- @code{url-hexify-string} and @code{url-unhex-string}:
- @defun url-hexify-string string &optional allowed-chars
- This function performs percent-encoding on @var{string}, and returns
- the result.
- If @var{string} is multibyte, it is first converted to a UTF-8 byte
- string. Each byte corresponding to an allowed character is left
- as-is, while all other bytes are converted to a three-character
- sequence: @samp{%} followed by two upper-case hex digits.
- @vindex url-unreserved-chars
- @cindex unreserved characters
- The allowed characters are specified by @var{allowed-chars}. If this
- argument is @code{nil}, the allowed characters are those specified as
- @dfn{unreserved characters} by RFC 3986 (see the variable
- @code{url-unreserved-chars}). Otherwise, @var{allowed-chars} should
- be a vector whose @var{n}-th element is non-@code{nil} if character
- @var{n} is allowed.
- @end defun
- @defun url-unhex-string string &optional allow-newlines
- This function replaces percent-encoding sequences in @var{string} with
- their character equivalents, and returns the resulting string.
- If @var{allow-newlines} is non-@code{nil}, it allows the decoding of
- carriage returns and line feeds, which are normally forbidden in URIs.
- @end defun
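- A short sketch of the two conversion functions follows. It assumes
- the default set of allowed characters, and it writes the o-umlaut
- character with the escape @samp{\u00f6} so that the example itself
- stays @acronym{ASCII}. Note that @code{url-unhex-string} yields raw
- bytes, so the last form decodes them as UTF-8 explicitly.
- @example
- (require 'url-util)
-
- ;; Space and slash are not unreserved, so both are encoded.
- (url-hexify-string "a b/c")       ; => "a%20b%2Fc"
- (url-unhex-string "a%20b%2Fc")    ; => "a b/c"
-
- ;; A multibyte string is converted to UTF-8 bytes first.
- (url-hexify-string "\u00f6")      ; => "%C3%B6"
-
- ;; Decode the resulting bytes to recover the original character.
- (decode-coding-string (url-unhex-string "%C3%B6") 'utf-8)
- ;; => "\u00f6"
- @end example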
- @node Retrieving URLs
- @chapter Retrieving URLs
- The @code{url} library defines the following three functions for
- retrieving the data specified by a URL@. The actual retrieval protocol
- depends on the URL's URI scheme, and is performed by lower-level
- scheme-specific functions. (Those lower-level functions are not
- documented here, and generally should not be called directly.)
- In each of these functions, the @var{url} argument can be either a
- string or a parsed URL structure. If it is a string, that string is
- passed through @code{url-encode-url} before using it, to ensure that
- it is properly URI-encoded (@pxref{URI Encoding}).
- @defun url-retrieve-synchronously url &optional silent no-cookies timeout
- This function synchronously retrieves the data specified by @var{url},
- and returns a buffer containing the data. The return value is
- @code{nil} if there is no data associated with the URL (as is the case
- for @code{dired}, @code{info}, and @code{mailto} URLs).
- If the optional argument @var{silent} is non-@code{nil}, progress
- messages are suppressed. If the optional argument @var{no-cookies} is
- non-@code{nil}, cookies are not stored or sent. If the optional
- argument @var{timeout} is non-@code{nil}, it should be a number that
- says (in seconds) how long to wait for a response before giving up.
- @end defun
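- As a minimal sketch of synchronous retrieval, the hypothetical helper
- below fetches an HTTP URL and returns only the body. It assumes that
- the headers end at the first blank line of the returned buffer; the
- default timeout of 10 seconds is an arbitrary choice.
- @example
- (require 'url)
-
- (defun my-url-fetch-body (url &optional timeout)
-   "Return the body retrieved from URL as a string, or nil.
- `my-url-fetch-body' is a hypothetical helper, not part of the library."
-   ;; SILENT is t, NO-COOKIES is nil, TIMEOUT defaults to 10 seconds.
-   (let ((buffer (url-retrieve-synchronously url t nil (or timeout 10))))
-     (when buffer
-       (unwind-protect
-           (with-current-buffer buffer
-             (goto-char (point-min))
-             ;; Skip the headers, assumed to end at the first blank line.
-             (when (re-search-forward "^\r?\n" nil t)
-               (buffer-substring-no-properties (point) (point-max))))
-         ;; The caller owns the buffer, so dispose of it when done.
-         (kill-buffer buffer)))))
- @end example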
- @defun url-retrieve url callback &optional cbargs silent no-cookies
- This function retrieves @var{url} asynchronously, calling the function
- @var{callback} when the object has been completely retrieved. The
- return value is the buffer into which the data will be inserted, or
- @code{nil} if the process has already completed.
- The callback function is called this way:
- @example
- (apply @var{callback} @var{status} @var{cbargs})
- @end example
- @noindent
- where @var{status} is a plist representing what happened during the
- retrieval, with most recent events first, or an empty list if no
- events have occurred. Each pair in the plist is one of:
- @table @code
- @item (:redirect @var{redirected-to})
- This means that the request was redirected to the URL
- @var{redirected-to}.
- @item (:error (@var{error-symbol} . @var{data}))
- This means that an error occurred. If so desired, the error can be
- signaled with @code{(signal @var{error-symbol} @var{data})}.
- @end table
- When the callback function is called, the current buffer is the one
- containing the retrieved data (if any). The buffer also contains any
- MIME headers associated with the data retrieval.
- If the optional argument @var{silent} is non-@code{nil}, progress
- messages are suppressed. If the optional argument @var{no-cookies} is
- non-@code{nil}, cookies are not stored or sent.
- @end defun
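- The calling convention above can be sketched as follows. Both
- function names are hypothetical. The callback inspects the
- @var{status} plist for the @code{:error} and @code{:redirect} keys
- described above, and receives one extra argument through
- @var{cbargs}.
- @example
- (require 'url)
-
- (defun my-url-report (status label)
-   "Report the outcome of retrieving LABEL, given the STATUS plist.
- Hypothetical callback; it returns the message it displays."
-   (let ((err (plist-get status :error))
-         (redirect (plist-get status :redirect)))
-     (cond (err
-            (message "%s failed: %S" label err))
-           (redirect
-            (message "%s: redirected to %s, %d characters"
-                     label redirect (buffer-size)))
-           (t
-            ;; The current buffer holds the headers and the data.
-            (message "%s: %d characters" label (buffer-size))))))
-
- (defun my-url-fetch-async (url label)
-   "Start retrieving URL; `my-url-report' runs when it completes."
-   (url-retrieve url #'my-url-report (list label) t))
- @end example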
- @defun url-queue-retrieve url callback &optional cbargs silent no-cookies
- This function acts like @code{url-retrieve}, but with limits on the
- number of concurrently-running network processes. The option
- @code{url-queue-parallel-processes} controls the number of concurrent
- processes, and the option @code{url-queue-timeout} sets a timeout in
- seconds.
- To use this function, you must @code{(require 'url-queue)}.
- @end defun
- @vindex url-queue-parallel-processes
- @defopt url-queue-parallel-processes
- The value of this option is an integer specifying the maximum number
- of concurrent @code{url-queue-retrieve} network processes. If the
- number of @code{url-queue-retrieve} calls is larger than this number,
- later ones are queued until earlier ones are finished.
- @end defopt
- @vindex url-queue-timeout
- @defopt url-queue-timeout
- The value of this option is a number specifying the maximum lifetime
- of a @code{url-queue-retrieve} network process, once it is started.
- If a process is not finished by then, it is killed and removed from
- the queue.
- @end defopt
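- A possible way to combine the queue function with its two options is
- sketched below. The helper name @code{my-url-queue-fetch-all} and the
- particular limits are illustrative assumptions.
- @example
- (require 'url-queue)
-
- ;; Allow at most two simultaneous connections, 10 seconds each.
- (setq url-queue-parallel-processes 2
-       url-queue-timeout 10)
-
- (defun my-url-queue-fetch-all (urls callback)
-   "Queue retrieval of each URL in URLS, calling CALLBACK for each.
- `my-url-queue-fetch-all' is a hypothetical helper."
-   (dolist (url urls)
-     ;; No extra callback arguments; suppress progress messages.
-     (url-queue-retrieve url callback nil t)))
- @end example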
- @node Supported URL Types
- @chapter Supported URL Types
- This chapter describes functions and variables affecting URL retrieval
- for specific schemes.
- @menu
- * http/https:: Hypertext Transfer Protocol.
- * file/ftp:: Local files and FTP archives.
- * info:: Emacs "Info" pages.
- * mailto:: Sending email.
- * news/nntp/snews:: Usenet news.
- * rlogin/telnet/tn3270:: Remote host connectivity.
- * irc:: Internet Relay Chat.
- * data:: Embedded data URLs.
- * nfs:: Networked File System.
- * ldap:: Lightweight Directory Access Protocol.
- * man:: Unix man pages.
- * Tramp:: Schemes supported via Tramp.
- @end menu
- @node http/https
- @section @code{http} and @code{https}
- The @code{http} scheme refers to the Hypertext Transfer Protocol. The
- @code{url} library supports HTTP version 1.1, specified in RFC 2616.
- Its default port is 80.
- The @code{https} scheme is a secure version of @code{http}, with
- transmission via SSL@. It is defined in RFC 2818, and its default port
- is 443. When using @code{https}, the @code{url} library performs SSL
- encryption via the @code{ssl} library, by forcing the @code{ssl}
- gateway method to be used. @xref{Gateways in general}.
- @defopt url-honor-refresh-requests
- If this option is @code{t} (the default), the @code{url} library
- honors the HTTP @samp{Refresh} header, which is used by servers to
- direct clients to reload documents from the same URL or a different
- one. If the value is @code{nil}, the @samp{Refresh} header is
- ignored; any other value means to ask the user on each request.
- @end defopt
- @menu
- * Cookies::
- * HTTP language/coding::
- * HTTP URL Options::
- * Dealing with HTTP documents::
- @end menu
- @node Cookies
- @subsection Cookies
- @findex url-cookie-delete
- @defun url-cookie-list
- This command creates a @file{*url cookies*} buffer listing the current
- cookies, if there are any. You can remove a cookie using the
- @kbd{C-k} (@code{url-cookie-delete}) command.
- @end defun
- @defun url-cookie-delete-cookies &optional regexp
- This function takes a regular expression as its argument and deletes
- all cookies whose domain matches it. If @var{regexp} is @code{nil}, it
- deletes all cookies.
- @end defun
- @defopt url-cookie-file
- The file in which cookies are stored, defaulting to @file{cookies} in
- the directory specified by @code{url-configuration-directory}.
- @end defopt
- @defopt url-cookie-confirmation
- Specifies whether confirmation is required to accept cookies.
- @end defopt
- @defopt url-cookie-multiple-line
- Specifies whether to put all cookies for the server on one line in the
- HTTP request to satisfy broken servers like
- @url{http://www.hotmail.com}.
- @end defopt
- @defopt url-cookie-trusted-urls
- A list of regular expressions matching URLs from which to accept
- cookies always.
- @end defopt
- @defopt url-cookie-untrusted-urls
- A list of regular expressions matching URLs from which to reject
- cookies always.
- @end defopt
- @defopt url-cookie-save-interval
- The number of seconds between automatic saves of cookies to disk.
- Default is one hour.
- @end defopt
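- The cookie options above might be combined as in the following
- configuration sketch. The regular expressions and host names are
- placeholders chosen for illustration.
- @example
- (require 'url-cookie)
-
- (setq url-cookie-confirmation t      ; ask before accepting a cookie
-       ;; Always accept cookies from URLs matching these regexps.
-       url-cookie-trusted-urls '("\\`https://www\\.gnu\\.org/")
-       ;; Always reject cookies from URLs matching these regexps.
-       url-cookie-untrusted-urls '("\\`https?://ads\\.example\\.com/"))
- @end example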
- @node HTTP language/coding
- @subsection Language and Encoding Preferences
- HTTP allows clients to express preferences for the language and
- encoding of documents which servers may honor. For each of these
- variables, the value is a string; it can specify a single choice, or
- it can be a comma-separated list.
- Normally, this list is ordered by descending preference. However, each
- element can be followed by @samp{;q=@var{priority}} to specify its
- preference level, a decimal number from 0 to 1; e.g., for
- @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
- en;q=0.7"}}. An element that has no @samp{;q} specification has
- preference level 1.
- @defopt url-mime-charset-string
- @cindex character sets
- @cindex coding systems
- This variable specifies a preference for character sets when documents
- can be served in more than one encoding.
- HTTP allows specifying a series of MIME charsets which indicate your
- preferred character set encodings, e.g., Latin-9 or Big5, and these
- can be weighted. The default series is generated automatically from
- the associated MIME types of all defined coding systems, sorted by the
- coding system priority specified in Emacs. @xref{Recognize Coding, ,
- Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
- @end defopt
- @defopt url-mime-language-string
- @cindex language preferences
- A string specifying the preferred language when servers can serve
- files in several languages. Use RFC 1766 abbreviations, e.g.,
- @samp{en} for English, @samp{de} for German.
- The string can be @code{"*"} to get the first available language (as
- opposed to the default).
- @end defopt
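- For instance, the weighted list shown earlier in this section could
- be installed as the language preference like this (a configuration
- sketch, not a recommended value):
- @example
- (require 'url-vars)
-
- ;; Prefer German, then British English, then any other English.
- (setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")
- @end example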
- @node HTTP URL Options
- @subsection HTTP URL Options
- HTTP supports an @samp{OPTIONS} method describing things supported by
- the URL@.
- @defun url-http-options url
- Returns a property list describing options available for URL@. The
- property list members are:
- @table @code
- @item methods
- A list of symbols specifying what HTTP methods the resource
- supports.
- @item dav
- @cindex DAV
- A list of numbers specifying what DAV protocol/schema versions are
- supported.
- @item dasl
- @cindex DASL
- A list of supported DASL search types (string form).
- @item ranges
- A list of the units available for use in partial document fetches.
- @item p3p
- @cindex P3P
- The @dfn{Platform for Privacy Preferences} description for the resource.
- Currently this is just the raw header contents.
- @end table
- @end defun
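- A hedged sketch of querying the returned property list follows. The
- helper name is hypothetical, the call performs a network request, and
- the example assumes that method symbols are spelled as the server
- reports them (for example @code{GET}).
- @example
- (require 'url-http)
-
- (defun my-url-supports-method-p (url method)
-   "Return non-nil if the server reports that URL supports METHOD.
- METHOD is a symbol such as `GET'.  Hypothetical helper."
-   ;; `methods' is one of the property list members listed above.
-   (memq method (plist-get (url-http-options url) 'methods)))
- @end example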
- @node Dealing with HTTP documents
- @subsection Dealing with HTTP documents
- HTTP URLs are retrieved into a buffer containing the HTTP headers
- followed by the body. Since the headers are quasi-MIME, they may be
- processed using the MIME library. @xref{Top,, Emacs MIME,
- emacs-mime, The Emacs MIME Manual}.
- @node file/ftp
- @section file and ftp
- @cindex files
- @cindex FTP
- @cindex File Transfer Protocol
- @cindex compressed files
- @cindex dired
- The @code{ftp} and @code{file} schemes are defined in RFC 1738. The
- @code{url} library treats @samp{ftp:} and @samp{file:} as synonymous.
- Such URLs have the form
- @example
- ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
- file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
- @end example
- @noindent
- If the URL specifies a local file, it is retrieved by reading the file
- contents in the usual way. If it specifies a remote file, it is
- retrieved using either the Tramp or the Ange-FTP package.
- @xref{Remote Files,,, emacs, The GNU Emacs Manual}.
- When retrieving a compressed file, it is automatically uncompressed
- if it has the file suffix @file{.z}, @file{.gz}, @file{.Z},
- @file{.bz2}, or @file{.xz}. (The list of supported suffixes is
- hard-coded, and cannot be altered by customizing
- @code{jka-compr-compression-info-list}.)
- @defopt url-directory-index-file
- This option specifies the filename to look for when a @code{file} or
- @code{ftp} URL specifies a directory. The default is
- @file{index.html}. If this file exists and is readable, it is viewed.
- Otherwise, Emacs visits the directory using Dired.
- @end defopt
- @node info
- @section info
- @cindex Info
- @cindex Texinfo
- @findex Info-goto-node
- The @code{info} scheme is non-standard. Such URLs have the form
- @example
- info:@var{file}#@var{node}
- @end example
- @noindent
- and are retrieved by invoking @code{Info-goto-node} with argument
- @samp{(@var{file})@var{node}}. If @samp{#@var{node}} is omitted, the
- @samp{Top} node is opened.
- @node mailto
- @section mailto
- @cindex mailto
- @cindex email
- A @code{mailto} URL specifies an email message to be sent to a given
- email address. For example, @samp{mailto:foo@@bar.com} specifies
- sending a message to @samp{foo@@bar.com}. The ``retrieval method''
- for such URLs is to open a mail composition buffer in which the
- appropriate content (e.g., the recipient address) has been filled in.
- As defined in RFC 6068, a @code{mailto} URL can have the form
- @example
- @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
- @end example
- @noindent
- where an arbitrary number of @var{header}s can be added. If the
- @var{header} is @samp{body}, then @var{contents} is put in the message
- body; otherwise, a @var{header} header field is created with
- @var{contents} as its contents. Note that the @code{url} library does
- not perform any checking of @var{header} or @var{contents}, so you
- should check them before sending the message.
- @defopt url-mail-command
- @vindex mail-user-agent
- The value of this variable is the function called whenever the
- @code{url} library needs to send mail. This should normally be left at
- its default, which is the
- standard mail-composition command @code{compose-mail}. @xref{Sending
- Mail,,, emacs, The GNU Emacs Manual}.
- @end defopt
- If the document containing the @code{mailto} URL itself possessed a
- known URL, Emacs automatically inserts an @samp{X-Url-From} header
- field into the mail buffer, specifying that URL.
- @node news/nntp/snews
- @section @code{news}, @code{nntp} and @code{snews}
- @cindex news
- @cindex network news
- @cindex usenet
- @cindex NNTP
- @cindex snews
- The @code{news}, @code{nntp}, and @code{snews} schemes, defined in RFC
- 1738, are used for reading Usenet newsgroups. For compatibility with
- non-standard-compliant news clients, the @code{url} library allows
- host and port fields to be included in @code{news} URLs, even though
- they are properly only allowed for @code{nntp} and @code{snews}.
- @code{news} and @code{nntp} URLs have the following form:
- @table @samp
- @item news:@var{newsgroup}
- Retrieves a list of messages in @var{newsgroup};
- @item news:@var{message-id}
- Retrieves the message with the given @var{message-id};
- @item news:*
- Retrieves a list of all available newsgroups;
- @item nntp://@var{host}:@var{port}/@var{newsgroup}
- @itemx nntp://@var{host}:@var{port}/@var{message-id}
- @itemx nntp://@var{host}:@var{port}/*
- Similar to the @samp{news} versions.
- @end table
- The default port for @code{nntp} (and @code{news}) is 119. The
- difference between an @code{nntp} URL and a @code{news} URL is that an
- @code{nntp} URL may specify an article by its number. The
- @samp{snews} scheme is the same as @samp{nntp}, except that it is
- tunneled through SSL and has default port 563.
- These URLs are retrieved via the Gnus package.
- @cindex environment variable
- @vindex NNTPSERVER
- @defopt url-news-server
- This variable specifies the default news server from which to fetch
- news, if no server was specified in the URL@. The default value,
- @code{nil}, means to use the server specified by the standard
- environment variable @samp{NNTPSERVER}, or @samp{news} if that
- environment variable is unset.
- @end defopt
- @node rlogin/telnet/tn3270
- @section rlogin, telnet and tn3270
- @cindex rlogin
- @cindex telnet
- @cindex tn3270
- @cindex terminal emulation
- @findex terminal-emulator
- These URL schemes are defined in RFC 1738, and are used for logging in
- via a terminal emulator. They have the form
- @example
- telnet://@var{user}:@var{password}@@@var{host}:@var{port}
- @end example
- @noindent
- but the @var{password} component is ignored. By default, the
- @code{telnet} scheme is handled via Tramp (@pxref{Tramp}).
- To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
- @code{telnet} or @code{tn3270} (the program names and arguments are
- hardcoded) session is run in a @code{terminal-emulator} buffer.
- Well-known ports are used if the URL does not specify a port.
- @node irc
- @section irc
- @cindex IRC
- @cindex Internet Relay Chat
- @cindex ZEN IRC
- @cindex ERC
- @cindex rcirc
- The @code{irc} scheme is defined in the Internet Draft at
- @url{http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt} (which
- was never approved as an RFC). Such URLs have the form
- @example
- irc://@var{host}:@var{port}/@var{target},@var{needpass}
- @end example
- @noindent
- and are retrieved by opening an @acronym{IRC} session using the
- function specified by @code{url-irc-function}.
- @defopt url-irc-function
- The value of this option is a function, which is called to open an IRC
- connection for @code{irc} URLs. This function must take five
- arguments, @var{host}, @var{port}, @var{channel}, @var{user} and
- @var{password}. The @var{channel} argument specifies the channel to
- join immediately, and may be @code{nil}.
- The default is @code{url-irc-rcirc}, which uses the Rcirc package.
- Other options are @code{url-irc-erc} (which uses ERC) and
- @code{url-irc-zenirc} (which uses ZenIRC).
- @end defopt
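- A custom value for this option could look like the sketch below. The
- handler name is hypothetical; it only logs the request instead of
- opening a connection. The trailing @code{&rest} parameter is a
- precaution in case a caller passes additional arguments, which is an
- assumption and not something documented here.
- @example
- (require 'url-irc)
-
- (defun my-url-irc-log (host port channel user password &rest _ignored)
-   "Log an IRC request instead of opening a connection.
- Hypothetical handler; PASSWORD is deliberately not logged."
-   (ignore password)
-   (message "IRC request: %s:%s channel=%s user=%s"
-            host port channel user))
-
- (setq url-irc-function #'my-url-irc-log)
- @end example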
- @node data
- @section data
- @cindex data URLs
- The @code{data} scheme, defined in RFC 2397, contains MIME data in
- the URL itself. Such URLs have the form
- @example
- data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
- @end example
- @noindent
- @var{media-type} is a MIME @samp{Content-Type} string, possibly
- including parameters. It defaults to
- @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
- omitted but the charset parameter supplied. If @samp{;base64} is
- present, the @var{data} are base64-encoded.
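- To make the syntax concrete, the following sketch builds a
- base64-encoded @code{data} URL from a short string. The helper name
- @code{my-make-data-url} is hypothetical, and the text is assumed to be
- plain @acronym{ASCII}.
- @example
- (defun my-make-data-url (text)
-   "Return a text/plain `data' URL holding TEXT, base64-encoded.
- Hypothetical helper; TEXT is assumed to be ASCII."
-   ;; The second argument suppresses line breaks in the encoding.
-   (concat "data:text/plain;base64," (base64-encode-string text t)))
-
- (my-make-data-url "Hello")
- ;; => "data:text/plain;base64,SGVsbG8="
- @end example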
- @node nfs
- @section nfs
- @cindex NFS
- @cindex Network File System
- @cindex automounter
- The @code{nfs} scheme, defined in RFC 2224, is similar to @code{ftp}
- except that it points to a file on a remote host that is handled by an
- NFS automounter on the local host. Such URLs have the form
- @example
- nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
- @end example
- @defvar url-nfs-automounter-directory-spec
- A string saying how to invoke the NFS automounter. Certain @samp{%}
- sequences are recognized:
- @table @samp
- @item %h
- The hostname of the NFS server;
- @item %n
- The port number of the NFS server;
- @item %u
- The username to use to authenticate;
- @item %p
- The password to use to authenticate;
- @item %f
- The filename on the remote server;
- @item %%
- A literal @samp{%}.
- @end table
- Each can be used any number of times.
- @end defvar
- @node ldap
- @section ldap
- @cindex LDAP
- @cindex Lightweight Directory Access Protocol
- The LDAP scheme is defined in RFC 2255.
- @node man
- @section man
- @cindex @command{man}
- @cindex Unix man pages
- @findex man
- The @code{man} scheme is a non-standard one. Such URLs have the form
- @example
- @samp{man:@var{page-spec}}
- @end example
- @noindent
- and are retrieved by passing @var{page-spec} to the Lisp function
- @code{man}.
- @node Tramp
- @section URL Types Supported via Tramp
- @vindex url-tramp-protocols
- Some additional URL types are supported by passing them to Tramp
- (@pxref{Top, The Tramp Manual,, tramp, The Tramp Manual}). These
- protocols are listed in the @code{url-tramp-protocols} variable, which
- you can customize. The default value includes the following
- protocols:
- @table @code
- @item ftp
- The file transfer protocol. @xref{file/ftp}.
- @item ssh
- @cindex ssh
- The secure shell protocol. @xref{Inline methods,,, tramp, The Tramp
- Manual}.
- @item scp
- @cindex scp
- The secure file copy protocol. @xref{External methods,,, tramp, The
- Tramp Manual}.
- @item rsync
- @cindex rsync
- The remote sync protocol.
- @item telnet
- The telnet protocol.
- @end table
- @node General Facilities
- @chapter General Facilities
- @menu
- * Disk Caching::
- * Proxies::
- * Gateways in general::
- * History::
- @end menu
- @node Disk Caching
- @section Disk Caching
- @cindex Caching
- @cindex Persistent Cache
- @cindex Disk Cache
- The disk cache stores retrieved documents locally, whence they can be
- retrieved more quickly. When requesting a URL that is in the cache,
- the library checks to see if the page has changed since it was last
- retrieved from the remote machine. If not, the local copy is used,
- saving the transmission over the network.
- @cindex Cleaning the cache
- @cindex Clearing the cache
- @cindex Cache cleaning
- Currently the cache isn't cleared automatically.
- @c Running the @code{clean-cache} shell script
- @c fist is recommended, to allow for future cleaning of the cache. This
- @c shell script will remove all files that have not been accessed since it
- @c was last run. To keep the cache pared down, it is recommended that this
- @c script be run from @i{at} or @i{cron} (see the manual pages for
- @c crontab(5) or at(1) for more information)
- @defopt url-automatic-caching
- Setting this variable non-@code{nil} causes documents to be cached
- automatically.
- @end defopt
- @defopt url-cache-directory
- This variable specifies the directory in which to store the cache
- files. It defaults to the sub-directory @file{cache} of
- @code{url-configuration-directory}.
- @end defopt
- @defopt url-cache-creation-function
- The cache relies on a scheme for mapping URLs to files in the cache.
- This variable names a function which sets the type of cache to use.
- It takes a URL as argument and returns the absolute file name of the
- corresponding cache file. The two supplied possibilities are
- @code{url-cache-create-filename-using-md5} and
- @code{url-cache-create-filename-human-readable}.
- @end defopt
- @defun url-cache-create-filename-using-md5 url
- Creates a cache file name from @var{url} using MD5 hashing.
- This creates entries with very few cache collisions and is fast.
- @cindex MD5
- @smallexample
- (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
- @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
- @end smallexample
- @end defun
- @defun url-cache-create-filename-human-readable url
- Creates a cache file name from @var{url} more obviously connected to
- @var{url} than for @code{url-cache-create-filename-using-md5}, but
- more likely to conflict with other files.
- @smallexample
- (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
- @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
- @end smallexample
- @end defun
- @defun url-cache-expired
- This function returns non-@code{nil} if a cache entry has expired (or
- is absent). The arguments are a URL and an optional expiration delay in
- seconds (default @code{url-cache-expire-time}).
- @end defun
- @defopt url-cache-expire-time
- This variable is the default number of seconds to use for the
- expire-time argument of the function @code{url-cache-expired}.
- @end defopt
- @defun url-fetch-from-cache url
- This function returns a buffer containing the data cached for
- @var{url}.
- @end defun
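- The two functions above can be combined to consult the cache before
- going to the network.  The following is only a sketch: the helper
- @code{my-url-cached-buffer} and its @var{max-age} argument are
- hypothetical names, not part of the library.
- @example
- (require 'url-cache)
- ;; Hypothetical helper, not provided by the url library.
- (defun my-url-cached-buffer (url &optional max-age)
-   "Return a buffer holding the cached data for URL, or nil.
- MAX-AGE is an optional expiration delay in seconds."
-   ;; `url-cache-expired' is non-nil for expired or absent entries.
-   (unless (url-cache-expired url max-age)
-     (url-fetch-from-cache url)))
- @end example
- A caller would fall back to a normal retrieval whenever this helper
- returns @code{nil}.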
- @c Fixme: never actually used currently?
- @c @defopt url-standalone-mode
- @c @cindex Relying on cache
- @c @cindex Cache only mode
- @c @cindex Standalone mode
- @c If this variable is non-@code{nil}, the library relies solely on the
- @c cache for fetching documents and avoids checking if they have changed
- @c on remote servers.
- @c @end defopt
- @c With a large cache of documents on the local disk, it can be very handy
- @c when traveling, or any other time the network connection is not active
- @c (a laptop with a dial-on-demand PPP connection, etc.). Emacs/W3 can rely
- @c solely on its cache, and avoid checking to see if the page has changed
- @c on the remote server. In the case of a dial-on-demand PPP connection,
- @c this will keep the phone line free as long as possible, only bringing up
- @c the PPP connection when asking for a page that is not located in the
- @c cache. This is very useful for demonstrations as well.
- @node Proxies
- @section Proxies and Gatewaying
- @c fixme: check/document url-ns stuff
- @cindex proxy servers
- @cindex proxies
- @cindex environment variables
- @vindex HTTP_PROXY
- Proxy servers are commonly used to provide gateways through firewalls
- or as caches serving some more-or-less local network. Each protocol
- (HTTP, FTP, etc.)@: can have a different gateway server.  By
- convention, many programs share a common proxy configuration through
- environment variables of the form @code{@var{protocol}_proxy}, where
- @var{protocol} is one of the supported network protocols (@code{http},
- @code{ftp} etc.).  The library recognizes such variables in either
- upper or lower case.  Their values take one of the following forms:
- @itemize @bullet
- @item @code{@var{host}:@var{port}};
- @item A full URL;
- @item Simply a host name.
- @end itemize
- @vindex NO_PROXY
- The @code{NO_PROXY} environment variable specifies URLs that should be
- excluded from proxying (on servers that should be contacted directly).
- This should be a comma-separated list of hostnames, domain names, or a
- mixture of both. Asterisks can be used as wildcards, but other
- clients may not support that. Domain names may be indicated by a
- leading dot. For example:
- @example
- NO_PROXY="*.aventail.com,home.com,.seanet.com"
- @end example
- @noindent says to contact all machines in the @samp{aventail.com} and
- @samp{seanet.com} domains directly, as well as the machine named
- @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
- and @code{no_proxy} are also tried, in that order.
- Proxies may also be specified directly in Lisp.
- @defopt url-proxy-services
- This variable is an alist of URL schemes and proxy servers that
- gateway them.  Each item is of the form @w{@code{(@var{scheme}
- . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme}
- is gatewayed through @var{portnumber} on the specified @var{host}.  An
- exception is the pseudo scheme @code{"no_proxy"}, which is paired with
- a regexp matching host names not to be proxied. This variable is
- initialized from the environment as above.
- @example
- (setq url-proxy-services
- '(("http" . "proxy.aventail.com:80")
- ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
- @end example
- @end defopt
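- Because this is an ordinary Lisp variable, it can presumably also be
- bound for the duration of a single request rather than set globally.
- The sketch below assumes a synchronous retrieval with
- @code{url-retrieve-synchronously}; the host names and port are
- placeholders.
- @example
- (require 'url)
- ;; Use a proxy for this one request only; local hosts are exempt.
- (let ((url-proxy-services
-        '(("http" . "proxy.example.com:8080")
-          ("no_proxy" . "^localhost$"))))
-   (url-retrieve-synchronously "http://www.example.com/"))
- @end example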
- @node Gateways in general
- @section Gateways in General
- @cindex gateways
- @cindex firewalls
- The library provides a general gateway layer through which all
- networking passes. It can both control access to the network and
- provide access through gateways in firewalls. This may make direct
- connections in some cases and pass through some sort of gateway in
- others.@footnote{Proxies (which only operate over HTTP) are
- implemented using this.} The library's basic function responsible for
- making connections is @code{url-open-stream}.
- @defun url-open-stream name buffer host service
- @cindex opening a stream
- @cindex stream, opening
- Open a stream to @var{host}, possibly via a gateway. The other
- arguments are as for @code{open-network-stream}. This will not make a
- connection if @code{url-gateway-unplugged} is non-@code{nil}.
- @end defun
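- As an illustration only, a stream could be opened and immediately
- closed as follows.  The connection name, host and port are
- placeholders, and the sketch assumes that the function is made
- available by loading @file{url-gw}.
- @example
- (require 'url-gw)
- (with-temp-buffer
-   ;; The connection may be direct or may pass through a gateway,
-   ;; depending on the gateway settings in effect.
-   (let ((proc (url-open-stream "example-conn" (current-buffer)
-                                "www.example.com" 80)))
-     (when (processp proc)
-       ;; Nothing is sent in this sketch; just close the connection.
-       (delete-process proc))))
- @end example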
- @defvar url-gateway-local-host-regexp
- This is a regular expression that matches local hosts that do not
- require the use of a gateway. If @code{nil}, all connections are made
- through the gateway.
- @end defvar
- @defvar url-gateway-method
- This variable controls which gateway method is used.  It may be useful
- to bind it temporarily in some applications.  Its value is a symbol;
- the possible values are:
- @table @code
- @item telnet
- @cindex @command{telnet}
- Use this method if you must first telnet and log into a gateway host,
- and then run telnet from that host to connect to outside machines.
- @item rlogin
- @cindex @command{rlogin}
- This method is identical to @code{telnet}, but uses @command{rlogin}
- to log into the remote machine without having to send the username and
- password over the wire every time.
- @item socks
- @cindex @sc{socks}
- Use if the firewall has a @sc{socks} gateway running on it. The
- @sc{socks} v5 protocol is defined in RFC 1928.
- @c @item ssl
- @c This probably shouldn't be documented
- @c Fixme: why not? -- fx
- @item native
- This method uses Emacs's builtin networking directly. This is the
- default. It can be used only if there is no firewall blocking access.
- @end table
- @end defvar
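- A temporary binding of the kind mentioned above might look like this.
- It is a sketch that assumes a synchronous retrieval with
- @code{url-retrieve-synchronously}; the URL is a placeholder.
- @example
- (require 'url)
- ;; Force a direct connection for this one request, whatever the
- ;; global gateway method is.
- (let ((url-gateway-method 'native))
-   (url-retrieve-synchronously "http://www.example.com/"))
- @end example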
- The following variables control the gateway methods.
- @defopt url-gateway-telnet-host
- The gateway host to telnet to. Once logged in there, you then telnet
- out to the hosts you want to connect to.
- @end defopt
- @defopt url-gateway-telnet-parameters
- This should be a list of parameters to pass to the @command{telnet} program.
- @end defopt
- @defopt url-gateway-telnet-password-prompt
- This is a regular expression that matches the password prompt when
- logging in.
- @end defopt
- @defopt url-gateway-telnet-login-prompt
- This is a regular expression that matches the username prompt when
- logging in.
- @end defopt
- @defopt url-gateway-telnet-user-name
- The username to log in with.
- @end defopt
- @defopt url-gateway-telnet-password
- The password to send when logging in.
- @end defopt
- @defopt url-gateway-prompt-pattern
- This is a regular expression that matches the shell prompt.
- @end defopt
- @defopt url-gateway-rlogin-host
- Host to @samp{rlogin} to before telnetting out.
- @end defopt
- @defopt url-gateway-rlogin-parameters
- Parameters to pass to @samp{rsh}.
- @end defopt
- @defopt url-gateway-rlogin-user-name
- User name to use when logging in to the gateway.
- @end defopt
- @defopt socks-server
- This specifies the default server; it takes the form
- @w{@code{("Default server" @var{server} @var{port} @var{version})}}
- where @var{version} can be either 4 or 5.
- @end defopt
- @defvar socks-password
- If this is @code{nil} then you will be asked for the password,
- otherwise it will be used as the password for authenticating you to
- the @sc{socks} server.
- @end defvar
- @defvar socks-username
- This is the username to use when authenticating yourself to the
- @sc{socks} server. By default this is your login name.
- @end defvar
- @defvar socks-timeout
- This controls how long, in seconds, to wait for responses from the
- @sc{socks} server; it is 5 by default.
- @end defvar
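- Put together, a @sc{socks} setup might be sketched as follows.  The
- server name, port and timeout are placeholder values, and leaving
- @code{socks-password} as @code{nil} relies on the prompting behavior
- described above.
- @example
- (require 'url-gw)
- (require 'socks)
- ;; Route connections through a SOCKS v5 server.
- (setq url-gateway-method 'socks
-       socks-server '("Default server" "socks.example.com" 1080 5)
-       socks-password nil      ; ask for the password when needed
-       socks-timeout 10)       ; wait up to 10 seconds for replies
- @end example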
- @c fixme: these have been effectively commented-out in the code
- @c @defopt socks-server-aliases
- @c This a list of server aliases. It is a list of aliases of the form
- @c @var{(alias hostname port version)}.
- @c @end defopt
- @c @defopt socks-network-aliases
- @c This a list of network aliases. Each entry in the list takes the form
- @c @var{(alias (network))} where @var{alias} is a string that names the
- @c @var{network}. The networks can contain a pair (not a dotted pair) of
- @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
- @c address and a netmask, a domain name or a unique hostname or @sc{ip}
- @c address.
- @c @end defopt
- @c @defopt socks-redirection-rules
- @c This a list of redirection rules. Each rule take the form
- @c @var{(Destination network Connection type)} where @var{Destination
- @c network} is a network alias from @code{socks-network-aliases} and
- @c @var{Connection type} can be @code{nil} in which case a direct
- @c connection is used, or it can be an alias from
- @c @code{socks-server-aliases} in which case that server is used as a
- @c proxy.
- @c @end defopt
- @defopt socks-nslookup-program
- @cindex @command{nslookup}
- This is the @samp{nslookup} program.  It is @code{"nslookup"} by default.
- @end defopt
- @menu
- * Suppressing network connections::
- @end menu
- @c * Broken hostname resolution::
- @node Suppressing network connections
- @subsection Suppressing Network Connections
- @cindex network connections, suppressing
- @cindex suppressing network connections
- @cindex bugs, HTML
- @cindex HTML ``bugs''
- In some circumstances it is desirable to suppress making network
- connections. A typical case is when rendering HTML in a mail user
- agent, when external URLs should not be activated, particularly to
- avoid ``bugs'' which ``call home'' by fetching single-pixel images and the
- like. To arrange this, bind the following variable for the duration
- of such processing.
- @defvar url-gateway-unplugged
- If this variable is non-@code{nil}, the URL library never opens new
- network connections.
- @end defvar
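- One way to arrange such a binding is a small wrapper.  The function
- @code{my-call-without-network} below is a hypothetical helper, not
- part of the library, and is only a sketch.
- @example
- (require 'url)
- ;; Hypothetical helper, not provided by the url library.
- (defun my-call-without-network (function &rest args)
-   "Call FUNCTION with ARGS while the URL library may not connect."
-   (let ((url-gateway-unplugged t))
-     (apply function args)))
- @end example
- A mail user agent could then pass its HTML rendering function to this
- wrapper so that external images are not fetched while rendering.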
- @c @node Broken hostname resolution
- @c @subsection Broken Hostname Resolution
- @c @cindex hostname resolver
- @c @cindex resolver, hostname
- @c Some C libraries do not include the hostname resolver routines in
- @c their static libraries. If Emacs was linked statically, and was not
- @c linked with the resolver libraries, it will not be able to get to any
- @c machines off the local network. This is characterized by being able
- @c to reach someplace with a raw ip number, but not its hostname
- @c (@url{http://129.79.254.191/} works, but
- @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
- @c SunOS4 and Ultrix, but is now probably now rare. If Emacs can't be
- @c rebuilt linked against the resolver library, it can use the external
- @c @command{nslookup} program instead.
- @c @defopt url-gateway-broken-resolution
- @c @cindex @code{nslookup} program
- @c @cindex program, @code{nslookup}
- @c If non-@code{nil}, this variable says to use the program specified by
- @c @code{url-gateway-nslookup-program} program to do hostname resolution.
- @c @end defopt
- @c @defopt url-gateway-nslookup-program
- @c The name of the program to do hostname lookup if Emacs can't do it
- @c directly. This program should expect a single argument on the command
- @c line---the hostname to resolve---and should produce output similar to
- @c the standard Unix @command{nslookup} program:
- @c @example
- @c Name: www.cs.indiana.edu
- @c Address: 129.79.254.191
- @c @end example
- @c @end defopt
- @node History
- @section History
- @findex url-do-setup
- The library can maintain a global history list tracking URLs accessed.
- URL completion can be done from it. The history mechanism is set up
- automatically via @code{url-do-setup} when it is configured to be on.
- Note that the size of the history list is currently not limited.
- @vindex url-history-hash-table
- The history ``list'' is actually a hash table,
- @code{url-history-hash-table}. It contains access times keyed by URL
- strings. The times are in the format returned by @code{current-time}.
- @defun url-history-update-url url time
- This function updates the history table with an entry for @var{url}
- accessed at the given @var{time}.
- @end defun
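- The following sketch records an access and then reads it back.  The
- URL is a placeholder; the last form relies on the general Emacs
- facility that a hash table with string keys can serve as a completion
- collection.
- @example
- (require 'url-history)
- ;; Record an access to a placeholder URL at the current time.
- (url-history-update-url "http://www.example.com/" (current-time))
- ;; Look up the recorded access time; nil means no entry exists.
- (gethash "http://www.example.com/" url-history-hash-table)
- ;; List the recorded URLs that start with a given prefix.
- (all-completions "http://www.exam" url-history-hash-table)
- @end example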
- @defopt url-history-track
- If non-@code{nil}, the library will keep track of all the URLs
- accessed. If it is @code{t}, the list is saved to disk at the end of
- each Emacs session. The default is @code{nil}.
- @end defopt
- @defopt url-history-file
- The file storing the history list between sessions. It defaults to
- @file{history} in @code{url-configuration-directory}.
- @end defopt
- @defopt url-history-save-interval
- @findex url-history-setup-save-timer
- The number of seconds between automatic saves of the history list.
- Default is one hour. Note that if you change this variable directly,
- rather than using Custom, after @code{url-do-setup} has been run, you
- need to run the function @code{url-history-setup-save-timer}.
- @end defopt
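- For instance, history tracking with a shorter save interval might be
- configured as in this sketch; the interval of ten minutes is an
- arbitrary illustrative value.
- @example
- (require 'url-history)
- ;; Track accessed URLs and save them at the end of the session.
- (setq url-history-track t)
- ;; Save every ten minutes instead of every hour.
- (setq url-history-save-interval 600)
- ;; Needed because the variable was set directly, not through Custom.
- (url-history-setup-save-timer)
- @end example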
- @defun url-history-parse-history &optional fname
- Parses the history file @var{fname} (default @code{url-history-file})
- and sets up the history list.
- @end defun
- @defun url-history-save-history &optional fname
- Saves the current history to file @var{fname} (default
- @code{url-history-file}).
- @end defun
- @defun url-completion-function string predicate function
- You can use this function to do completion of URLs from the history.
- @end defun
- @node Customization
- @chapter Customization
- @cindex environment variables
- The following environment variables affect the @code{url} library's
- operation at startup.
- @table @code
- @item TMPDIR
- @vindex TMPDIR
- @vindex url-temporary-directory
- If this is defined, @code{url-temporary-directory} is initialized from
- it.
- @end table
- The following user options affect the general operation of the
- @code{url} library.
- @defopt url-configuration-directory
- @cindex configuration files
- The value of this variable specifies the name of the directory where
- the @code{url} library stores its various configuration files, cache
- files, etc.
- The default value specifies a subdirectory named @file{url/} in the
- standard Emacs user data directory specified by the variable
- @code{user-emacs-directory} (normally @file{~/.emacs.d}). However,
- the old default was @file{~/.url}, and this directory is used instead
- if it exists.
- @end defopt
- @defopt url-debug
- @cindex debugging
- Specifies the types of debug messages which are logged to
- the @file{*URL-DEBUG*} buffer.
- @code{t} means log all messages.
- A number means log all messages and show them with @code{message}.
- It may also be a list of the types of messages to be logged.
- @end defopt
- @defopt url-personal-mail-address
- @end defopt
- @defopt url-privacy-level
- @end defopt
- @defopt url-uncompressor-alist
- @end defopt
- @defopt url-passwd-entry-func
- @end defopt
- @defopt url-standalone-mode
- @end defopt
- @defopt url-bad-port-list
- @end defopt
- @defopt url-max-password-attempts
- @end defopt
- @defopt url-temporary-directory
- @end defopt
- @defopt url-show-status
- @end defopt
- @defopt url-confirmation-func
- The function to use for asking yes or no questions.  This is normally
- either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
- function taking a single argument (the prompt) and returning @code{t}
- only if an affirmative answer is given.
- @end defopt
- @defopt url-gateway-method
- @c fixme: describe gatewaying
- A symbol specifying the type of gateway support to use for connections
- from the local machine. The supported methods are:
- @table @code
- @item telnet
- Run telnet in a subprocess to connect;
- @item rlogin
- Rlogin to another machine to connect;
- @item socks
- Connect through a socks server;
- @item ssl
- Connect with SSL;
- @item native
- Connect directly.
- @end table
- @end defopt
- @defopt url-user-agent
- The User Agent string used for sending @acronym{HTTP}/@acronym{HTTPS}
- requests.  The value should be one of: @code{nil}, which means that no
- @samp{User-Agent} header is generated; @code{default}, which means
- that a string is generated based on the setting of
- @code{url-privacy-level}; a string; or a function of no arguments that
- returns a string.
- The default is @code{default}, which means that the
- @w{@samp{User-Agent: @var{package-name} URL/Emacs}} string will be
- generated, where @var{package-name} is the value of
- @code{url-package-name} and its version, if they are non-@code{nil}.
- @end defopt
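- The string and function forms might be used as in the sketch below.
- The client name @samp{ExampleClient/1.0} is a placeholder, and only
- one of the two settings would be used in practice.
- @example
- (require 'url-vars)
- ;; A fixed string ...
- (setq url-user-agent "ExampleClient/1.0")
- ;; ... or a function of no arguments that returns a string.
- (setq url-user-agent
-       (lambda ()
-         (format "ExampleClient/1.0 (Emacs %s)" emacs-version)))
- @end example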
- @node GNU Free Documentation License
- @appendix GNU Free Documentation License
- @include doclicense.texi
- @node Function Index
- @unnumbered Command and Function Index
- @printindex fn
- @node Variable Index
- @unnumbered Variable Index
- @printindex vr
- @node Concept Index
- @unnumbered Concept Index
- @printindex cp
- @bye
|