mirror of
https://git.savannah.gnu.org/git/emacs.git
synced 2024-12-03 08:30:09 +00:00
1383 lines
42 KiB
Plaintext
\input texinfo @c -*-mode: texinfo; coding: latin-1 -*-
|
||
|
||
@setfilename ../info/emacs-mime
|
||
@settitle Emacs MIME Manual
|
||
@synindex fn cp
|
||
@synindex vr cp
|
||
@synindex pg cp
|
||
|
||
@copying
|
||
This file documents the Emacs MIME interface functionality.
|
||
|
||
Copyright (C) 1998, 1999, 2000, 2002 Free Software Foundation, Inc.
|
||
|
||
@quotation
|
||
Permission is granted to copy, distribute and/or modify this document
|
||
under the terms of the GNU Free Documentation License, Version 1.1 or
|
||
any later version published by the Free Software Foundation; with no
|
||
Invariant Sections, with the Front-Cover texts being ``A GNU
|
||
Manual,'' and with the Back-Cover Texts as in (a) below. A copy of the
|
||
license is included in the section entitled ``GNU Free Documentation
|
||
License'' in the Emacs manual.
|
||
|
||
(a) The FSF's Back-Cover Text is: ``You have freedom to copy and modify
|
||
this GNU Manual, like GNU software. Copies published by the Free
|
||
Software Foundation raise funds for GNU development.''
|
||
|
||
This document is part of a collection distributed under the GNU Free
|
||
Documentation License. If you want to distribute this document
|
||
separately from the collection, you can do so by adding a copy of the
|
||
license to the document, as described in section 6 of the license.
|
||
@end quotation
|
||
@end copying
|
||
|
||
@dircategory Emacs
|
||
@direntry
|
||
* MIME: (emacs-mime). Emacs MIME de/composition library.
|
||
@end direntry
|
||
@iftex
|
||
@finalout
|
||
@end iftex
|
||
@setchapternewpage odd
|
||
|
||
@titlepage
|
||
@title Emacs MIME Manual
|
||
|
||
@author by Lars Magne Ingebrigtsen
|
||
@page
|
||
@vskip 0pt plus 1filll
|
||
@insertcopying
|
||
@end titlepage
|
||
|
||
|
||
@node Top
|
||
@top Emacs MIME
|
||
|
||
This manual documents the libraries used to compose and display
|
||
@sc{mime} messages.
|
||
|
||
This is not a manual meant for users; it's a manual directed at people
|
||
who want to write functions and commands that manipulate @sc{mime}
|
||
elements.
|
||
|
||
@sc{mime} is short for @dfn{Multipurpose Internet Mail Extensions}.
|
||
This standard is documented in a number of RFCs; mainly RFC2045 (Format
|
||
of Internet Message Bodies), RFC2046 (Media Types), RFC2047 (Message
|
||
Header Extensions for Non-ASCII Text), RFC2048 (Registration
|
||
Procedures), RFC2049 (Conformance Criteria and Examples). It is highly
|
||
recommended that anyone who intends writing @sc{mime}-compliant software
|
||
read at least RFC2045 and RFC2047.
|
||
|
||
@menu
|
||
* Interface Functions:: An abstraction over the basic functions.
|
||
* Basic Functions:: Utility and basic parsing functions.
|
||
* Decoding and Viewing:: A framework for decoding and viewing.
|
||
* Composing:: MML; a language for describing MIME parts.
|
||
* Standards:: A summary of RFCs and working documents used.
|
||
* Index:: Function and variable index.
|
||
@end menu
|
||
|
||
|
||
@node Interface Functions
|
||
@chapter Interface Functions
|
||
@cindex interface functions
|
||
@cindex mail-parse
|
||
|
||
The @code{mail-parse} library is an abstraction over the actual
|
||
low-level libraries that are described in the next chapter.
|
||
|
||
Standards change, and so programs have to change to fit in the new
|
||
mold. For instance, RFC2045 describes a syntax for the
|
||
@code{Content-Type} header that only allows @sc{ascii} characters in the
|
||
parameter list. RFC2231 expands on RFC2045 syntax to provide a scheme
|
||
for continuation headers and non-@sc{ascii} characters.
|
||
|
||
The traditional way to deal with this is just to update the library
|
||
functions to parse the new syntax. However, this is sometimes the wrong
|
||
thing to do. In some instances it may be vital to be able to understand
|
||
both the old syntax as well as the new syntax, and if there is only one
|
||
library, one must choose between the old version of the library and the
|
||
new version of the library.
|
||
|
||
The Emacs MIME library takes a different tack. It defines a series of
|
||
low-level libraries (@file{rfc2047.el}, @file{rfc2231.el} and so on)
|
||
that parse strictly according to the corresponding standard. However,
|
||
normal programs would not use the functions provided by these libraries
|
||
directly, but instead use the functions provided by the
|
||
@code{mail-parse} library. The functions in this library are just
|
||
aliases to the corresponding functions in the latest low-level
|
||
libraries. Using this scheme, programs get a consistent interface they
|
||
can use, and library developers are free to write code that
|
||
handles new standards.
|
||
|
||
The following functions are defined by this library:
|
||
|
||
@defun mail-header-parse-content-type string
|
||
Parse @var{string}, a @code{Content-Type} header, and return a
|
||
content-type list in the following format:
|
||
|
||
@lisp
|
||
("type/subtype"
|
||
(attribute1 . value1)
|
||
(attribute2 . value2)
|
||
@dots{})
|
||
@end lisp
|
||
|
||
Here's an example:
|
||
|
||
@example
|
||
(mail-header-parse-content-type
|
||
"image/gif; name=\"b980912.gif\"")
|
||
@result{} ("image/gif" (name . "b980912.gif"))
|
||
@end example
|
||
@end defun
|
||
|
||
@defun mail-header-parse-content-disposition string
|
||
Parse @var{string}, a @code{Content-Disposition} header, and return a
|
||
content-type list in the format above.
|
||
@end defun
|
||
|
||
@defun mail-content-type-get ct attribute
|
||
@findex mail-content-type-get
|
||
Returns the value of the given @var{attribute} from the content-type
|
||
list @var{ct}.
|
||
|
||
@example
|
||
(mail-content-type-get
|
||
'("image/gif" (name . "b980912.gif")) 'name)
|
||
@result{} "b980912.gif"
|
||
@end example
|
||
@end defun
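
As a sketch of how the two functions above fit together, the following
parses a header value and then looks up individual attributes. The
header value is made up for illustration.

@lisp
(require 'mail-parse)

(let ((ct (mail-header-parse-content-type
           "text/plain; charset=\"iso-8859-1\"; format=flowed")))
  ;; The car of the list is the media type; the rest is an alist
  ;; of attributes keyed by symbol.
  (list (car ct)
        (mail-content-type-get ct 'charset)
        (mail-content-type-get ct 'format)))
@end lisp

Looking attributes up through @code{mail-content-type-get} rather than
picking the list apart by hand keeps callers independent of the exact
list layout.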
|
||
|
||
@defun mail-header-encode-parameter param value
|
||
Takes a parameter string @samp{@var{param}=@var{value}} and returns an
|
||
encoded version of it. This is used for parameters in headers like
|
||
@samp{Content-Type} and @samp{Content-Disposition}.
|
||
@end defun
|
||
|
||
@defun mail-header-remove-comments string
|
||
Return a comment-free version of @var{string}.
|
||
|
||
@example
|
||
(mail-header-remove-comments
|
||
"Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
|
||
@result{} "Gnus/5.070027 "
|
||
@end example
|
||
@end defun
|
||
|
||
@defun mail-header-remove-whitespace string
|
||
Remove linear white space from @var{string}. Space inside quoted
|
||
strings and comments is preserved.
|
||
|
||
@example
|
||
(mail-header-remove-whitespace
|
||
"image/gif; name=\"Name with spaces\"")
|
||
@result{} "image/gif;name=\"Name with spaces\""
|
||
@end example
|
||
@end defun
|
||
|
||
@defun mail-header-get-comment string
|
||
Return the last comment in @var{string}.
|
||
|
||
@example
|
||
(mail-header-get-comment
|
||
"Gnus/5.070027 (Pterodactyl Gnus v0.27) (Finnish Landrace)")
|
||
@result{} "Finnish Landrace"
|
||
@end example
|
||
@end defun
|
||
|
||
|
||
@defun mail-header-parse-address string
|
||
Parse an address string @var{string} and return a list containing the
|
||
mailbox and the plaintext name.
|
||
|
||
@example
|
||
(mail-header-parse-address
|
||
"Hrvoje Niksic <hniksic@@srce.hr>")
|
||
@result{} ("hniksic@@srce.hr" . "Hrvoje Niksic")
|
||
@end example
|
||
@end defun
|
||
|
||
@defun mail-header-parse-addresses string
|
||
Parse @var{string} as a list of addresses and return a list of elements
|
||
like the one described above.
|
||
|
||
@example
|
||
(mail-header-parse-addresses
|
||
"Hrvoje Niksic <hniksic@@srce.hr>, Steinar Bang <sb@@metis.no>")
|
||
@result{} (("hniksic@@srce.hr" . "Hrvoje Niksic")
|
||
("sb@@metis.no" . "Steinar Bang"))
|
||
@end example
|
||
@end defun
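
Since each element is a cons of mailbox and name, collecting just the
mailboxes is a matter of taking the @code{car} of every element. A
minimal sketch follows; @code{my-mime-mailboxes} is a hypothetical
helper, not part of the library.

@lisp
(require 'mail-parse)

(defun my-mime-mailboxes (header-value)
  "Return the bare mailboxes named in HEADER-VALUE, a string."
  ;; Each parsed element looks like (MAILBOX . NAME).
  (mapcar #'car (mail-header-parse-addresses header-value)))

(my-mime-mailboxes
 "Hrvoje Niksic <hniksic@@srce.hr>, Steinar Bang <sb@@metis.no>")
@end lisp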
|
||
|
||
@defun mail-header-parse-date string
|
||
Parse a date @var{string} and return an Emacs time structure.
|
||
@end defun
|
||
|
||
@defun mail-narrow-to-head
|
||
Narrow the buffer to the header section of the buffer. Point is placed
|
||
at the beginning of the narrowed buffer.
|
||
@end defun
|
||
|
||
@defun mail-header-narrow-to-field
|
||
Narrow the buffer to the header under point.
|
||
@end defun
|
||
|
||
@defun mail-encode-encoded-word-region start end
|
||
Encode the non-@sc{ascii} words in the region @var{start} to @var{end}. For
|
||
instance, @samp{Naïve} is encoded as @samp{=?iso-8859-1?q?Na=EFve?=}.
|
||
@end defun
|
||
|
||
@defun mail-encode-encoded-word-buffer
|
||
Encode the non-@sc{ascii} words in the current buffer. This function is
|
||
meant to be called with the buffer narrowed to the headers of a message.
|
||
@end defun
|
||
|
||
@defun mail-encode-encoded-word-string string
|
||
Encode the words that need encoding in @var{string}, and return the
|
||
result.
|
||
|
||
@example
|
||
(mail-encode-encoded-word-string
"This is naïve, baby")
|
||
@result{} "This is =?iso-8859-1?q?na=EFve,?= baby"
|
||
@end example
|
||
@end defun
|
||
|
||
@defun mail-decode-encoded-word-region start end
|
||
Decode the encoded words in the region @var{start} to @var{end}.
|
||
@end defun
|
||
|
||
@defun mail-decode-encoded-word-string string
|
||
Decode the encoded words in @var{string} and return the result.
|
||
|
||
@example
|
||
(mail-decode-encoded-word-string
|
||
"This is =?iso-8859-1?q?na=EFve,?= baby")
|
||
@result{} "This is naïve, baby"
|
||
@end example
|
||
@end defun
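
Encoded words come in two flavors, marked by the letter between the
second and third question marks. The following sketch decodes one word
of each kind; the sample words are made up for illustration and contain
only @sc{ascii} characters themselves.

@lisp
(require 'mail-parse)

(list
 ;; Q encoding: =EF is the Latin-1 byte for i with a diaeresis.
 (mail-decode-encoded-word-string "=?iso-8859-1?q?Na=EFve?=")
 ;; B encoding: the payload is base64-encoded UTF-8 text.
 (mail-decode-encoded-word-string "=?utf-8?B?w6lww6ll?="))
@end lisp

Both calls return ordinary Emacs strings with the charset already
applied, so callers need not care which encoding the sender chose.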
|
||
|
||
Currently, @code{mail-parse} is an abstraction over @code{ietf-drums},
|
||
@code{rfc2047}, @code{rfc2045} and @code{rfc2231}. These are documented
|
||
in the subsequent sections.
|
||
|
||
|
||
|
||
@node Basic Functions
|
||
@chapter Basic Functions
|
||
|
||
This chapter describes the basic, ground-level functions for parsing and
|
||
handling. Covered here are parsing @code{From} lines, removing comments
|
||
from header lines, decoding encoded words, parsing date headers and so
|
||
on. High-level functionality is dealt with in the next chapter
|
||
(@pxref{Decoding and Viewing}).
|
||
|
||
@menu
|
||
* rfc2045:: Encoding @code{Content-Type} headers.
|
||
* rfc2231:: Parsing @code{Content-Type} headers.
|
||
* ietf-drums:: Handling mail headers defined by RFC822bis.
|
||
* rfc2047:: En/decoding encoded words in headers.
|
||
* time-date:: Functions for parsing dates and manipulating time.
|
||
* qp:: Quoted-Printable en/decoding.
|
||
* base64:: Base64 en/decoding.
|
||
* binhex:: Binhex decoding.
|
||
* uudecode:: Uuencode decoding.
|
||
* rfc1843:: Decoding HZ-encoded text.
|
||
* mailcap:: How parts are displayed is specified by mailcap files
|
||
@end menu
|
||
|
||
|
||
@node rfc2045
|
||
@section rfc2045
|
||
|
||
RFC2045 is the ``main'' @sc{mime} document, and as such, one would
|
||
imagine that there would be a lot to implement. But there isn't, since
|
||
most of the implementation details are delegated to the subsequent
|
||
RFCs.
|
||
|
||
So @file{rfc2045.el} has only a single function:
|
||
|
||
@defun rfc2045-encode-string parameter value
|
||
@findex rfc2045-encode-string
|
||
Takes a @var{parameter} and a @var{value} and returns a
|
||
@samp{@var{parameter}=@var{value}} string. @var{value} will be quoted if
|
||
there are non-safe characters in it.
|
||
@end defun
|
||
|
||
|
||
@node rfc2231
|
||
@section rfc2231
|
||
|
||
RFC2231 defines a syntax for the @samp{Content-Type} and
|
||
@samp{Content-Disposition} headers. Its snappy name is @dfn{MIME
|
||
Parameter Value and Encoded Word Extensions: Character Sets, Languages,
|
||
and Continuations}.
|
||
|
||
In short, these headers look something like this:
|
||
|
||
@example
|
||
Content-Type: application/x-stuff;
|
||
title*0*=us-ascii'en'This%20is%20even%20more%20;
|
||
title*1*=%2A%2A%2Afun%2A%2A%2A%20;
|
||
title*2="isn't it!"
|
||
@end example
|
||
|
||
They usually aren't this bad, though.
|
||
|
||
The following functions are defined by this library:
|
||
|
||
@defun rfc2231-parse-string string
|
||
Parse a @samp{Content-Type} header @var{string} and return a list
|
||
describing its elements.
|
||
|
||
@example
|
||
(rfc2231-parse-string
|
||
"application/x-stuff;
|
||
title*0*=us-ascii'en'This%20is%20even%20more%20;
|
||
title*1*=%2A%2A%2Afun%2A%2A%2A%20;
|
||
title*2=\"isn't it!\"")
|
||
@result{} ("application/x-stuff"
|
||
(title . "This is even more ***fun*** isn't it!"))
|
||
@end example
|
||
@end defun
|
||
|
||
@defun rfc2231-get-value ct attribute
|
||
Takes a list @var{ct} of the format above and returns the value of the
|
||
specified @var{attribute}.
|
||
@end defun
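
The two functions above are normally used together. The following is a
minimal sketch using a single extended parameter (charset, language and
percent-encoded value in one piece); the header value is made up for
illustration.

@lisp
(require 'rfc2231)

(let ((ct (rfc2231-parse-string
           "application/x-stuff; title*=us-ascii'en'This%20is%20fun")))
  ;; The trailing `*' on the attribute name marks the value as
  ;; percent-encoded; the parser is expected to decode it.
  (rfc2231-get-value ct 'title))
@end lisp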
|
||
|
||
@defun rfc2231-encode-string parameter value
|
||
Encode the string @samp{@var{parameter}=@var{value}} for inclusion in
|
||
headers like @samp{Content-Type} and @samp{Content-Disposition}.
|
||
@end defun
|
||
|
||
@node ietf-drums
|
||
@section ietf-drums
|
||
|
||
@dfn{drums} is an IETF working group that is working on the replacement
|
||
for RFC822.
|
||
|
||
The functions provided by this library include:
|
||
|
||
@defun ietf-drums-remove-comments string
|
||
Remove the comments from @var{string} and return the result.
|
||
@end defun
|
||
|
||
@defun ietf-drums-remove-whitespace string
|
||
Remove linear white space from @var{string} and return the result.
|
||
Spaces inside quoted strings and comments are left untouched.
|
||
@end defun
|
||
|
||
@defun ietf-drums-get-comment string
|
||
Return the last comment from @var{string}.
|
||
@end defun
|
||
|
||
@defun ietf-drums-parse-address string
|
||
Parse an address @var{string} and return a list of the mailbox and the
|
||
plain text name.
|
||
@end defun
|
||
|
||
@defun ietf-drums-parse-addresses string
|
||
Parse @var{string}, containing any number of comma-separated addresses,
|
||
and return a list of mailbox/plain text pairs.
|
||
@end defun
|
||
|
||
@defun ietf-drums-parse-date string
|
||
Parse the date @var{string} and return an Emacs time structure.
|
||
@end defun
|
||
|
||
@defun ietf-drums-narrow-to-header
|
||
Narrow the buffer to the header section of the current buffer.
|
||
@end defun
|
||
|
||
|
||
@node rfc2047
|
||
@section rfc2047
|
||
|
||
RFC2047 (Message Header Extensions for Non-ASCII Text) specifies how
|
||
non-@sc{ascii} text in headers is to be encoded. This is actually rather
|
||
complicated, so a number of variables are necessary to tweak what this
|
||
library does.
|
||
|
||
The following variables are tweakable:
|
||
|
||
@defvar rfc2047-default-charset
|
||
Characters in this charset should not be decoded by this library.
|
||
This defaults to @samp{iso-8859-1}.
|
||
@end defvar
|
||
|
||
@defvar rfc2047-header-encoding-alist
|
||
This is an alist of header / encoding-type pairs. Its main purpose is
|
||
to prevent encoding of certain headers.
|
||
@end defvar
|
||
|
||
The keys can either be header regexps, or @code{t}.
|
||
|
||
The values can be either @code{nil}, in which case the header(s) in
|
||
question won't be encoded, or @code{mime}, which means that they will be
|
||
encoded.
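
To make the shape of @code{rfc2047-header-encoding-alist} concrete,
here is an illustrative value built only from the keys and values
described above. It is a sketch, not necessarily the default, and it
assumes that entries are tried in order so that the catch-all @code{t}
entry belongs last.

@lisp
(require 'rfc2047)

;; Leave Newsgroups and Message-ID alone; encode every other header.
(setq rfc2047-header-encoding-alist
      '(("Newsgroups" . nil)
        ("Message-ID" . nil)
        (t . mime)))
@end lisp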
|
||
|
||
@defvar rfc2047-charset-encoding-alist
|
||
RFC2047 specifies two forms of encoding---@code{Q} (a
|
||
Quoted-Printable-like encoding) and @code{B} (base64). This alist
|
||
specifies which charset should use which encoding.
|
||
@end defvar
|
||
|
||
@defvar rfc2047-encoding-function-alist
|
||
This is an alist of encoding / function pairs. The encodings are
|
||
@code{Q}, @code{B} and @code{nil}.
|
||
@end defvar
|
||
|
||
@defvar rfc2047-q-encoding-alist
|
||
The @code{Q} encoding isn't quite the same for all headers. Some
|
||
headers allow a narrower range of characters, and that is what this
|
||
variable is for. It's an alist of header regexps and allowable character
|
||
ranges.
|
||
@end defvar
|
||
|
||
@defvar rfc2047-encoded-word-regexp
|
||
When decoding words, this library looks for matches to this regexp.
|
||
@end defvar
|
||
|
||
Those were the variables, and these are the functions:
|
||
|
||
@defun rfc2047-narrow-to-field
|
||
Narrow the buffer to the header on the current line.
|
||
@end defun
|
||
|
||
@defun rfc2047-encode-message-header
|
||
Should be called narrowed to the header of a message. Encodes according
|
||
to @code{rfc2047-header-encoding-alist}.
|
||
@end defun
|
||
|
||
@defun rfc2047-encode-region start end
|
||
Encodes all encodable words in the region @var{start} to @var{end}.
|
||
@end defun
|
||
|
||
@defun rfc2047-encode-string string
|
||
Encode @var{string} and return the result.
|
||
@end defun
|
||
|
||
@defun rfc2047-decode-region start end
|
||
Decode the encoded words in the region @var{start} to @var{end}.
|
||
@end defun
|
||
|
||
@defun rfc2047-decode-string string
|
||
Decode @var{string} and return the result.
|
||
@end defun
|
||
|
||
|
||
|
||
@node time-date
|
||
@section time-date
|
||
|
||
While not really a part of the @sc{mime} library, it is convenient to
|
||
document this library here. It deals with parsing @samp{Date} headers
|
||
and manipulating time. (Not by using tesseracts, though, I'm sorry to
|
||
say.)
|
||
|
||
These functions convert between five formats: a date string, an Emacs
|
||
time structure, a decoded time list, a number of seconds, and a day number.
|
||
|
||
The functions have quite self-explanatory names, so the following just
|
||
gives an overview of which functions are available.
|
||
|
||
@findex parse-time-string
|
||
@findex date-to-time
|
||
@findex time-to-seconds
|
||
@findex seconds-to-time
|
||
@findex time-to-day
|
||
@findex days-to-time
|
||
@findex time-since
|
||
@findex time-less-p
|
||
@findex subtract-time
|
||
@findex days-between
|
||
@findex date-leap-year-p
|
||
@findex time-to-day-in-year
|
||
@example
|
||
(parse-time-string "Sat Sep 12 12:21:54 1998 +0200")
|
||
@result{} (54 21 12 12 9 1998 6 nil 7200)
|
||
|
||
(date-to-time "Sat Sep 12 12:21:54 1998 +0200")
|
||
@result{} (13818 19266)
|
||
|
||
(time-to-seconds '(13818 19266))
|
||
@result{} 905595714.0
|
||
|
||
(seconds-to-time 905595714.0)
|
||
@result{} (13818 19266 0)
|
||
|
||
(time-to-day '(13818 19266))
|
||
@result{} 729644
|
||
|
||
(days-to-time 729644)
|
||
@result{} (961933 65536)
|
||
|
||
(time-since '(13818 19266))
|
||
@result{} (0 430)
|
||
|
||
(time-less-p '(13818 19266) '(13818 19145))
|
||
@result{} nil
|
||
|
||
(subtract-time '(13818 19266) '(13818 19145))
|
||
@result{} (0 121)
|
||
|
||
(days-between "Sat Sep 12 12:21:54 1998 +0200"
|
||
"Sat Sep 07 12:21:54 1998 +0200")
|
||
@result{} 5
|
||
|
||
(date-leap-year-p 2000)
|
||
@result{} t
|
||
|
||
(time-to-day-in-year '(13818 19266))
|
||
@result{} 255
|
||
@end example
|
||
|
||
@findex safe-date-to-time
|
||
And finally, we have @code{safe-date-to-time}, which does the same as
|
||
@code{date-to-time}, but returns a zero time if the date is
|
||
syntactically malformed.
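
Header data from the network is often malformed, so code that only
needs a sortable number may prefer the safe variant. A minimal sketch
follows; @code{my-mime-date-seconds} is a hypothetical helper, and
@code{float-time} is the standard Emacs function for turning a time
value into seconds.

@lisp
(require 'time-date)

(defun my-mime-date-seconds (date)
  "Return DATE, a string, as seconds since the epoch.
A malformed DATE yields the zero time instead of an error."
  (float-time (safe-date-to-time date)))

;; A well-formed date, taken from the examples above.
(my-mime-date-seconds "Sat Sep 12 12:21:54 1998 +0200")

;; A malformed date does not signal an error.
(my-mime-date-seconds "not a date")
@end lisp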
|
||
|
||
|
||
|
||
@node qp
|
||
@section qp
|
||
|
||
This library deals with decoding and encoding Quoted-Printable text.
|
||
|
||
Very briefly explained, QP encoding means translating all 8-bit
|
||
characters (and lots of control characters) into things that look like
|
||
@samp{=EF}; that is, an equal sign followed by the byte encoded as a hex
|
||
string. It is defined in RFC 2045.
|
||
|
||
The following functions are defined by the library:
|
||
|
||
@deffn Command quoted-printable-decode-region @var{from} @var{to} &optional @var{coding-system}
|
||
QP-decode all the encoded text in the region. If @var{coding-system}
|
||
is non-@code{nil}, decode bytes into characters with that coding-system. It
|
||
is probably better not to use @var{coding-system}; instead decode into
|
||
a unibyte buffer, decode that appropriately and then interpret it as
|
||
multibyte.
|
||
@end deffn
|
||
|
||
@defun quoted-printable-decode-string @var{string} &optional @var{coding-system}
|
||
Return a QP-decoded copy of @var{string}. If @var{coding-system} is
|
||
non-@code{nil}, decode bytes into characters with that coding-system.
|
||
@end defun
|
||
|
||
@deffn Command quoted-printable-encode-region @var{from} @var{to} &optional @var{fold} @var{class}
|
||
QP-encode the whole region. If @var{fold} is non-@code{nil}, fold lines
|
||
at 76 characters, as required by the RFC. If @var{class} is
|
||
non-@code{nil}, translate the characters not matched by that regexp
|
||
class, which should be in the form expected by
|
||
@code{skip-chars-forward} and should probably not contain literal
|
||
eight-bit characters. Specifying @var{class} makes sense to do extra
|
||
encoding in header fields.
|
||
|
||
If the variable @code{mm-use-ultra-safe-encoding} is defined and
|
||
non-@code{nil}, fold lines unconditionally and encode @samp{From } and
|
||
@samp{-} at the start of lines.
|
||
@end deffn
|
||
|
||
@defun quoted-printable-encode-string string
|
||
Return a QP-encoded copy of @var{string}.
|
||
@end defun
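
The string functions make a quick round trip easy to try. The sketch
below uses an @sc{ascii}-only sample, so the only character that needs
encoding is the equals sign itself (hex 3D).

@lisp
(require 'qp)

(let* ((raw "1+1=2")
       (encoded (quoted-printable-encode-string raw)))
  ;; ENCODED spells the equals sign as =3D; decoding restores RAW.
  (list encoded
        (quoted-printable-decode-string encoded)))
@end lisp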
|
||
|
||
@node base64
|
||
@section base64
|
||
@cindex base64
|
||
|
||
Base64 is an encoding that encodes three bytes into four characters,
|
||
thereby increasing the size by about 33%. The alphabet used for
|
||
encoding is very resistant to mangling during transit. @xref{Base
|
||
64,,Base 64 Encoding, elisp, The Emacs Lisp Reference Manual}.
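
The three-to-four expansion is easy to see with the built-in
@code{base64-encode-string} and @code{base64-decode-string} functions
described in that manual:

@example
;; Three input bytes become four output characters.
(base64-encode-string "abc")
@result{} "YWJj"

;; Input that is not a multiple of three bytes is padded with "=".
(base64-encode-string "ab")
@result{} "YWI="

(base64-decode-string "YWJj")
@result{} "abc"
@end example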
|
||
|
||
@node binhex
|
||
@section binhex
|
||
@cindex binhex
|
||
@cindex Apple
|
||
@cindex Macintosh
|
||
|
||
Binhex is an encoding that originated in Macintosh environments.
|
||
The following function is supplied to deal with these:
|
||
|
||
@defun binhex-decode-region start end &optional header-only
|
||
Decode the encoded text in the region @var{start} to @var{end}. If
|
||
@var{header-only} is non-@code{nil}, only decode the @samp{binhex}
|
||
header and return the file name.
|
||
@end defun
|
||
|
||
|
||
@node uudecode
|
||
@section uudecode
|
||
@cindex uuencode
|
||
@cindex uudecode
|
||
|
||
Uuencoding is probably still the most popular encoding of binaries
|
||
used on Usenet, although Base64 rules the mail world.
|
||
|
||
The following function is supplied by this package:
|
||
|
||
@defun uudecode-decode-region start end &optional file-name
|
||
Decode the text in the region @var{start} to @var{end}. If
|
||
@var{file-name} is non-@code{nil}, save the result to @var{file-name}.
|
||
@end defun
|
||
|
||
|
||
@node rfc1843
|
||
@section rfc1843
|
||
@cindex rfc1843
|
||
@cindex HZ
|
||
@cindex Chinese
|
||
|
||
RFC1843 deals with mixing Chinese and @sc{ascii} characters in messages. In
|
||
essence, RFC1843 switches between @sc{ascii} and Chinese by doing this:
|
||
|
||
@example
|
||
This sentence is in ASCII.
|
||
The next sentence is in GB.~@{<:Ky2;S@{#,NpJ)l6HK!#~@}Bye.
|
||
@end example
|
||
|
||
Simple enough, and widely used in China.
|
||
|
||
The following functions are available to handle this encoding:
|
||
|
||
@defun rfc1843-decode-region start end
|
||
Decode HZ-encoded text in the region @var{start} to @var{end}.
|
||
@end defun
|
||
|
||
@defun rfc1843-decode-string string
|
||
Decode the HZ-encoded @var{string} and return the result.
|
||
@end defun
|
||
|
||
|
||
@node mailcap
|
||
@section mailcap
|
||
|
||
As specified by RFC 1524, @sc{mime}-aware message handlers parse
|
||
@dfn{mailcap} files from a default list, which can be overridden by the
|
||
@code{MAILCAPS} environment variable. These describe how elements are
|
||
supposed to be displayed. Here's an example file:
|
||
|
||
@example
|
||
image/*; gimp -8 %s
|
||
audio/wav; wavplayer %s
|
||
@end example
|
||
|
||
This says that all image files should be displayed with @command{gimp},
|
||
and that WAVE audio files should be played by @command{wavplayer}.
|
||
|
||
The @code{mailcap} library parses such files, and provides functions for
|
||
matching types.
|
||
|
||
@defvar mailcap-mime-data
|
||
This variable is an alist of alists containing backup viewing rules for
|
||
@sc{mime} types. These are overridden by rules for a type found in
|
||
mailcap files. The outer alist is keyed on the major content-type and
|
||
the inner alists are keyed on the minor content-type (which can be a
|
||
regular expression).
|
||
|
||
@c Fixme: document this properly!
|
||
For example:
|
||
@example
|
||
(("application"
|
||
("octet-stream"
|
||
(viewer . mailcap-save-binary-file)
|
||
(non-viewer . t)
|
||
(type . "application/octet-stream"))
|
||
("plain"
|
||
(viewer . view-mode)
|
||
(test fboundp 'view-mode)
|
||
(type . "text/plain")))
|
||
@end example
|
||
@end defvar
|
||
|
||
@defopt mailcap-default-mime-data
|
||
This variable is the default value of @code{mailcap-mime-data}. It
|
||
exists to allow setting the value using Custom. It is merged with
|
||
values from mailcap files by @code{mailcap-parse-mailcaps}.
|
||
@end defopt
|
||
|
||
Although it is not specified by the RFC, @sc{mime} tools normally use a
|
||
common means of associating file extensions with default @sc{mime} types
|
||
in the absence of other information about the type of a file. The
|
||
information is found in per-user files @file{~/.mime.types} and system
|
||
@file{mime.types} files found in quasi-standard places. Here is an
|
||
example:
|
||
|
||
@example
|
||
application/x-dvi dvi
|
||
audio/mpeg mpga mpega mp2 mp3
|
||
image/jpeg jpeg jpg jpe
|
||
@end example
|
||
|
||
|
||
@defvar mailcap-mime-extensions
|
||
This variable is an alist of @sc{mime} types keyed by file extensions.
|
||
This is overridden by entries found in @file{mime.types} files.
|
||
@end defvar
|
||
|
||
@defopt mailcap-default-mime-extensions
|
||
This variable is the default value of @code{mailcap-mime-extensions}.
|
||
It exists to allow setting the value using Custom. It is merged with
|
||
values from @file{mime.types} files by @code{mailcap-parse-mimetypes}.
|
||
@end defopt
|
||
|
||
Interface functions:
|
||
|
||
@defun mailcap-parse-mailcaps &optional path force
|
||
Parse all the mailcap files specified in a path string @var{path} and
|
||
merge them with the values from @code{mailcap-mime-data}. Components of
|
||
@var{path} are separated by the @code{path-separator} character
|
||
appropriate for the system. If @var{force} is non-@code{nil}, the files
|
||
are re-parsed even if they have been parsed already. If @var{path} is
|
||
omitted, use the value of environment variable @code{MAILCAPS} if it is
|
||
set; otherwise (on GNU and Unix) use the path defined in RFC 1524, plus
|
||
@file{/usr/local/etc/mailcap}.
|
||
@end defun
|
||
|
||
@defun mailcap-parse-mimetypes &optional path force
|
||
Parse all the mimetypes specified in a path string @var{path}
|
||
and merge them with the values from @code{mailcap-mime-extensions}.
|
||
Components of @var{path} are separated by the @code{path-separator}
|
||
character appropriate for the system. If @var{path} is omitted, use the
|
||
value of environment variable @code{MIMETYPES} if set; otherwise use a
|
||
default path consistent with that used by @code{mailcap-parse-mailcaps}.
|
||
If @var{force} is non-@code{nil}, the files are re-parsed even if they
|
||
have been parsed already.
|
||
@end defun
|
||
|
||
@defun mailcap-mime-info string &optional request
|
||
Gets the viewer command for content-type @var{string}. @code{nil} is
|
||
returned if none is found. Expects @var{string} to be a complete
|
||
content-type header line.
|
||
|
||
If @var{request} is non-@code{nil} it specifies what information to
|
||
return. If it is @code{nil} or the empty string, the viewer (second field of
|
||
the mailcap entry) will be returned. If it is a string, then the
|
||
mailcap field corresponding to that string will be returned
|
||
(@samp{print}, @samp{description}, whatever). If it is a number, all
|
||
the information for this viewer is returned. If it is @code{all}, then
|
||
all possible viewers for this type are returned.
|
||
@end defun
|
||
|
||
@defun mailcap-mime-types
|
||
This function returns a list of all the defined media types.
|
||
@end defun
|
||
|
||
@defun mailcap-extension-to-mime extension
|
||
This function returns the content type defined for a file with the given
|
||
@var{extension}.
|
||
@end defun
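
Put together, a typical lookup goes from file extension to media type
to viewer. The following is a sketch only: what it returns depends
entirely on the mailcap and @file{mime.types} files found on the
system, and the viewer may well be @code{nil}.

@lisp
(require 'mailcap)

;; Load the system and per-user configuration on top of the defaults.
(mailcap-parse-mailcaps)
(mailcap-parse-mimetypes)

(let ((type (mailcap-extension-to-mime ".jpg")))
  ;; TYPE is a media type string, or nil if the extension is unknown.
  (list type
        (and type (mailcap-mime-info type))))
@end lisp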
|
||
|
||
|
||
@node Decoding and Viewing
|
||
@chapter Decoding and Viewing
|
||
|
||
This chapter deals with decoding and viewing @sc{mime} messages on a
|
||
higher level.
|
||
|
||
The main idea is to first analyze a @sc{mime} article, and then allow
|
||
other programs to do things based on the list of @dfn{handles} that are
|
||
returned as a result of this analysis.
|
||
|
||
@menu
|
||
* Dissection:: Analyzing a @sc{mime} message.
|
||
* Handles:: Handle manipulations.
|
||
* Display:: Displaying handles.
|
||
* Customization:: Variables that affect display.
|
||
* New Viewers:: How to write your own viewers.
|
||
@end menu
|
||
|
||
|
||
@node Dissection
|
||
@section Dissection
|
||
|
||
@code{mm-dissect-buffer} is the function responsible for dissecting
|
||
a @sc{mime} article. If given a multipart message, it will recursively
|
||
descend the message, following the structure, and return a tree of
|
||
@sc{mime} handles that describes the structure of the message.
|
||
|
||
|
||
@node Handles
|
||
@section Handles
|
||
|
||
A @sc{mime} handle is a list that fully describes a @sc{mime} component.
|
||
|
||
The following macros can be used to access elements from the
|
||
@var{handle} argument:
|
||
|
||
@defmac mm-handle-buffer handle
|
||
Return the buffer that holds the contents of the undecoded @sc{mime}
|
||
part.
|
||
@end defmac
|
||
|
||
@defmac mm-handle-type handle
|
||
Return the parsed @samp{Content-Type} of the part.
|
||
@end defmac
|
||
|
||
@defmac mm-handle-encoding handle
|
||
Return the @samp{Content-Transfer-Encoding} of the part.
|
||
@end defmac
|
||
|
||
@defmac mm-handle-undisplayer handle
|
||
Return the function that can be used to remove the displayed part (if it
|
||
has been displayed).
|
||
@end defmac
|
||
|
||
@defmac mm-handle-set-undisplayer handle function
|
||
Set the undisplayer function for the part to @var{function}.
|
||
@end defmac
|
||
|
||
@defmac mm-handle-disposition handle
|
||
Return the parsed @samp{Content-Disposition} of the part.
|
||
@end defmac
|
||
|
||
@defmac mm-handle-description handle
|
||
Return the description of the part.
|
||
@end defmac
|
||
|
||
@defmac mm-get-content-id id
|
||
Returns the handle(s) referred to by @var{id}, the @samp{Content-ID} of
|
||
the part.
|
||
@end defmac
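
As a sketch of how dissection and the accessors above combine, the
following dissects a small single-part message and reads two fields
from its handle. @code{my-mime-describe-buffer} is a hypothetical
helper. Two assumptions are not documented in this manual: the
non-@code{nil} argument to @code{mm-dissect-buffer}, which asks for a
handle even when strict @sc{mime} rules would not produce one, and the
fact that a multipart message yields a nested structure to which these
accessors do not apply directly.

@lisp
(require 'mm-decode)

(defun my-mime-describe-buffer ()
  "Return (TYPE ENCODING) for the single-part message in this buffer."
  (let ((handle (mm-dissect-buffer t)))
    (unwind-protect
        (list (car (mm-handle-type handle))
              (mm-handle-encoding handle))
      ;; Release the buffer that holds the undecoded part.
      (mm-destroy-part handle))))

(with-temp-buffer
  (insert "MIME-Version: 1.0\n"
          "Content-Type: text/plain; charset=us-ascii\n"
          "Content-Transfer-Encoding: 7bit\n"
          "\n"
          "Hello, world.\n")
  (my-mime-describe-buffer))
@end lisp

Wrapping the accessors in @code{unwind-protect} ensures the resources
behind the handle are freed even if reading a field signals an error.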
|
||
|
||
|
||
@node Display
|
||
@section Display
|
||
|
||
Functions for displaying, removing and saving. In the descriptions
|
||
below, ``the part'' means the @sc{mime} part represented by the
|
||
@var{handle} argument.
|
||
|
||
@defun mm-display-part handle &optional no-default
|
||
Display the part. Return @code{nil} if the part is removed,
|
||
@code{inline} if it is displayed inline or @code{external} if it is
|
||
displayed externally. If @var{no-default} is non-@code{nil}, the part
|
||
is not displayed unless the @sc{mime} type of @var{handle} is defined to
|
||
be displayed inline or there is a display method defined for it; i.e.@:
|
||
no default external method will be used.
|
||
@end defun
|
||
|
||
@defun mm-remove-part handle
|
||
Remove the part if it has been displayed.
|
||
@end defun
|
||
|
||
@defun mm-inlinable-p handle
|
||
Return non-@code{nil} if the part can be displayed inline.
|
||
@end defun
|
||
|
||
@defun mm-automatic-display-p handle
|
||
Return non-@code{nil} if the user has requested automatic display of the
|
||
@sc{mime} type of the part.
|
||
@end defun
|
||
|
||
@defun mm-destroy-part handle
|
||
Free all the resources used by the part.
|
||
@end defun
|
||
|
||
@defun mm-save-part handle
|
||
Save the part to a file. The user is prompted for a file name to use.
|
||
@end defun
|
||
|
||
@defun mm-pipe-part handle
|
||
Pipe the part through a shell command. The user is prompted for the
|
||
command to use.
|
||
@end defun
|
||
|
||
@defun mm-interactively-view-part handle
|
||
Prompt for a mailcap method to use to view the part and display it
|
||
externally using that method.
|
||
@end defun
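
A caller will often choose between these functions based on what the
part supports. The following minimal sketch shows one possible policy;
@code{my-mime-show-or-save} is a hypothetical helper, and a real mail
reader would likely consult the user's preferences as well.

@lisp
(require 'mm-decode)

(defun my-mime-show-or-save (handle)
  "Display the part for HANDLE inline if possible, else save it."
  (if (mm-inlinable-p handle)
      (mm-display-part handle)
    ;; Not inlinable: prompt for a file name and save the part.
    (mm-save-part handle)))
@end lisp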
|
||
|
||
|
||
@node Customization
|
||
@section Customization
|
||
|
||
The display of @sc{mime} types may be customized with the following
|
||
options.
|
||
|
||
@defopt mm-inline-media-tests
|
||
This is an alist where the key is a @sc{mime} type, the second element
|
||
is a function to display the part @dfn{inline} (i.e., inside Emacs), and
|
||
the third element is a form to be @code{eval}ed to say whether the part
|
||
can be displayed inline.
|
||
|
||
This variable specifies whether a part @emph{can} be displayed inline,
|
||
and, if so, how to do it. It does not say whether parts are
|
||
@emph{actually} displayed inline.
|
||
@end defopt
|
||
|
||
@defopt mm-inlined-types
|
||
This, on the other hand, says what types are to be displayed inline, if
|
||
they satisfy the conditions set by the variable above. It's a list of
|
||
@sc{mime} media types.
|
||
@end defopt
|
||
|
||
@defopt mm-automatic-display
|
||
This is a list of types that are to be displayed ``automatically'', but
|
||
only if the above variable allows it. That is, only inlinable parts can
|
||
be displayed automatically.
|
||
@end defopt
|
||
|
||
@defopt mm-attachment-override-types
|
||
Some @sc{mime} agents create parts that have a content-disposition of
|
||
@samp{attachment}. This variable allows overriding that disposition and
|
||
displaying the part inline. (Note that the disposition is only
|
||
overridden if we are able to, and want to, display the part inline.)
|
||
@end defopt
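
The interplay of the options above might be sketched as follows. The
values are illustrative assumptions chosen for this example, not the
library defaults.

@lisp
(require 'mm-decode)

;; Types that may be shown inline, provided the tests in
;; `mm-inline-media-tests' say that they can be.
(setq mm-inlined-types '("text/.*" "image/.*" "message/rfc822"))

;; Of the inlinable types, show only these without being asked.
(setq mm-automatic-display '("text/plain" "text/enriched"))

;; Show vCards inline even when they are marked as attachments.
(setq mm-attachment-override-types '("text/x-vcard"))
@end lisp

With these settings an @samp{image/png} part can still be displayed
inline on request, but it is not displayed automatically.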
|
||
|
||
@defopt mm-discouraged-alternatives
|
||
List of @sc{mime} types that are discouraged when viewing
|
||
@samp{multipart/alternative}. Viewing agents are supposed to view the
|
||
last possible part of a message, as that is supposed to be the richest.
|
||
However, users may prefer other types instead, and this list says what
|
||
types are most unwanted. If, for instance, @samp{text/html} parts are
|
||
very unwanted, and @samp{text/richtext} parts are somewhat unwanted,
|
||
then the value of this variable should be set to:
|
||
|
||
@lisp
|
||
("text/html" "text/richtext")
|
||
@end lisp
|
||
@end defopt
|
||
|
||
@defopt mm-inline-large-images
|
||
When displaying inline images that are larger than the window, XEmacs
|
||
does not enable scrolling, which means that you cannot see the whole
|
||
image. To prevent this, the library tries to determine the image size
|
||
before displaying it inline, and if it doesn't fit the window, the
|
||
library will display it externally (e.g.@: with @samp{ImageMagick} or
|
||
@samp{xv}). Setting this variable to @code{t} disables this check and
|
||
makes the library display all inline images as inline, regardless of
|
||
their size.
|
||
@end defopt
|
||
|
||
@defopt mm-inline-override-types
|
||
@code{mm-inlined-types} may include regular expressions, for example to
|
||
specify that all @samp{text/.*} parts be displayed inline. If a user
|
||
prefers to have a type that matches such a regular expression be treated
|
||
as an attachment, that can be accomplished by setting this variable to a
|
||
list containing that type. For example assuming @code{mm-inlined-types}
|
||
includes @samp{text/.*}, then including @samp{text/html} in this
|
||
variable will cause @samp{text/html} parts to be treated as attachments.
|
||
@end defopt
|
||
|
||
|
||
@node New Viewers
|
||
@section New Viewers
|
||
|
||
Here's an example viewer for displaying @samp{text/enriched} inline:
|
||
|
||
@lisp
|
||
(defun mm-display-enriched-inline (handle)
|
||
(let (text)
|
||
(with-temp-buffer
|
||
(mm-insert-part handle)
|
||
(save-window-excursion
|
||
(enriched-decode (point-min) (point-max))
|
||
(setq text (buffer-string))))
|
||
(mm-insert-inline handle text)))
|
||
@end lisp
|
||
|
||
We see that the function takes a @sc{mime} handle as its parameter. It
|
||
then goes to a temporary buffer, inserts the text of the part, does some
|
||
work on the text, stores the result, goes back to the buffer it was
|
||
called from and inserts the result.
|
||
|
||
The two important helper functions here are @code{mm-insert-part} and
|
||
@code{mm-insert-inline}. The first function inserts the text of the
|
||
handle in the current buffer. It handles charset and/or content
|
||
transfer decoding. The second function just inserts whatever text you
|
||
tell it to insert, but it also sets things up so that the text can be
|
||
``undisplayed'' in a convenient manner.
|
||
|
||
|
||
@node Composing
|
||
@chapter Composing
|
||
@cindex Composing
|
||
@cindex MIME Composing
|
||
@cindex MML
|
||
@cindex MIME Meta Language
|
||
|
||
Creating a @sc{mime} message is boring and non-trivial. Therefore, a
|
||
library called @code{mml} has been defined that parses a language called
|
||
MML (@sc{mime} Meta Language) and generates @sc{mime} messages.
|
||
|
||
@findex mml-generate-mime
|
||
The main interface function is @code{mml-generate-mime}. It will
|
||
examine the contents of the current (narrowed-to) buffer and return a
|
||
string containing the @sc{mime} message.
|
||
|
||
@menu
|
||
* Simple MML Example:: An example MML document.
|
||
* MML Definition:: All valid MML elements.
|
||
* Advanced MML Example:: Another example MML document.
|
||
* Charset Translation:: How charsets are mapped from Mule to MIME.
|
||
* Conversion:: Going from @sc{mime} to MML and vice versa.
|
||
@end menu
|
||
|
||
|
||
@node Simple MML Example
|
||
@section Simple MML Example
|
||
|
||
Here's a simple @samp{multipart/alternative}:
|
||
|
||
@example
|
||
<#multipart type=alternative>
|
||
This is a plain text part.
|
||
<#part type=text/enriched>
|
||
<center>This is a centered enriched part</center>
|
||
<#/multipart>
|
||
@end example
|
||
|
||
After running this through @code{mml-generate-mime}, we get this:
|
||
|
||
@example
|
||
Content-Type: multipart/alternative; boundary="=-=-="
|
||
|
||
|
||
--=-=-=
|
||
|
||
|
||
This is a plain text part.
|
||
|
||
--=-=-=
|
||
Content-Type: text/enriched
|
||
|
||
|
||
<center>This is a centered enriched part</center>
|
||
|
||
--=-=-=--
|
||
@end example
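
The conversion shown above can also be driven from Lisp. The sketch
below assumes that the @code{mml} library is available;
@code{my-mml-example} is a hypothetical name used only for
illustration. The exact headers and boundary string in the result may
differ between versions.

@lisp
(require 'mml)

(defun my-mml-example ()
  "Return the MIME text generated from a small MML document."
  (with-temp-buffer
    (insert "<#multipart type=alternative>\n"
            "This is a plain text part.\n"
            "<#part type=text/enriched>\n"
            "<center>This is a centered enriched part</center>\n"
            "<#/multipart>\n")
    ;; `mml-generate-mime' parses the current buffer and returns
    ;; the generated message as a string.
    (mml-generate-mime)))
@end lisp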
|
||
|
||
|
||
@node MML Definition
|
||
@section MML Definition
|
||
|
||
The MML language is very simple. It looks a bit like an SGML
|
||
application, but it's not.
|
||
|
||
The main concept of MML is the @dfn{part}. Each part can be of a
|
||
different type or use a different charset. The way to delineate a part
|
||
is with a @samp{<#part ...>} tag. Multipart parts can be introduced
|
||
with the @samp{<#multipart ...>} tag. Parts are ended by the
|
||
@samp{<#/part>} or @samp{<#/multipart>} tags. Parts started with the
|
||
@samp{<#part ...>} tags are also closed by the next open tag.
|
||
|
||
There's also the @samp{<#external ...>} tag, which introduces
|
||
@samp{message/external-body} parts.
|
||
|
||
Each tag can contain zero or more parameters of the form
|
||
@samp{parameter=value}. The values may be enclosed in quotation marks,
|
||
but that's not necessary unless the value contains white space. So
|
||
@samp{filename=/home/user/#hello$^yes} is perfectly valid.
|
||
|
||
The following parameters have meaning in MML; parameters that have no
|
||
meaning are ignored. The MML parameter names are the same as the
|
||
@sc{mime} parameter names; the things in the parentheses say which
|
||
header it will be used in.
|
||
|
||
@table @samp
|
||
@item type
|
||
The @sc{mime} type of the part (@samp{Content-Type}).
|
||
|
||
@item filename
|
||
Use the contents of the file in the body of the part
|
||
(@samp{Content-Disposition}).
|
||
|
||
@item charset
|
||
The contents of the body of the part are to be encoded in the character
|
||
set specified (@samp{Content-Type}).
|
||
|
||
@item name
|
||
Might be used to suggest a file name if the part is to be saved
|
||
to a file (@samp{Content-Type}).
|
||
|
||
@item disposition
|
||
Valid values are @samp{inline} and @samp{attachment}
|
||
(@samp{Content-Disposition}).
|
||
|
||
@item encoding
|
||
Valid values are @samp{7bit}, @samp{8bit}, @samp{quoted-printable} and
|
||
@samp{base64} (@samp{Content-Transfer-Encoding}).
|
||
|
||
@item description
|
||
A description of the part (@samp{Content-Description}).
|
||
|
||
@item creation-date
|
||
RFC822 date when the part was created (@samp{Content-Disposition}).
|
||
|
||
@item modification-date
|
||
RFC822 date when the part was modified (@samp{Content-Disposition}).
|
||
|
||
@item read-date
|
||
RFC822 date when the part was read (@samp{Content-Disposition}).
|
||
|
||
@item size
|
||
The size (in octets) of the part (@samp{Content-Disposition}).
|
||
|
||
@end table
|
||
|
||
Parameters for @samp{application/octet-stream}:
|
||
|
||
@table @samp
|
||
@item type
|
||
Type of the part; informal---meant for human readers
|
||
(@samp{Content-Type}).
|
||
@end table
|
||
|
||
Parameters for @samp{message/external-body}:
|
||
|
||
@table @samp
|
||
@item access-type
|
||
A word indicating the supported access mechanism by which the file may
|
||
be obtained. Values include @samp{ftp}, @samp{anon-ftp}, @samp{tftp},
|
||
@samp{localfile}, and @samp{mailserver}. (@samp{Content-Type}.)
|
||
|
||
@item expiration
|
||
The RFC822 date after which the file may no longer be fetched.
|
||
(@samp{Content-Type}.)
|
||
|
||
@item size
|
||
The size (in octets) of the file. (@samp{Content-Type}.)
|
||
|
||
@item permission
|
||
Valid values are @samp{read} and @samp{read-write}
|
||
(@samp{Content-Type}).
|
||
|
||
@end table
|
||
|
||
|
||
@node Advanced MML Example
|
||
@section Advanced MML Example
|
||
|
||
Here's a complex multipart message. It's a @samp{multipart/mixed} that
|
||
contains many parts, one of which is a @samp{multipart/alternative}.
|
||
|
||
@example
|
||
<#multipart type=mixed>
|
||
<#part type=image/jpeg filename=~/rms.jpg disposition=inline>
|
||
<#multipart type=alternative>
|
||
This is a plain text part.
|
||
<#part type=text/enriched name=enriched.txt>
|
||
<center>This is a centered enriched part</center>
|
||
<#/multipart>
|
||
This is a new plain text part.
|
||
<#part disposition=attachment>
|
||
This plain text part is an attachment.
|
||
<#/multipart>
|
||
@end example
|
||
|
||
And this is the resulting @sc{mime} message:
|
||
|
||
@example
|
||
Content-Type: multipart/mixed; boundary="=-=-="
|
||
|
||
|
||
--=-=-=
|
||
|
||
|
||
|
||
--=-=-=
|
||
Content-Type: image/jpeg;
|
||
filename="~/rms.jpg"
|
||
Content-Disposition: inline;
|
||
filename="~/rms.jpg"
|
||
Content-Transfer-Encoding: base64
|
||
|
||
/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRof
|
||
Hh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/wAALCAAwADABAREA/8QAHwAA
|
||
AQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAAAgEDAwIEAwUFBAQAAAF9AQIDAAQR
|
||
BRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkKFhcYGRolJicoKSo0NTY3ODk6Q0RF
|
||
RkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWGh4iJipKTlJWWl5iZmqKjpKWmp6ip
|
||
qrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl5ufo6erx8vP09fb3+Pn6/9oACAEB
|
||
AAA/AO/rifFHjldNuGsrDa0qcSSHkA+gHrXKw+LtWLrMb+RgTyhbr+HSug07xNqV9fQtZrNI
|
||
AyiaE/NuBPOOOP0rvRNE880KOC8TbXXGCv1FPqjrF4LDR7u5L7SkTFT/ALWOP1xXgTuXfc7E
|
||
sx6nua6rwp4IvvEM8chCxWxOdzn7wz6V9AaB4S07w9p5itow0rDLSY5Pt9K43xO66P4xs71m
|
||
2QXiGCbA4yOVJ9+1aYORkdK434lyNH4ahCnG66VT9Nj15JFbPdX0MS43M4VQf5/yr2vSpLnw
|
||
5ZW8dlCZ8KFXjOPX0/mK6rSPEGt3Angu44fNEReHYNvIH3TzXDeKNO8RX+kSX2ouZkicTIOc
|
||
L+g7E810ulFjpVtv3bwgB3HJyK5L4quY/C9sVxk3ij/xx6850u7t1mtp/wDlpEw3An3Jr3Dw
|
||
34gsbWza4nBlhC5LDsaW6+IFgupQyCF3iHH7gA7c9R9ay7zx6t7aX9jHC4smhfBkGCvHGfrm
|
||
tLQ7hbnRrV1GPkAP1x1/Hr+Ncr8Vzjwrbf8AX6v/AKA9eQRyYlQk8Yx9K6XTNbkgia2ciSIn
|
||
7p5Ga9Atte0LTLKO6it4i7dVRFJDcZ4PvXN+JvEMF9bILVGXJLSZ4zkjivRPDaeX4b08HOTC
|
||
pOffmua+KkbS+GLVUGT9tT/0B68eeIpIFYjB70+OOVXyoOM9+M1eaWeCLzHPyHGO/NVWvJJm
|
||
jQ8KGH1NfQWhXSXmh2c8eArRLwO3HSv/2Q==
|
||
|
||
--=-=-=
|
||
Content-Type: multipart/alternative; boundary="==-=-="
|
||
|
||
|
||
--==-=-=
|
||
|
||
|
||
This is a plain text part.
|
||
|
||
--==-=-=
|
||
Content-Type: text/enriched;
|
||
name="enriched.txt"
|
||
|
||
|
||
<center>This is a centered enriched part</center>
|
||
|
||
--==-=-=--
|
||
|
||
--=-=-=
|
||
|
||
This is a new plain text part.
|
||
|
||
--=-=-=
|
||
Content-Disposition: attachment
|
||
|
||
|
||
This plain text part is an attachment.
|
||
|
||
--=-=-=--
|
||
@end example
|
||
|
||
@node Charset Translation
|
||
@section Charset Translation
|
||
@cindex charsets
|
||
|
||
During translation from MML to @sc{mime}, for each @sc{mime} part which
|
||
has been composed inside Emacs, an appropriate @sc{mime} charset has to
|
||
be chosen.
|
||
|
||
@vindex mail-parse-charset
|
||
@cindex unibyte Emacs
|
||
If you are running a non-Mule XEmacs, or Emacs in unibyte
|
||
mode@footnote{Deprecated!}, this process is simple: if the part
|
||
contains any non-@sc{ascii} (8-bit) characters, the @sc{mime} charset
|
||
given by @code{mail-parse-charset} (a symbol) is used. (Never set this
|
||
variable directly, though. If you want to change the default charset,
|
||
please consult the documentation of the package which you use to process
|
||
@sc{mime} messages. @xref{Various Message Variables, , Various Message
|
||
Variables, message, Message Manual}, for example.) If there are only
|
||
@sc{ascii} characters, the @sc{mime} charset @samp{US-ASCII} is used, of
|
||
course.
|
||
|
||
@cindex multibyte Emacs
|
||
@cindex @code{mime-charset} property
|
||
In a normal (multibyte) Emacs session, a list of coding systems is
|
||
derived that can encode the message part's content and correspond to
|
||
MIME charsets (according to their @code{mime-charset} property). This
|
||
list is ordered according to the normal priority rules and the highest priority
|
||
one is chosen to encode the part. If no such coding system can encode
|
||
the part's contents, they are split into several parts such that each
|
||
can be encoded with an appropriate coding system/@sc{mime}
|
||
charset.@footnote{The part can only be split at line boundaries,
|
||
though---if more than one @sc{mime} charset is required to encode a
|
||
single line, it is not possible to encode the part.} Note that this
|
||
procedure works with any correctly-defined coding systems, not just
|
||
built-in ones. Given a suitably-defined UTF-8 coding system---one
|
||
capable of encoding the Emacs charsets you use---it is not normally
|
||
necessary to split a part by charset.
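
To see which @sc{mime} charsets the library would consider for a piece
of text, one can call @code{mm-find-mime-charset-region} from
@file{mm-util.el}. This is only a sketch: the helper name
@code{my-mime-charsets-for} is hypothetical, and the result depends on
the Emacs version and on the coding system priorities in effect.

@lisp
(require 'mm-util)

(defun my-mime-charsets-for (string)
  "Return the list of MIME charsets found for STRING."
  (with-temp-buffer
    (insert string)
    ;; Examine the whole temporary buffer.
    (mm-find-mime-charset-region (point-min) (point-max))))
@end lisp

A result with more than one element suggests that the text would have
to be split into several parts, as described above.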
|
||
|
||
@vindex mm-mime-mule-charset-alist
|
||
@cindex XEmacs/Mule
|
||
It isn't possible to do this properly in XEmacs/Mule. Instead, a list
|
||
of the Mule charsets used in the part is obtained, and the
|
||
corresponding @sc{mime} charsets are determined by lookup in
|
||
@code{mm-mime-mule-charset-alist}. If the list elements all
|
||
correspond to a single @sc{mime} charset, that is used to encode the
|
||
part. Otherwise, the part is split as above.
|
||
|
||
@node Conversion
|
||
@section Conversion
|
||
|
||
@findex mime-to-mml
|
||
A (multipart) @sc{mime} message can be converted to MML with the
|
||
@code{mime-to-mml} function. It works on the message in the current
|
||
buffer, and substitutes MML markup for @sc{mime} boundaries.
|
||
Non-textual parts do not have their contents in the buffer, but instead
|
||
have the contents in separate buffers that are referred to from the MML
|
||
tags.
|
||
|
||
@findex mml-to-mime
|
||
An MML message can be converted back to @sc{mime} by the
|
||
@code{mml-to-mime} function.
|
||
|
||
These functions are in certain senses ``lossy''---you will not get back
|
||
an identical message if you run @code{mime-to-mml} and then
|
||
@code{mml-to-mime}. Not only will trivial things like the order of the
|
||
headers differ, but the contents of the headers may also be different.
|
||
For instance, the original message may use base64 encoding on text,
|
||
while @code{mml-to-mime} may decide to use quoted-printable encoding, and
|
||
so on.
|
||
|
||
In essence, however, these two functions should be the inverse of each
|
||
other. The resulting contents of the message should remain equivalent,
|
||
if not identical.
|
||
|
||
|
||
@node Standards
|
||
@chapter Standards
|
||
|
||
The Emacs @sc{mime} library implements handling of various elements
|
||
according to a (somewhat) large number of RFCs, drafts and standards
|
||
documents. This chapter lists the relevant ones. They can all be
|
||
fetched from @samp{http://quimby.gnus.org/notes/}.
|
||
|
||
@table @dfn
|
||
@item RFC822
|
||
@itemx STD11
|
||
Standard for the Format of ARPA Internet Text Messages.
|
||
|
||
@item RFC1036
|
||
Standard for Interchange of USENET Messages
|
||
|
||
@item RFC1524
|
||
A User Agent Configuration Mechanism For Multimedia Mail Format
|
||
Information
|
||
|
||
@item RFC2045
|
||
Format of Internet Message Bodies
|
||
|
||
@item RFC2046
|
||
Media Types
|
||
|
||
@item RFC2047
|
||
Message Header Extensions for Non-ASCII Text
|
||
|
||
@item RFC2048
|
||
Registration Procedures
|
||
|
||
@item RFC2049
|
||
Conformance Criteria and Examples
|
||
|
||
@item RFC2231
|
||
MIME Parameter Value and Encoded Word Extensions: Character Sets,
|
||
Languages, and Continuations
|
||
|
||
@item RFC1843
|
||
HZ - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and
|
||
ASCII characters
|
||
|
||
@item draft-ietf-drums-msg-fmt-05.txt
|
||
Draft for the successor of RFC822
|
||
|
||
@item RFC2112
|
||
The MIME Multipart/Related Content-type
|
||
|
||
@item RFC1892
|
||
The Multipart/Report Content Type for the Reporting of Mail System
|
||
Administrative Messages
|
||
|
||
@item RFC2183
|
||
Communicating Presentation Information in Internet Messages: The
|
||
Content-Disposition Header Field
|
||
|
||
@end table
|
||
|
||
|
||
@node Index
|
||
@chapter Index
|
||
@printindex cp
|
||
@printindex fn
|
||
|
||
@summarycontents
|
||
@contents
|
||
@bye
|
||
|
||
@c End:
|