Converting HTML and MIME to plain text mail
Roger H. Goun
roger at bcah.com
Tue Oct 7 15:10:17 EDT 2008
On Tue, Oct 7, 2008 at 12:08 PM, Ben Scott <dragonhawk at gmail.com> wrote:
> * Decode BASE64 or quoted-printable to 7-bit clean plain text
This should be "decode to 8-bit clean plain text".
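A minimal sketch of that step in Python, using only the standard library. The function name decode_body is made up for illustration. The point is that undoing the transfer encoding gives you back whatever bytes the sender wrote, which may well be 8-bit; reducing them to ASCII is a separate, later step.

```python
import base64
import quopri

def decode_body(raw: bytes, encoding: str) -> bytes:
    """Undo a Content-Transfer-Encoding and return the raw 8-bit bytes.

    decode_body is a hypothetical helper name, not part of any library.
    """
    enc = encoding.strip().lower()
    if enc == "base64":
        return base64.b64decode(raw)
    if enc == "quoted-printable":
        return quopri.decodestring(raw)
    # 7bit, 8bit, and binary bodies need no decoding.
    return raw
```

The result is bytes, not text: it still has to be interpreted using the charset named in the part's Content-Type header.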
> * Replace any common Unicode characters with ASCII equivalents
> * Replace unhandled non-ASCII characters with an ASCII text representation
These could be merged into:
* Replace non-ASCII characters with an ASCII text representation
But you'll want some default ASCII replacement character such as
space, so you don't have to try to represent everything in the Unicode
charset with ASCII art.
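Roughly what I have in mind, sketched in Python. The COMMON table and the to_ascii name are my own placeholders; a real table would be longer. Anything not in the table gets its accents stripped if possible, and otherwise falls back to the default replacement character.

```python
import unicodedata

# Hypothetical table of "common" characters; extend to taste.
COMMON = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "--",  # en and em dashes
    "\u2026": "...",                # ellipsis
}

def to_ascii(text: str, default: str = " ") -> str:
    """Replace every non-ASCII character with an ASCII representation."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in COMMON:
            out.append(COMMON[ch])
        else:
            # Decompose (e.g. e-acute into e plus a combining accent) and
            # keep only the ASCII pieces; fall back to the default if
            # nothing survives.
            base = unicodedata.normalize("NFKD", ch)
            base = base.encode("ascii", "ignore").decode("ascii")
            out.append(base or default)
    return "".join(out)
```

With the default of a space, a character with no sensible ASCII form simply becomes a gap in the text rather than an attempt at ASCII art.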
Insert this next:
* Strip all remaining MIME parts that are not text/*.
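A sketch of that filter using Python's standard email package. The text_parts name and the SAMPLE message are invented for the example; the sketch only selects the text/* leaf parts and leaves rebuilding the message to the caller.

```python
from email import message_from_string
from email.message import Message

def text_parts(msg: Message) -> list:
    """Return the leaf parts of msg whose content type is text/*.

    text_parts is a hypothetical helper name for illustration.
    """
    return [
        part
        for part in msg.walk()
        if not part.is_multipart() and part.get_content_maintype() == "text"
    ]

# Made-up sample: one text/plain part and one image/png attachment.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/plain; charset="us-ascii"

hello
--XX
Content-Type: image/png
Content-Transfer-Encoding: base64

iVBORw0KGgo=
--XX--
"""

kept = text_parts(message_from_string(SAMPLE))
```

Here kept holds only the text/plain part; the image attachment is dropped.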
> * When a plain text body alternative is provided, strip any other body
> alternatives
> * Render HTML to plain text, when only an HTML body is provided
Without actually bothering to check, I think the MIME RFCs also
specify text/rtf as a valid content type. It's rarely seen on the
tubes, so perhaps you want to ignore it for v1.
> * Strip any remaining MIME headers
You probably want the mail to remain a valid MIME message, just in
case the user ever upgrades her MUA. So keep:
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
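In Python's standard email package that might look like the sketch below. The function name normalize_mime_headers is made up, and I'm assuming the body has already been reduced to plain ASCII text by the earlier steps.

```python
from email import message_from_string
from email.message import Message

# The three headers to keep, so the result is still a valid MIME message.
KEEP = [
    ("MIME-Version", "1.0"),
    ("Content-Type", 'text/plain; charset="us-ascii"'),
    ("Content-Transfer-Encoding", "7bit"),
]

def normalize_mime_headers(msg: Message, ascii_body: str) -> Message:
    """Drop all MIME headers, then add back the minimal plain-text set.

    normalize_mime_headers is a hypothetical name; ascii_body is assumed
    to be 7-bit clean already.
    """
    for name in list(msg.keys()):
        lowered = name.lower()
        if lowered == "mime-version" or lowered.startswith("content-"):
            del msg[name]
    for name, value in KEEP:
        msg[name] = value
    msg.set_payload(ascii_body)
    return msg

# Made-up input message for illustration.
original = message_from_string(
    "Subject: hi\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/html; charset="utf-8"\n'
    "Content-Disposition: inline\n"
    "\n"
    "<p>x</p>\n"
)
result = normalize_mime_headers(original, "x\n")
```

Non-MIME headers such as Subject are left alone.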
Apologies for any errors - it's been nearly a decade since anyone paid
me to do mail.
-- Roger