Character set wars (was: In defense of Google)

Ben Scott dragonhawk at gmail.com
Mon Jan 30 22:43:00 EST 2006


On 1/30/06, Michael ODonnell <michael.odonnell at comcast.net> wrote:
> ... it could also be argued that it's weird and unconventional
> for your mailer to be specifying your messages as requiring
> a UTF-8 font.

  UTF-8 isn't a font, it's a character encoding (a "charset", in MIME
terms) -- just like ASCII is.  Indeed, for the characters that ASCII
provides, UTF-8 is indistinguishable from ASCII.  (Literally -- it's
byte-for-byte identical until you go above codepoint 127.)  If
specifying UTF-8 is really what is causing your mailer grief, then I
would have to say it really is your mailer (or maybe your system's
overall support for Unicode) that's hosed.

  If you want to argue "just use ASCII", I'd again point out: UTF-8 is
binary identical to ASCII below codepoint 128.  If your mailer really
were "just using ASCII", you would not be having any trouble.  Your
system must be trying to do more than "just ASCII", and screwing up
when it does so.
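
  The byte-identity claim is easy to check for yourself.  Here is a
minimal sketch in Python 3; the function name and the sample strings
are made up for illustration, and only the standard str.encode() call
is used.  It compares the bytes produced by the two encodings, and
treats anything ASCII can't represent as "not identical".

```python
def same_bytes_in_ascii_and_utf8(text: str) -> bool:
    """Return True when text encodes to identical bytes in ASCII and UTF-8.

    Hypothetical helper, written only to illustrate the point above.
    """
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        # The text has a code point above 127, which ASCII cannot encode.
        return False


plain = "just use ASCII"      # every code point is below 128
accented = "na\u00efve"       # contains U+00EF, which is above 127

print(same_bytes_in_ascii_and_utf8(plain))
print(same_bytes_in_ascii_and_utf8(accented))

# Above code point 127, UTF-8 needs more than one byte per character.
print(len(accented), len(accented.encode("utf-8")))
```

  The first string comes out the same either way; the second can't be
encoded as ASCII at all, and takes one extra byte in UTF-8.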

  As far as "unconventional" goes... well, believe me, I pine (no pun
intended) for the nice, safe old world of 7-bit ASCII as much as the
next guy, but there are over six billion people on this planet, and
most of them don't use the Latin character set.  Unicode is coming --
nay, it is already here.  It's going to make things really interesting
in the computer world for the next several years.  Y2K ain't got
nothing on this.

-- Ben



More information about the gnhlug-discuss mailing list