glibc - UTF-8 - ISO-8859 question

Bill Freeman f at ke1g.mv.com
Mon Mar 17 18:29:39 EST 2003


	I'm looking at a bug in a piece of open source software
(gpsdrive).  It seems to arise from the fact that the gtk text input
widget uses ISO-8859, or some other strictly 8-bit character set in
which the degree symbol is 0xB0, while sscanf() seems to be looking
for UTF-8, and stops converting when it hits the degree symbol in the
format string.  Emacs happily displays this character as the degree
symbol, and gcc happily compiles it into the format string of the
sscanf() and elsewhere.  This is with glibc-2.1.3-26 and, judging from
the source, is also true of 2.1.3-28, the latest that I found for a
RedHat 6.2 system.
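
	One way to sidestep the problem is to keep the non-ASCII byte out
of the format string altogether: parse the number with strtod() and
compare the following byte directly.  The sketch below assumes the
input is a number immediately followed by the 0xB0 degree byte;
parse_degrees() is a hypothetical helper for illustration, not code
from gpsdrive, and the real input format there may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper, not part of gpsdrive: parse "<number><0xB0>" from
   8-bit (e.g. ISO-8859-1) input without placing the non-ASCII byte in an
   sscanf() format string.  Returns 1 on success, 0 otherwise. */
static int parse_degrees(const char *s, double *out)
{
    char *end;
    double value;

    assert(s != NULL && out != NULL);

    value = strtod(s, &end);      /* numeric part; ASCII only */
    if (end == s)
        return 0;                 /* no number found */

    /* Compare the degree sign as a raw byte, so the result does not
       depend on how the locale decodes multibyte characters. */
    if ((unsigned char)*end != 0xB0)
        return 0;

    *out = value;
    return 1;
}
```

Calling parse_degrees("42.5\xB0", &d) stores 42.5 in d and returns 1,
while a string without the trailing 0xB0 byte is rejected.  Note that
strtod() still takes its decimal point from the current locale.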

	Ought there to be a version of sscanf() that will accept 8-bit
character sets?  Looking at the implementation, fiddling with locales
is not going to fix it on that end (because isascii() just checks for
the high bit, rather than being syntax table driven).  Should I
escalate this as a bug or feature request to the glibc folks, or do
those of you who have been doing a lot of internationalization have a
suggestion as to the correct way to approach this problem?
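
	For illustration, the high-bit test described above amounts to
something like the following.  This is a sketch of the behavior as
described, not the glibc source, and is_seven_bit() is a made-up name.

```c
/* Sketch of a pure bit test in the style attributed to isascii():
   true only for values 0..127, with no reference to the locale's
   character tables. */
static int is_seven_bit(int c)
{
    return (c & ~0x7F) == 0;
}
```

Under this test the byte 0xB0 is rejected in every locale, which is
why changing the locale alone would not be expected to help.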

							Bill



More information about the gnhlug-discuss mailing list