html special characters and web forms
Python
python at venix.com
Sun Oct 9 21:44:00 EDT 2005
On Sun, 2005-10-09 at 19:42 -0400, Donald Leslie {74279} wrote:
> I have a web application which allows text to be input into a form.
>
> A user went to a web page where the text contained a quote, which was
> represented in the page source as &#8217;.
>>> uc = u'\u8127'
>>> len(uc)
1
>>> bytes = uc.encode('utf8')
>>> len(bytes)
3
>>> map(ord,bytes)
[232, 132, 167] # those are base-10 numbers
Well, the UTF-8 encoding uses three bytes for that character. The
encoding should be specified at the top of the HTML file, though the web
server could be specifying a different encoding in its headers. Browsers
normally believe the web server's header when the two disagree.
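
A rough sketch of that precedence, written for current Python 3 rather
than the Python 2 used elsewhere in this message. The function names and
the windows-1252 fallback are my own assumptions for illustration, not
anything taken from Don's application:

```python
from email.message import Message


def charset_from_content_type(value, default=None):
    """Return the charset parameter of a Content-Type value, lowercased."""
    if not value:
        return default
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns `default` when no charset is present.
    return msg.get_content_charset(default)


def effective_charset(http_header, meta_content, fallback="windows-1252"):
    """Prefer the HTTP header, then the page's own declaration.

    `fallback` is an assumed default for pages that declare nothing.
    """
    return (charset_from_content_type(http_header)
            or charset_from_content_type(meta_content)
            or fallback)


# The server header wins when the two disagree.
print(effective_charset("text/html; charset=ISO-8859-1",
                        "text/html; charset=utf-8"))
```

With only the page-level declaration present, the same call falls
through to the second argument.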
Are you saying the database field is limited to ASCII (0 <= ord < 128)?
Or is it grumbling about the encoding error?
>>> x = '\222'
>>> ux = x.decode('utf8')
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 0:
unexpected code byte
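
One possibility, and it is only a guess about Don's setup, is that the
form data arrived in Windows-1252 rather than UTF-8. In that code page
the byte 0x92 (octal \222) is the right single quotation mark, U+2019.
A minimal Python 3 sketch under that assumption, with a hypothetical
helper name, that turns such bytes into plain ASCII by using numeric
character references:

```python
def form_bytes_to_ascii(raw, assumed_encoding="cp1252"):
    """Decode form bytes and return ASCII-only text.

    `assumed_encoding` is a guess; check what the form actually sends.
    Non-ASCII characters become references such as &#8217;.
    """
    text = raw.decode(assumed_encoding)
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


raw = b"don\x92t"  # b"\x92" is the same byte as the octal escape \222

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # A lone 0x92 is not valid UTF-8, which matches the error above.
    print("not UTF-8:", exc.reason)

print(form_bytes_to_ascii(raw))
```

Storing the reference form keeps the field pure ASCII, which may be
enough to stop the XML query from treating the record as binary.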
>
>
> When the form data is read the quote becomes \222. Where do I find
> how this is being encoded so I can put back the quote, or other special
> characters? I found one reference which said that this was due to UTF-8
> encoding.
>
> This is a problem for me since the text containing the quote is stored
> in a database. An XML database query breaks when it thinks the record
> contains binary data.
>
> Don Leslie
--
Lloyd Kvam
Venix Corp