Subject:How do I get unicode support in python?
From:Heiko Wundram (Beenic) (wund@beenic.net)
Date:Feb 8, 2008 7:05:08 am
List:org.freebsd.freebsd-questions

On Friday, 8 February 2008 15:26:48, Eric Mesa wrote:

> I'm running a web server with FreeBSD 6.1-RELEASE and python 2.4.3. I'm unable to print any characters outside of ascii. I have tried this code on my Linux computer, which has python 2.5.x and it works - so the code is solid.
>
> What do I need to do to get python on the web server to have unicode support? Is there a module/package I need to import in the 2.4 series? Or is there some package/port I need to install? Or do I just recompile python with some different flags? (And does that entail any uninstalling first?)

For Python to be able to "print" unicode characters to the console, it must know the encoding of the console. Generally, this entails setting LC_ALL and LANG, and of course your terminal (emulator), appropriately, and then testing whether the interpreter picks up the correct encoding on startup (it is exposed as sys.stdout.encoding; when that is not set, for instance because output is not going to a terminal, the interpreter falls back to sys.getdefaultencoding()). When the encoding that the interpreter uses to "print" _unicode_-strings cannot represent the unicode characters you hand it, the codec barfs:

[modelnine@phoenix ~]$ python
Python 2.5.1 (r251:54863, Nov 6 2007, 19:02:51)
[GCC 4.2.1 20070719 [FreeBSD]] on freebsd7
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> print u"\xfa"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xfa' in position 0: ordinal not in range(128)
>>> print u"\xfa".encode("latin-1")
ú
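If you want to check up front which encodings are in play, something along these lines may help. It is only a sketch: describe_output_encoding and can_encode are names made up for this example, and the code is written so that it should run on both Python 2 and Python 3.

```python
import locale
import sys


def describe_output_encoding(stream=None):
    """Collect the encodings that influence how unicode text gets printed."""
    if stream is None:
        stream = sys.stdout
    return {
        # Encoding attached to the stream; may be None when output is piped
        # or produced under a web server.
        "stream": getattr(stream, "encoding", None),
        # Interpreter-wide fallback ('ascii' on an unmodified Python 2).
        "default": sys.getdefaultencoding(),
        # What the locale settings (LC_ALL, LANG, ...) ask for.
        "locale": locale.getpreferredencoding(),
    }


def can_encode(text, encoding):
    """Return True if `text` can be encoded in `encoding` without error."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for name, value in sorted(describe_output_encoding().items()):
        print("%-8s %s" % (name, value))
    print(can_encode(u"\xfa", "ascii"))    # False: the failure from the session
    print(can_encode(u"\xfa", "latin-1"))  # True
```

If "stream" comes back as None or as an ASCII variant, printing non-ASCII unicode strings directly can be expected to fail the same way as in the session above.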

Basically, the easiest resolution is to do the conversion yourself (like I did in the second print above): encode the unicode string explicitly before you print it. The other possibility is to change the default encoding to something that matches your console (probably latin-1), which you can do in /usr/local/lib/python2x/site.py.
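Doing the conversion yourself could be wrapped up roughly like this. Again just a sketch under assumptions: encode_for_output and write_text are hypothetical helper names, latin-1 is only the guess from above for what your console expects, and errors="replace" is my choice so that unencodable characters come out as "?" rather than raising.

```python
import sys


def encode_for_output(text, encoding="latin-1", errors="replace"):
    """Encode unicode `text` to bytes for output in `encoding`.

    With errors="replace", characters the target encoding cannot represent
    become '?' instead of raising UnicodeEncodeError.
    """
    return text.encode(encoding, errors)


def write_text(text, stream=None, encoding="latin-1"):
    """Write `text` plus a newline to `stream` as explicitly encoded bytes."""
    if stream is None:
        stream = sys.stdout
    # Python 3 text streams expose their byte stream as .buffer; Python 2
    # file objects accept byte strings directly.
    raw = getattr(stream, "buffer", stream)
    raw.write(encode_for_output(text + u"\n", encoding))
    raw.flush()
```

Calling write_text(u"\xfa") then sends the single byte 0xFA followed by a newline, no matter what the interpreter's default encoding is. For a web server you would pass whatever encoding your pages declare instead of latin-1.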

HTH!