Python and UTF-8 text encoding on OSX

Jul 23, '10 07:30:00AM

Contributed by: fcilender

I'm using a MacBook Air with Snow Leopard (10.6.2) as my development platform, along with Python 2.6.4 and Unicode strings.

I received the following error when trying to run a script: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9'. This indicated that the output encoding was wrong, so the script couldn't print a word like appliqué correctly.
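The failure can be reproduced without involving the terminal by encoding the string directly. This is only a minimal sketch: the helper name try_encode is made up for illustration, and the u'' literal is used so the snippet should run on Python 2.6 as well as Python 3.3 or later.

```python
# -*- coding: utf-8 -*-
# The word from the hint, with the accented letter written as an escape.
word = u"appliqu\xe9"


def try_encode(text, encoding):
    """Return the encoded bytes, or the error message if encoding fails."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError as exc:
        return str(exc)


# ASCII has no slot for the accented e, so this yields the error message.
ascii_result = try_encode(word, "ascii")
# UTF-8 can represent it, so this yields bytes.
utf8_result = try_encode(word, "utf-8")

print(ascii_result)
print(repr(utf8_result))
```

When Python 2 cannot work out an encoding for standard output, printing a Unicode string falls back to the same ASCII encode step, which is where the error in the hint comes from.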

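I tried adding # -*- coding: utf-8 -*- at the head of my Python script, but I still got the same error. That declaration only tells the interpreter how to decode the source file itself; it does not change the encoding used for output.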

To fix this, I found that the text encoding used for standard input, standard output, and standard error can be specified by setting the PYTHONIOENCODING environment variable before running the interpreter.
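One way to check that the variable takes effect is to start a child interpreter with it set and ask that interpreter which encoding it chose for standard output. The sketch below assumes Python 2.7 or later, since subprocess.check_output is not available in 2.6; the function name stdout_encoding_with is hypothetical.

```python
import os
import subprocess
import sys


def stdout_encoding_with(io_encoding):
    """Run a child interpreter with PYTHONIOENCODING set and
    return the value it reports for sys.stdout.encoding."""
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    return out.decode("ascii").strip()


reported = stdout_encoding_with("utf-8")
print(reported)
```

The child's output is captured through a pipe rather than a terminal, which is a case where the interpreter would otherwise have to guess the encoding.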

The value should be a string in the form <encoding> or <encoding>:<errorhandler>. The encoding part specifies the encoding's name, e.g. utf-8 or latin-1; the optional errorhandler part specifies what to do with characters that can't be handled by the encoding, and should be one of 'strict', 'ignore', or 'replace'.
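The Python documentation describes the errorhandler part as having the same meaning as the errors argument of str.encode(), so the effect of each handler can be sketched with plain encode calls. This is an illustration of the handlers, not of the interpreter's I/O layer; the variable names are my own.

```python
# The word from the hint, with the accented letter written as an escape.
word = u"appliqu\xe9"

results = {}
for handler in ("strict", "ignore", "replace"):
    try:
        # Encode to ASCII, which cannot represent the accented e.
        results[handler] = word.encode("ascii", handler)
    except UnicodeEncodeError:
        # 'strict' refuses to encode and raises instead.
        results[handler] = None

for handler in ("strict", "ignore", "replace"):
    print(handler, repr(results[handler]))
```

With 'strict' the encode fails, with 'ignore' the accented letter is dropped, and with 'replace' it is swapped for a question mark. A value such as ascii:replace would therefore avoid the exception at the cost of losing the character, whereas utf-8 keeps it.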

So typing export PYTHONIOENCODING=utf-8 prior to invoking the Python interpreter does the trick, or you could just add this setting to your environment file: ~/.MacOSX/environment.plist.

[crarko adds: I haven't tested this one.]

Mac OS X Hints
http://hints.macworld.com/article.php?story=20100713130450549