Ian Bicking: the old part of his blog

Re: The Illusive setdefaultencoding

"Are people claiming that there should be no default encoding?"

No. We're claiming that there should be one fixed default encoding that's used when mixing 8-bit and Unicode strings. And that's how things are, really.

When the Unicode type was added, people disagreed on what the encoding should be (ASCII, ISO-8859-1, or UTF-8), so the setdefaultencoding hook was added so we could play with it. Unfortunately, nobody got around to removing it before the release.

(To me, arguing that it's a good thing that you can use a global setting to control what a+b does when a is an 8-bit string and b is a Unicode string is about as silly as arguing that it would be a good thing to have a global setting for controlling what a+b does if a is an integer and b is a string. If you want to convert between different logical types (encoded data and text are different things), use an explicit conversion.)

Comment on The Illusive setdefaultencoding
by Fredrik
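
The explicit conversion recommended above can be sketched roughly as follows. This is only an illustration, not code from the original comment: the sample values and variable names are made up, and it assumes the incoming bytes are known to be UTF-8.

```python
# Encoded data (bytes) and text (unicode) are kept as separate logical types.
raw = b"caf\xc3\xa9"   # assumed to be UTF-8 bytes, e.g. read from a file
suffix = u" au lait"   # text

# Decode at the boundary with a named encoding, then work with text only.
text = raw.decode("utf-8") + suffix

# Encode explicitly again when bytes are needed, choosing the target encoding.
as_latin1 = text.encode("latin-1")
```

Nothing here depends on a process-wide default, so the result of the concatenation is the same no matter how the interpreter is configured.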


Elusive indeed. I just spent the better part of a day trying to figure out why using zipfile.writestr(string) on UTF-8 encoded strings was giving me a UnicodeDecodeError (I'm still relatively new to Python). It was actually binascii.crc32(bytes) that was complaining. Since I don't have root access, I can't edit lib/site-packages/sitecustomize.py. I tried putting sys.setdefaultencoding('utf-8') in a file in my working directory. At first, it wouldn't let me access sys.setdefaultencoding, but then I added '.' to my PYTHONPATH and that finally did it. But what happens when I'm zipping up Latin-1 encoded files? I would like to be able to set the default encoding from within my program. I wonder what the danger would be in allowing that? Right now, the only ways to do that are the three methods mentioned above. None of these sounds satisfactory to me.

# Justin
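
For the zipfile case described above, one possible way to avoid relying on the default encoding is to encode each piece of text explicitly before handing it to writestr. The sketch below is an assumption about how that might look, not a diagnosis of the original error: write_text is a hypothetical helper, and the archive member names and sample text are invented for illustration.

```python
import binascii
import io
import zipfile


def write_text(archive, arcname, text, encoding):
    """Hypothetical helper: encode text explicitly, then store the bytes."""
    data = text.encode(encoding)
    archive.writestr(arcname, data)
    # crc32 only ever sees bytes here; mask to get an unsigned value.
    return binascii.crc32(data) & 0xFFFFFFFF


buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    # The same text stored under two encodings, each chosen per call.
    crc_utf8 = write_text(archive, "utf8.txt", u"caf\xe9", "utf-8")
    crc_latin1 = write_text(archive, "latin1.txt", u"caf\xe9", "latin-1")

with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    utf8_bytes = archive.read("utf8.txt")
    latin1_bytes = archive.read("latin1.txt")
    stored_crc = archive.getinfo("utf8.txt").CRC
```

Because the encoding is an argument to each call, UTF-8 and Latin-1 files can go into the same archive without any global setting being changed.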