Subject: [ python-Bugs-835353 ] logging.StreamHandler encodes log message in UTF-8
From: SourceForge.net (nore@sourceforge.net)
Date: Mar 2, 2004 8:32:17 am
List: org.python.python-bugs-list

Bugs item #835353, was opened at 2003-11-03 22:45
Message generated for change (Comment added) made by vsajip
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=835353&group_id=5470

Category: Python Library
Group: Python 2.3
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Vaclav Dvorak (vdvo)
Assigned to: Vinay Sajip (vsajip)
Summary: logging.StreamHandler encodes log message in UTF-8

Initial Comment:
For some reason that I do not see, logging.StreamHandler in Python 2.3 insists on writing plain non-Unicode strings to the stream, and the encoding is hard-coded as UTF-8:

    if not hasattr(types, "UnicodeType"): #if no unicode support...
        self.stream.write("%s\n" % msg)
    else:
        try:
            self.stream.write("%s\n" % msg)
        except UnicodeError:
            self.stream.write("%s\n" % msg.encode("UTF-8"))

This behaviour is neither documented nor reasonable. A file object may be perfectly able to write Unicode strings (e.g., through the use of codecs.EncodedFile or with a default encoding of sys.stdout), and even if it is not, UTF-8 is hardly the only choice for an encoding. I propose to simply replace the above code with:

    self.stream.write(msg)
    self.stream.write("\n")

----------------------------------------------------------------------

Comment By: Vinay Sajip (vsajip)
Date: 2004-03-02 13:32

Message:
Logged In: YES
user_id=308438

If you want to use some other encoding, why not use a stream created using codecs.open(), and if necessary use a Formatter which is Unicode-aware to convert from msg + args to the formatted message? Then the exception handler should never be invoked.

Or do you mean the encoding used in the exception handler? I think UTF-8 is OK as the default, since it is the most commonly used. I may consider making this configurable for a future release, if there is enough demand; for now you can patch it yourself.

I'll close this bug report now; I assume that's OK with you?
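
The suggestion above, handing StreamHandler a stream that does its own encoding, can be sketched roughly as follows. This is a modern Python 3 illustration, not the Python 2.3 code under discussion. It assumes an in-memory buffer in place of a real file, and it uses codecs.getwriter() as a stand-in for codecs.open(), since both yield a stream whose write() encodes text. The logger name and format string are made up for the demo.

```python
import codecs
import io
import logging

# In-memory byte buffer standing in for a real log file (assumption for the demo).
raw = io.BytesIO()

# codecs.getwriter() returns a StreamWriter class; wrapping the buffer yields a
# stream whose write() accepts text and encodes it as ISO-8859-2.
stream = codecs.getwriter("iso-8859-2")(raw)

logger = logging.getLogger("codecs_stream_demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output away from the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(handler)

# "Václav Dvořák", written with escapes so the source file stays ASCII.
name = "V\u00e1clav Dvo\u0159\u00e1k"
logger.info(name)

# The bytes that reached the buffer are ISO-8859-2, not UTF-8.
encoded = raw.getvalue()
```

Because the stream encodes every message itself, the handler never sees a UnicodeError for characters the chosen codec can represent.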

----------------------------------------------------------------------

Comment By: Vaclav Dvorak (vdvo)
Date: 2004-03-02 09:22

Message:
Logged In: YES
user_id=545628

Hmmm... I can't remember what the exact problem was, but now that I look at it again, I see that it must have been my error. What a poor bug report this is. :-( Sorry.

Still, I'd like the encoding to be configurable: UTF-8 can stay as the default, but it would be nice to have an option to use, say, "iso-8859-2" or "windows-1250".
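
One way the requested option could look is sketched below in modern Python 3. EncodingStreamHandler and its encoding parameter are hypothetical names invented for this illustration; they are not part of the logging module discussed in this report. The sketch assumes the target stream accepts bytes.

```python
import io
import logging


class EncodingStreamHandler(logging.Handler):
    """Hypothetical handler that encodes each record with a caller-chosen codec."""

    def __init__(self, stream, encoding="utf-8"):
        super().__init__()
        self.stream = stream      # assumed to be a binary (bytes) stream
        self.encoding = encoding  # UTF-8 stays the default, as requested

    def emit(self, record):
        try:
            line = self.format(record) + "\n"
            self.stream.write(line.encode(self.encoding))
        except Exception:
            # Delegate to logging's standard error reporting.
            self.handleError(record)


buf = io.BytesIO()
handler = EncodingStreamHandler(buf, encoding="windows-1250")

logger = logging.getLogger("configurable_encoding_demo")  # hypothetical name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

# A Czech word with characters outside ASCII, written with escapes.
word = "\u017elu\u0165ou\u010dk\u00fd"
logger.info(word)

written = buf.getvalue()
```

Passing "iso-8859-2" instead of "windows-1250" would work the same way, provided every logged character exists in the chosen code page.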

----------------------------------------------------------------------

Comment By: Vinay Sajip (vsajip)
Date: 2004-03-01 12:10

Message:
Logged In: YES
user_id=308438

Notice that UTF-8 is only used if a UnicodeError is detected. By default, "%s\n" % msg is written to the stream using the stream's write(). If the stream can handle this without raising a UnicodeError, then UTF-8 will not be used. Is there a specific use case/test script which demonstrates a problem?
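
The try-first, encode-on-failure order described above can be imitated in modern Python 3 as a rough sketch. It is an analogue, not the original handler code: a Python 3 text stream does not accept byte strings, so the fallback here writes to the stream's underlying binary buffer instead. The function name is hypothetical.

```python
import io


def write_with_utf8_fallback(stream, msg):
    """Write msg as text; use UTF-8 bytes only if the stream raises UnicodeError."""
    try:
        stream.write("%s\n" % msg)
    except UnicodeError:
        # Flush text already accepted so the output order is preserved,
        # then bypass the text layer and write UTF-8 bytes directly.
        stream.flush()
        stream.buffer.write(("%s\n" % msg).encode("utf-8"))


raw = io.BytesIO()
# An ASCII-only text stream; newline="\n" disables platform newline translation.
stream = io.TextIOWrapper(raw, encoding="ascii", newline="\n")

write_with_utf8_fallback(stream, "plain")             # stream copes: no UTF-8 involved
write_with_utf8_fallback(stream, "\u017elut\u00fd")   # stream fails: UTF-8 fallback
stream.flush()

result = raw.getvalue()
```

The first message goes through the stream's own encoding untouched; only the second, which the ASCII stream rejects, ends up UTF-8 encoded.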

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-11-05 20:30

Message:
Logged In: YES
user_id=21627

That would be an incompatible change, of course, as you then may get encoding errors where you currently get none.
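
The incompatibility can be illustrated with a small modern Python 3 sketch, assuming an ASCII-only text stream as the target. The helper mirrors the two write() calls proposed in the initial comment; its name is made up for the demo.

```python
import io


def write_directly(stream, msg):
    # The replacement proposed in the initial comment: no UnicodeError handling.
    stream.write(msg)
    stream.write("\n")


# A text stream that can only encode ASCII.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii", newline="\n")

try:
    write_directly(ascii_stream, "\u017elut\u00fd")  # contains non-ASCII characters
    raised = False
except UnicodeEncodeError:
    # Without the fallback, the encoding error reaches the caller.
    raised = True
```

With the existing fallback in place the same message would be written as UTF-8 instead of raising.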

----------------------------------------------------------------------