Subject: Re: UA support for Content-Disposition header (filename parameter)
From: Julian Reschke (juli@gmx.de)
Date: Mar 18, 2008 9:58:01 am
List: org.w3.public-html

Brian Smith wrote:

Using Content-Disposition in HTTP is an ad-hoc solution; it isn't standardized
anywhere. The IE encoding (percent-encoded UTF-8) is not locale-sensitive; in
fact, RFC 2231-based encoding is more sensitive to locale because it allows
arbitrary (non-Unicode) encodings.

But RFC2231 is part of Content-Disposition; see RFC2183, which requires RFC2184, which was later obsoleted by RFC2231.

Furthermore, the IE encoding *is* locale-sensitive; if you send percent-encoded UTF-8 to a client that isn't configured for UTF-8 encoded URIs, it doesn't work. At least it didn't when I had to deal with unhappy customers in Asia and opened a support case.

Finally, using percent-escaped UTF-8 breaks all other clients that do not expect any kind of escaping in this place.

Consider a filename that is 8 letters long, in Thai or any African or Asian
language. The 2231-based encoding is something like this:

Content-Disposition: attachment; filename*0==?UTF-8?Q?=1a=1b=1c=2a=2b=2c=3a=3b=3c=4a=4b=4c=5a=5b=5c=6a=6b=6c=7a=7b=7c=?= filename*1==?UTF-8?Q?8a=8b=8c?=

No, it would be

Content-Disposition: attachment; filename*=utf-8''%1A%1B%1C%2A%2B%2C%3A%3B%3C%4A%4B%4C%5A%5B%5C%6A%6B%6C%7A%7B%7C%8A%8B%8C

Notice that the RFC 2231 encoding *requires* the header to be split into
multiple lines (which many implementations do not handle well). Also notice that
it requires two parameters "filename*1" and "filename*2" to be combined together
to get the actual "filename" parameter.

There is no requirement to fold long lines in HTTP headers; after all, it's not MIME.

The right thing to do here would be to mandate just the encoding part of RFC2231, not the line-splitting functionality.
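
To make the "encoding part only" concrete, here is a minimal Python sketch of how a server might build such a header value. The function name content_disposition is hypothetical, and the sketch assumes the charset is always UTF-8 and the language field is left empty; it is an illustration, not a reference implementation.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment header value using only the RFC2231 encoding
    syntax (charset'language'value), with no continuations or folding."""
    # Percent-encode the UTF-8 octets. safe="" means reserved characters
    # such as "/" are escaped as well; letters, digits and "_.-~" stay as-is.
    encoded = quote(filename, safe="", encoding="utf-8")
    # Charset fixed to utf-8, language left empty (two apostrophes in a row).
    return "attachment; filename*=utf-8''" + encoded


print("Content-Disposition: " + content_disposition("€ rates.txt"))
```

The whole value stays on a single line regardless of its length, which is the point of dropping the line-splitting part.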

The Internet Explorer encoding is this:

Content-Disposition: attachment;
filename="%1A%1B%1C%2A%2B%2C%3A%3B%3C%4A%4B%4C%5A%5B%5C%6A%6B%6C%7A%7B%7C%8A%8B%8C"

The header is more compact, the header can be kept on one line, there is no
header-combining magic going on, and there is no need to deal with any encodings
other than UTF-8.

- there is no need to wrap the filename under RFC2231 either, we're not using MIME

- furthermore, yes, a single encoding is good, so I would recommend specifying exactly that

- the example you give does not work in any browser except IE, and only if it is configured for UTF-8 encoded URIs (which was not the default setting around the world a few years ago).

Also, consider this:

Content-Disposition: attachment; filename*1==?UTF-8?Q?8a=8b=8c?= filename*0==?UTF-8?Q?=1a=1b=1c=2a=2b=2c=3a=3b=3c=4a=4b=4c=5a=5b=5c=6a=6b=6c=7a=7b=7c=?=

That is RFC2047-style encoding mixed with RFC2231 line folding -- I didn't recommend that. It may even be illegal.

This is valid according to RFC 2231 but Firefox and Thunderbird do *NOT* parse
it correctly; they assume the parts of the filename are listed in order. So,
there are no fully conforming HTTP+Content-Disposition+RFC2231 implementations.

That is probably true; thus it would make sense to specify the profile that UAs are expected to implement, which is exactly why I came here with this issue.

The profile would be:

- no line folding (continuations)

- use the encoding from <http://greenbytes.de/tech/webdav/rfc2231.html> with the encoding being hardwired to "utf-8".
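
On the receiving side, the profile above could be sketched roughly as follows in Python. The name extended_filename and the regular expression are assumptions made for illustration; a real UA would use a proper header parser rather than a regex. The sketch ignores continuation parameters such as filename*0* and rejects any charset other than utf-8.

```python
import re
from urllib.parse import unquote

# Matches filename*=charset'language'value but not filename*0*=...,
# because "=" must follow the "*" directly.
_EXT_FILENAME = re.compile(
    r"(?:^|;)\s*filename\*=([^';\s]*)'([^';\s]*)'([^;\s]*)",
    re.IGNORECASE,
)


def extended_filename(header_value: str):
    """Return the decoded filename* parameter, or None if the header
    does not follow the profile (no continuations, utf-8 only)."""
    match = _EXT_FILENAME.search(header_value)
    if match is None:
        return None
    charset, _language, value = match.groups()
    if charset.lower() != "utf-8":
        return None  # the profile hardwires the charset
    try:
        return unquote(value, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None  # the octets were not valid UTF-8


print(extended_filename("attachment; filename*=utf-8''%E2%82%AC%20rates.txt"))
```

Note that the placeholder octets used in the examples in this thread (%1A%1B%1C and so on) do not form valid UTF-8, so a strict decoder like this one would return None for them.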

Well, Microsoft hasn't implemented RFC2231. What makes you think that they would implement another RFC, when history tells us that they just ignore it?

They already implemented the Internet Explorer mechanism in Internet Explorer.
It doesn't work in all configurations.

See. How is this a solution when it works only for a subset of the IE installations?

(Also, look at how unfair both mechanisms are to users of non-Latin
alphabets. It takes 72 bytes for the Internet Explorer encoding and 113 bytes
for the RFC 2231 encoding, just to encode 8 letters in UTF-8.)

That's only true if you insist on line folding.

Otherwise the overhead is exactly 8 characters compared to what IE allows (not that the users would be really interested in that).
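
The arithmetic can be checked with a short Python sketch. It rebuilds the 24 placeholder octets from the examples above and compares parameter lengths. Comparing against an unquoted filename= parameter is an assumption about how the 8 characters are counted; the variable names are only illustrative.

```python
# The 24 placeholder octets used in the examples: %1A%1B%1C ... %8A%8B%8C
value = "".join(f"%{digit}{letter}" for digit in "12345678" for letter in "ABC")

rfc2231_param = "filename*=utf-8''" + value   # single-line RFC2231 form
ie_param_unquoted = "filename=" + value       # IE form, no quotes
ie_param_quoted = 'filename="' + value + '"'  # IE form as written above

print(len(value))                                   # encoded value alone
print(len(rfc2231_param) - len(ie_param_unquoted))  # the "*" plus "utf-8''"
print(len(rfc2231_param) - len(ie_param_quoted))    # overhead vs the quoted form
```

Under these assumptions the encoded value is 72 characters in both schemes, and the only difference is the fixed "*" and "utf-8''" prefix, independent of the filename length.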

BR, Julian