

12 messages in org.w3.public-evangelist: Re: japanese encoding nightmare

| From | Sent On | Attachments |
|---|---|---|
| Paul Arenson | Nov 12, 2006 5:50 pm | |
| Karl Dubost | Nov 13, 2006 5:21 am | |
| Paul Arenson | Nov 13, 2006 6:25 am | |
| Paul Arenson | Nov 13, 2006 6:58 am | |
| Daniel Barclay | Nov 13, 2006 10:14 am | |
| Mike Schinkel | Nov 13, 2006 3:10 pm | |
| David Dorward | Nov 13, 2006 3:17 pm | |
| Paul Arenson | Nov 13, 2006 6:06 pm | |
| Daniel Barclay | Nov 16, 2006 11:13 am | |
| Richard Ishida | Nov 23, 2006 2:15 am | |
| Tex Texin | Nov 23, 2006 9:03 am | |
| Paul Arenson | Dec 5, 2006 8:07 am | |

| Subject: | Re: japanese encoding nightmare | |
|---|---|---|
| From: | Daniel Barclay (dan...@fgm.com) | |
| Date: | Nov 16, 2006 11:13:15 am | |
| List: | org.w3.public-evangelist | |

Mike Schinkel wrote:

> Daniel Barclay wrote:
>
> > Remember that <META HTTP-EQUIV="..." ...> elements are not supposed

I should narrow that to "some ... elements".

> > to be read by the browser when the browser retrieved the document from a server. Such META elements are for the server to read and use to construct real HTTP header fields (if the server chooses that mechanism).
>
> I recently read (from what I remember to be an authoritative source) that in practice servers rarely ever read them because of performance so the browser has to.

In some cases, the browser is not even allowed to use them.
If the server indicates the content type and character encoding ("charset") in the HTTP response, the browser must use _that_ type and charset and must _not_ use values from a <META HTTP-EQUIV="Content-Type" ...> element or anything else in the returned entity (document) to determine the type and charset. That is, the server's HTTP headers override any specifications inside the entity.
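The precedence rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`charset_from_header`, `charset_from_meta`, `effective_charset`) are hypothetical, the META scan is a simple regular expression rather than a real HTML parser, and it only recognizes elements where `http-equiv` appears before `content`.

```python
import re
from email.message import Message
from typing import Optional

# Rough pattern for <META HTTP-EQUIV="Content-Type" CONTENT="...; charset=X">.
# Assumption: http-equiv comes before content; a real parser would not care.
META_RE = re.compile(
    rb"""<meta[^>]+http-equiv\s*=\s*["']?content-type["']?[^>]*?charset\s*=\s*["']?([\w.:-]+)""",
    re.IGNORECASE,
)


def charset_from_header(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None when absent


def charset_from_meta(body: bytes) -> Optional[str]:
    """Return the charset named in a META HTTP-EQUIV element, if one is found."""
    match = META_RE.search(body[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type: Optional[str], body: bytes) -> Optional[str]:
    """Header charset wins; META is consulted only when the header names none."""
    return charset_from_header(content_type) or charset_from_meta(body)


body = b'<html><head><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=EUC-JP"></head></html>'
print(effective_charset("text/html; charset=UTF-8", body))  # header overrides META
print(effective_charset("text/html", body))                 # no header charset: fall back to META
```

Returning `None` when neither source names a charset is a deliberate choice in this sketch: what to assume in that case is a separate question that the rule above does not settle.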
A server is supposed to be able to change the encoding of a document as long as it reports the encoding correctly in the Content-Type header. It is not supposed to have to change any <META HTTP-EQUIV="Content-Type" ...> elements.
(Besides requiring any transcoding server to understand HTML, changing such elements would be changing the _contents_ of the document, not just changing its _encoding_ (changing the sequence of characters, not just changing the bytes that encode the characters).)
If the browser ignored the Content-Type header from the server and read a <META HTTP-EQUIV="Content-Type" ...> element, it might be trying to use the wrong encoding.
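To make the failure concrete, here is a small Python sketch of a hypothetical transcoding server. The page text, the `transcode` helper, and the choice of Shift_JIS and UTF-8 are assumptions picked for illustration; the point is only that the bytes change while the characters (including the now-stale META element) do not.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes without touching the characters they represent."""
    return data.decode(source).encode(target)


TEXT = "日本語のページ"
PAGE = (
    '<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS">'
    "<p>" + TEXT + "</p>"
)

# The document as stored on disk, in the encoding its META element names.
stored = PAGE.encode("shift_jis")

# The server transcodes to UTF-8 and would report charset=utf-8 in the
# Content-Type header; the META element inside the document is unchanged.
served = transcode(stored, "shift_jis", "utf-8")

# A browser trusting the header recovers the original characters.
by_header = served.decode("utf-8")

# A browser trusting the stale META element decodes with the wrong encoding.
by_meta = served.decode("shift_jis", errors="replace")

print(by_header == PAGE)  # True
print(by_meta == PAGE)    # False
```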
I thought that any browser that behaved differently (say, IE 6, which sometimes ignores "text/plain" from the server) violated some specification.
However, looking at the HTML 4.01 specification, I only see wording about servers' being allowed to read such elements:

- "HTTP servers use this attribute to gather information for HTTP response message headers"
- "HTTP servers may use the property name specified by the http-equiv attribute to create an [RFC822]-style header in the HTTP response."

Evidently my source was something else. I don't remember which document it was, so I don't know whether it was as authoritative as a specification. (I do think it was something from the W3C.)
Note that XML has a similar rule regarding the character encoding specified inside an XML document in the XML declaration ("<?xml encoding='...'?>"). If the character encoding is specified to the XML processor at a higher level (e.g., via an HTTP Content-Type header), then the processor must ignore the character encoding specification in the XML declaration.
(Again, I can't find that in the XML specification itself, so I can't currently vouch for the authoritativeness of my source.)
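The XML rule as described here can be sketched the same way. This is a minimal Python illustration of the rule as stated in this message, not a quotation of any specification: the function name `xml_encoding` is hypothetical, byte-order-mark detection is omitted, and the UTF-8 fallback is the assumption made when neither source names an encoding.

```python
import re
from typing import Optional

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the document, e.g. <?xml version="1.0" encoding="Shift_JIS"?>.
XML_DECL_RE = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def xml_encoding(external: Optional[str], document: bytes) -> str:
    """Pick the encoding: higher-level information first, then the declaration."""
    if external:
        # Encoding supplied by a higher level (e.g. an HTTP Content-Type
        # charset parameter) overrides whatever the document says.
        return external.lower()
    match = XML_DECL_RE.match(document)
    if match:
        return match.group(1).decode("ascii").lower()
    # Assumption for this sketch: no BOM handling, fall back to UTF-8.
    return "utf-8"


doc = b'<?xml version="1.0" encoding="Shift_JIS"?><root/>'
print(xml_encoding("UTF-8", doc))  # external information wins
print(xml_encoding(None, doc))     # declaration is used only as a fallback
```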
Of course, that's all about the content type and encoding. Since I don't recall my source, I can't say whether most HTTP-EQUIV elements are like Content-Type (the browser must _not_ use them) or not (the browser can use them).

> This http://www.w3.org/TR/html4/struct/global.html#adef-http-equiv says (emphasis mine): "HTTP servers *MAY* use the property name specified by the http-equiv attribute to create an [RFC822]-style header in the HTTP response." That would imply they might not, and if so the browser would have to handle, no?

Not quite.
A server's not reading HTTP-EQUIV information from inside an HTML document is not, by itself, what might imply that the browser should read it.
If the server read more-authoritative information from elsewhere (e.g., a server configuration file describing the documents to be served out) and reported it in an HTTP header, then the browser should not ignore its more-authoritative source (the server HTTP response header) and instead read a less-authoritative source (the insides of the document).
However, it might be a server's not sending a header at all that implies that the browser can (or maybe should) use HTTP-EQUIV information.
(There may be a case where the server can choose not to return a certain header and where the browser should take that lack of a header as authoritative; I'm not sure.)
Daniel







