12 messages in org.w3.public-evangelist, thread: MORE TESTING

From             Sent On
Paul Arenson     Nov 12, 2006 5:50 pm
Karl Dubost      Nov 13, 2006 5:21 am
Paul Arenson     Nov 13, 2006 6:25 am
Paul Arenson     Nov 13, 2006 6:58 am
Daniel Barclay   Nov 13, 2006 10:14 am
Mike Schinkel    Nov 13, 2006 3:10 pm
David Dorward    Nov 13, 2006 3:17 pm
Paul Arenson     Nov 13, 2006 6:06 pm
Daniel Barclay   Nov 16, 2006 11:13 am
Richard Ishida   Nov 23, 2006 2:15 am
Tex Texin        Nov 23, 2006 9:03 am
Paul Arenson     Dec 5, 2006 8:07 am
Subject: MORE TESTING
From: Paul Arenson (pa@tokyoprogressive.org)
Date: Nov 13, 2006 6:58:07 am
List: org.w3.public-evangelist

I do not think it is the server, because I just tested two more files. One was created earlier and is called testz. The other I created just now in Mozilla using UTF-8 and called testzz. I uploaded both to two different servers and both came out wrong.

Is my Mozilla corrupted?

http://tokyoprogressive.org/testz.html http://tokyoprogressive.org.uk/testz.html

http://tokyoprogressive.org/testzz.html http://tokyoprogressive.org.uk/testzz.html

Going to bed, it is midnight here. Good night, and thanks.

__/__/__/__/__/__/__/__/__/__/ Paul Arenson

EMAIL pa@tokyoprogressive.org

PHONE &VOICE MAIL 1-617-379-0761 (U.S.) 090-4173-3873 (Japan) paularenson (Skype) __/__/__/__/__/__/__/__/__/__/

On Nov 13, 2006, at 11:40 PM, Greg Swaney wrote:

I did a lot of poking and changing character sets on your account on Sunday and it never showed the characters the way they were supposed to be shown. What did W3 say?

Paul Arenson wrote:

Hi Greg,

Further to my Sunday post about files I create in various encodings using Mozilla looking OK on my desktop but not on the server, I wrote to w3.org and they advised me, but it is way over my head.

What I am guessing is that files created by Expression Engine are output in Unicode (UTF-8) and somehow something on the server (database?) tells the server to do something to the encoding. Anyway, when I create a file in UTF-8 on my desktop, it is served differently on the site.

I still use Expression Engine, but also use my own pages. Maybe I should contact the guy who set up Expression Engine for me? I am totally lost, though perhaps it is simple?

Thanks!
paul

See below from the web person --> publ@w3.org

thanks

__/__/__/__/__/__/__/__/__/__/ Paul Arenson

EMAIL pa@tokyoprogressive.org

PHONE &VOICE MAIL 1-617-379-0761 (U.S.) 090-4173-3873 (Japan) paularenson (Skype) __/__/__/__/__/__/__/__/__/__/

Begin forwarded message:

Resent-From: publ@w3.org
From: Karl Dubost <ka@w3.org>
Date: November 13, 2006 10:22:09 PM JST
To: Paul Arenson <pa@tokyoprogressive.org>
Cc: publ@w3.org
Subject: Re: japanese encoding nightmare

On Nov 13, 2006, at 10:50, Paul Arenson wrote:

UNSUCCESSFUL EXAMPLE (Looks ok on desktop but not on server) http://tokyoprogressive.org/why.html

CODE <meta content="text/html; charset=UTF-8" http-equiv="content-type">

But this page is not in UTF-8; it is in Shift_JIS.

Either you have to save your page as UTF-8 or change the encoding information to <META HTTP-EQUIV="Content-Type" CONTENT="text/html;">
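A mismatch like this, where the declared charset differs from the bytes actually saved, can be spotted by trying to decode the file with the declared encoding. The following is only a minimal sketch: the helper name `decodes_as` and the sample text are made up for illustration and are not taken from the pages discussed in this thread.

```python
def decodes_as(data: bytes, declared: str) -> bool:
    """Return True if the bytes are valid in the declared encoding."""
    try:
        data.decode(declared)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample: Japanese text saved as Shift_JIS,
# standing in for a page whose meta tag claims UTF-8.
sample = "日本語のページ".encode("shift_jis")

print(decodes_as(sample, "utf-8"))      # False: the bytes are not valid UTF-8
print(decodes_as(sample, "shift_jis"))  # True
```

A successful decode does not prove the declaration is right (ISO-8859-1, for example, accepts any byte sequence), but a failed decode does show the declaration is wrong.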

SUCCESSFUL EXAMPLE ONE (JAPANESE COMES OUT RIGHT) http://www.tokyoprogressive.org/index/weblog/print/april-entries/

Yes, the page is correctly UTF-8. Not valid, but UTF-8: http://validator.w3.org/check?uri=http%3A%2F%2Fwww.tokyoprogressive.org%2Findex%2Fweblog%2Fprint%2Fapril-entries%2F

This was made via EXPRESSION ENGINE

I note I have both xml:lang and UTF-8.

xml:lang doesn't influence the display of the page. It is there, for example, to trigger the right accent when the text is passed through a voice browser, to help translation engines (not sure they implement it though), or to help a spell checker choose the right dictionary.

I would recommend that you stick to utf-8, it would help to keep consistency in the way you serve the pages.

A cool plug-in that could be developed and added to LogValidator: http://www.w3.org/QA/Tools/LogValidator/

Given a list of URIs, create a table with the columns: uri, server_encoding, meta_encoding, guessed_encoding

Would someone on the list like to do that? http://www.w3.org/QA/Tools/LogValidator/Manual-Modules
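To make the idea concrete, here is a rough Python sketch of how one row of such a table could be computed. It is not a LogValidator module (those follow the tool's own module interface) and every function name here is hypothetical. It assumes the Content-Type header value and the raw body bytes have already been fetched, and the guessing step is a crude heuristic limited to a few Japanese encodings.

```python
import re
from email.message import Message

# Matches charset=... inside a <meta> tag (simplified; not a full HTML parser).
META_RE = re.compile(
    rb"<meta[^>]*charset\s*=\s*[\"']?([A-Za-z0-9_.:-]+)", re.IGNORECASE
)


def server_encoding(content_type):
    """Charset parameter of an HTTP Content-Type header value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def meta_encoding(body):
    """Charset declared in a <meta> tag near the top of the document, or None."""
    match = META_RE.search(body[:4096])
    return match.group(1).decode("ascii").lower() if match else None


def guessed_encoding(body):
    """First candidate encoding that decodes the body without error, or None."""
    candidates = ["utf-8", "shift_jis", "euc-jp"]
    # ISO-2022-JP is 7-bit and switches to kanji with ESC $ sequences,
    # so it has to be tried before UTF-8 when such a sequence is present.
    if b"\x1b$" in body:
        candidates.insert(0, "iso-2022-jp")
    for name in candidates:
        try:
            body.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    return None


def report_row(uri, content_type, body):
    """One table row: (uri, server_encoding, meta_encoding, guessed_encoding)."""
    return (uri, server_encoding(content_type),
            meta_encoding(body), guessed_encoding(body))


# Hypothetical input: a Shift_JIS page whose meta tag claims UTF-8,
# served with an ISO-8859-1 header.
body = (b'<meta http-equiv="content-type" content="text/html; charset=UTF-8">'
        + "日本語".encode("shift_jis"))
print(report_row("http://example.org/a.html", "text/html; charset=ISO-8859-1", body))
```

A row in which the three encoding columns disagree is the kind of page a report like this would flag.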

I THOUGHT I did this in UTF-8, but no. Mozilla even says it is UTF-8, but as you can see the code is Western. In other words, why does it work?

Because browsers try to display wrong pages (invalid, wrong encoding, etc.), people who develop Web pages do not know that they have done something wrong, and they do not fix it. IMHO it is a mistake on the browsers' part. It is cool to try to recover and display the page, but it is wrong to do silent recovery, because then we never enter a cycle which helps everyone to fix things and have a better experience.

SUCCESSFUL EXAMPLE FOUR (most bizarre?) I even forgot to add the meta tag!!! http://tokyoprogressive.org/

By default the server sends information that usually takes priority over the information contained in the file. The encoding declared in a file is a guess, and the browser _should_ follow what the server says.
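The precedence described here can be sketched in a few lines of Python. The function name is hypothetical, and the ISO-8859-1 fallback is an assumption based on the HTTP/1.1 default for text types, not something stated in this thread; real browsers apply more elaborate rules.

```python
def effective_encoding(http_charset, meta_charset, fallback="iso-8859-1"):
    """Pick the encoding a client should use: HTTP header first, then meta."""
    return http_charset or meta_charset or fallback


text = "日本語"
body = text.encode("utf-8")  # file saved as UTF-8, with a UTF-8 meta tag

# The server header says ISO-8859-1, so the meta declaration is ignored.
encoding = effective_encoding("iso-8859-1", "utf-8")
print(encoding)                       # iso-8859-1
print(body.decode(encoding) == text)  # False: the characters come out garbled

# With no charset in the header, the meta declaration is used.
print(effective_encoding(None, "utf-8"))  # utf-8
```

This is why a correctly saved and correctly labelled file can still display wrongly once it is on the server.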

Make a page in several encodings
http://tokyoprogressive.org/a.html
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-2022-JP"
LOOKS OK ONLINE

It doesn't look OK for me.

But your server is configured in a strange way:

GET /a.html HTTP/1.1[CRLF]
Host: tokyoprogressive.org[CRLF]
Connection: close[CRLF]
Accept-Encoding: gzip[CRLF]
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[CRLF]
Accept-Language: fr,en;q=0.9,ja;q=0.9,de;q=0.8,es;q=0.7,it;q=0.7,nl;q=0.6,sv;q=0.5,nb;q=0.5,da;q=0.4,fi;q=0.3,pt;q=0.3,zh-Hans;q=0.2,zh-Hant;q=0.1,ko;q=0.1[CRLF]
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7[CRLF]
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.0.7) Gecko/20060911 Camino/1.0.3 Web-Sniffer/1.0.24[CRLF]
Referer: http://web-sniffer.net/[CRLF]
[CRLF]

Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7[CRLF]

You serve ISO-8859-1 first, then UTF-8, and then anything. Maybe one of the sources of your problems is there.

1. Convert all your pages to one encoding only: UTF-8.
2. Change the configuration of your server to send only UTF-8.
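For the first step, re-encoding an existing page can be sketched in Python as below. The function names and the file path in the usage comment are hypothetical, and the source encoding has to be known (or correctly guessed) beforehand. The charset in the page's own meta tag would still need to be updated by hand.

```python
from pathlib import Path


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


def convert_file(path: str, source_encoding: str) -> None:
    """Rewrite a file in place as UTF-8 (keep a backup before trying this)."""
    file = Path(path)
    file.write_bytes(to_utf8(file.read_bytes(), source_encoding))


# In-memory check with hypothetical sample text.
shift_jis_bytes = "日本語".encode("shift_jis")
print(to_utf8(shift_jis_bytes, "shift_jis") == "日本語".encode("utf-8"))  # True

# Hypothetical usage on a real file:
# convert_file("why.html", "shift_jis")
```

The second step depends on the server software, which the thread does not name. If it happens to be Apache, the `AddDefaultCharset UTF-8` directive is one way to set the charset sent in the Content-Type header.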