Subject: Re: ConnectorJ 3.1 --> java.io.IOException [RESOLVED] Changed to "Unicode JDBC Encoding Problems"
From: Theodosios Paschalidis (theo@hotmail.com)
Date: 01/21/2004 11:45:31 AM
List: com.mysql.lists.java

Dear Mark,

I did as you suggested and wrote a simple test class, DriverTest. It creates a table like this:

statement.execute("CREATE TABLE greekunicode(ID INTEGER NOT NULL AUTO_INCREMENT,UpperCase VARCHAR (30),LowerCase VARCHAR (30),Accented VARCHAR (30),Special VARCHAR (30),PRIMARY KEY(ID)) TYPE = InnoDB, DEFAULT CHARACTER SET utf8;");

I then insert the values using Unicode escape sequences, e.g.:

String upper = "\u0394\u930F\u039A\u0399\u039C\u0397";
String lower = "\u03B4\u03BF\u03BA\u03B9\u03BC\u03B7";
String accented = "\u03B4\u03CC\u03BA\u03AF\u03BC\u03AE";
String special = "\u037E\u03C2\u03B0";

statement.execute("INSERT INTO greekunicode VALUES ('1','"+upper+"','"+lower+"','"+accented+"','"+special+"');");
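Before suspecting the driver, it may be worth confirming that every escape in the strings above really falls in the Greek and Coptic block (U+0370 to U+03FF). The following is only a sketch and not part of the original DriverTest: the class and method names (GreekEscapeCheck, nonGreek) are hypothetical, and it assumes Java 8 or later rather than the 1.4.2 SDK mentioned in this message. Note that the upper string as quoted contains \u930F, which lies in the CJK Unified Ideographs block; it may be a transposition of \u039F, though the message does not say.

```java
import java.util.ArrayList;
import java.util.List;

class GreekEscapeCheck {
    // Hypothetical helper: lists the code points of s that lie outside
    // the Greek and Coptic block (U+0370..U+03FF), formatted as U+XXXX.
    public static List<String> nonGreek(String s) {
        List<String> out = new ArrayList<>();
        s.codePoints().forEach(cp -> {
            if (Character.UnicodeBlock.of(cp) != Character.UnicodeBlock.GREEK) {
                out.add(String.format("U+%04X", cp));
            }
        });
        return out;
    }

    public static void main(String[] args) {
        // Test strings copied verbatim from the message above.
        String[][] samples = {
            {"upper", "\u0394\u930F\u039A\u0399\u039C\u0397"},
            {"lower", "\u03B4\u03BF\u03BA\u03B9\u03BC\u03B7"},
            {"accented", "\u03B4\u03CC\u03BA\u03AF\u03BC\u03AE"},
            {"special", "\u037E\u03C2\u03B0"},
        };
        for (String[] sample : samples) {
            System.out.println(sample[0] + " outside Greek block: " + nonGreek(sample[1]));
        }
    }
}
```

Output is printed as U+XXXX code points rather than raw characters so that the result does not depend on the console encoding.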

My default DOS code page is 1253 (the OEM Greek encoding that is supported by Java), which I get using System.getProperty("file.encoding"). Having tested all DOS code pages, I found that 737 works better, so I do:

PrintWriter sout = new PrintWriter(new OutputStreamWriter(System.out, "CP737"), true);
sout.println("UpperCase: "+rs.getString("UpperCase")+" LowerCase: "+rs.getString("LowerCase")+" Accented: "+rs.getString("Accented")+" Special: "+rs.getString("Special"));

This displays most characters correctly. But if I do a SELECT from the MySQL console, nothing meaningful seems to be stored (none of the characters come out right).
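One possible reason only "most" characters display is that the console code page simply has no slot for some of them, independent of what the driver returns. The sketch below checks this with java.nio.charset.CharsetEncoder.canEncode; the class and method names (ConsoleCharsetCheck, unencodable) are hypothetical, and it assumes a JVM newer than the 1.4.2 SDK used here. Whether "Cp737" is available depends on the JVM's installed charsets, so the code checks before using it. As far as I know, file.encoding usually reflects the Windows ANSI code page while the DOS console uses a separate OEM code page, which might explain why 737 looked better than 1253, but that is an assumption about this setup.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

class ConsoleCharsetCheck {
    // Hypothetical helper: returns the characters of s that cs cannot encode.
    public static String unencodable(String s, Charset cs) {
        CharsetEncoder encoder = cs.newEncoder();
        StringBuilder missing = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!encoder.canEncode(c)) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    // Formats each character as U+XXXX so the report survives any console encoding.
    static String asCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("U+%04X ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Accented and special strings copied from the message above.
        String sample = "\u03B4\u03CC\u03BA\u03AF\u03BC\u03AE" + "\u037E\u03C2\u03B0";
        for (String name : new String[] {"Cp737", "windows-1253", "UTF-8"}) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this JVM");
                continue;
            }
            Charset cs = Charset.forName(name);
            System.out.println(name + " cannot encode: " + asCodePoints(unencodable(sample, cs)));
        }
    }
}
```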

Additionally, running

executeQuery("SELECT * FROM greeknunicode WHERE UpperCase='"+upper+"'");

throws an SQL exception:

java.sql.SQLException: Base table or view not found message from server: "Table 'test.greeknunicode' doesn't exist"

while the table actually exists. Because of that I cannot even test the retrieval as you initially suggested. Is your aim to find exactly which characters are problematic? I could do an exhaustive test if this is the case but I do not think that ANY of them work. Any suggestions?

I have tested the same code using Latin characters in Unicode and it works great through JDBC, so it is definitely just a Greek Unicode handling issue in the driver. The Java SDK I use is 1.4.2_3 and I run Windows XP Pro (with the service pack).

Any ideas on that would be really appreciated, as I need to get this working. I am already thinking of alternative ways to achieve the same result, which is a pity since Greek Unicode characters are meant to be supported properly in this MySQL version.

Thank you in advance, Theo


Theodosios Paschalidis wrote:

Dear Mark,

Thank you very much for your quick answer. Unfortunately, the ConnectorJ upgrade changed nothing. I still do not get Greek back through JDBC.

As I understand it, Java converts all UTF strings to the default locale, so I changed my regional settings to Greek, but still nothing. I suspected that it was the "println(rs.getString())" conversion that went wrong, so I used:

PrintWriter sout = new PrintWriter(new OutputStreamWriter(System.out, "UTF-8"), true);
sout.println(rs.getString());

I also tried replacing "UTF-8" with "WINDOWS-65001" (the Windows UTF-8 code page), which threw an unsupported-encoding exception. I additionally tried "WINDOWS-1253" (Greek ANSI): no exception, but no Greek.

I use the driver like

"jdbc:mysql://localhost/LOGOSDB?useUnicode=true&characterEncoding=UTF-8");

I think that the only remaining conclusion is that it is a ConnectorJ issue.

Without the particular code you are using, or the characters you are trying to store or retrieve, I can't help you much. The UTF-8 encoding is tested as part of the testsuite, and we rely on users and customers to point out places where we're missing coverage on a particular character set and JVM/OS combination.

We would appreciate if you can create a standalone testcase that demonstrates the error, that includes creating the table (the simplest possible table would work), and uses the unicode escape sequence to define the characters that are not being stored/retrieved correctly (i.e. \uNNNN).

Given the many vagaries of character encoding issues, sending these characters 'as-is' through e-mail or the web often scrambles what they are, leading to a long period of back-and-forth discussions, which is why we recommend using \uNNNN.
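As an illustration of the \uNNNN recommendation, a small helper along these lines could turn a problematic string into its escaped form before it is pasted into an e-mail. This is a sketch only: the class and method names (UnicodeEscapes, toEscapes) are hypothetical and are not part of Connector/J or its test suite. It escapes every UTF-16 code unit outside printable ASCII and leaves the rest untouched.

```java
class UnicodeEscapes {
    // Hypothetical helper: rewrites every char outside printable ASCII
    // as a backslash-u escape with four uppercase hex digits.
    public static String toEscapes(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c >= 0x20 && c < 0x7F) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Lowercase sample string from the test case in this thread.
        System.out.println(toEscapes("lower=\u03B4\u03BF\u03BA\u03B9\u03BC\u03B7"));
    }
}
```

The escaped text is plain ASCII, so it should survive mail clients and web archives unchanged.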

-Mark

--
Mr. Mark Matthews
MySQL AB, Software Development Manager, J2EE and Windows Platforms
Office: +1 708 557 2388
www.mysql.com
