3 messages in com.mysql.lists.mysql: RE: utf8 importing problem
From              Sent On
pyt...@hope.cz    27 Oct 2006 23:21
Jerry Schwartz    30 Oct 2006 07:36
Jerry Schwartz    30 Oct 2006 13:09
Subject: RE: utf8 importing problem
From: Jerry Schwartz (jsch@the-infoshop.com)
Date: 10/30/2006 01:09:25 PM
List: com.mysql.lists.mysql

Remember that my MySQL skills are at the beginner level, and this whole Unicode / utf8 business always gives me a headache.

Any Unicode or utf8 characters with diacritical marks will look funky in a DOS window, because the console displays text in its own code page rather than utf8.

Normally what I do is take my data, convert it from utf8 to utf8, and see if the results match what I originally had. I'm not sure how reliable this technique is, but it will at least recognize files which have characters that are not utf8-encoded. This relies on PHP actually performing the conversion rather than treating utf8-to-utf8 as a "null" conversion and skipping it.
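
The round-trip check described above can be sketched in Python. This is only an illustration under assumptions: the original script was PHP and is not shown, the function name is made up, and the samples are invented. It leans on Python's strict utf8 decoder, which rejects malformed byte sequences instead of passing them through.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict UTF-8 and re-encodes to the same bytes."""
    try:
        text = data.decode("utf-8")  # strict mode: raises on malformed sequences
    except UnicodeDecodeError:
        return False
    # Round trip: re-encoding must reproduce the original bytes exactly.
    return text.encode("utf-8") == data


# Hypothetical samples: one genuinely UTF-8, one written as CP1252.
utf8_sample = "Příliš žluťoučký kůň".encode("utf-8")
cp1252_sample = "café".encode("cp1252")  # b'caf\xe9', not valid UTF-8

print(is_valid_utf8(utf8_sample))
print(is_valid_utf8(cp1252_sample))
```

A check like this only proves that the bytes are well-formed utf8. Pure ASCII always passes, and a CP1252 file can occasionally pass by accident if its accented bytes happen to form valid utf8 sequences.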

My particular problem was that I had files that were created in Windows applications, using CP1252 encoding. I needed to get these into utf8, and wanted to test my results. I have a PHP script to do this. My translator seems to work: at least, my results look right after the translation. What disturbs me is that the translated (utf8) files also seem to be CP1252, which seems counter-intuitive. I did this a while ago, so it may be that CP1252 has alternate encodings that are a superset of utf8 (and I've forgotten).
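
A minimal Python sketch of the CP1252-to-utf8 conversion described above. The original PHP script is not shown, so the function name and the sample text are illustrative assumptions. It also suggests a likely reason the converted files still "seem to be" CP1252: CP1252 is a single-byte encoding in which nearly every byte value maps to some character, so utf8 bytes usually decode as CP1252 without any error. They just come out as the wrong characters.

```python
def cp1252_to_utf8(data: bytes) -> bytes:
    """Interpret bytes as CP1252 text and re-encode that text as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


original = "café".encode("cp1252")    # 4 bytes: b'caf\xe9'
converted = cp1252_to_utf8(original)  # 5 bytes: b'caf\xc3\xa9'

# The converted bytes are valid UTF-8 and give back the intended text.
print(converted.decode("utf-8"))      # café

# They also decode as CP1252 without raising, but as two wrong characters.
print(converted.decode("cp1252"))     # cafÃ©
```

So a tool that merely asks "can this be read as CP1252?" will say yes to almost any file, which makes that test uninformative for telling the two encodings apart.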

How are you looking at the data? Try directing some of it from MySQL into a text file, and open the text file with Notepad. It will give you a choice of opening the file as ANSI, Unicode, or utf8. Try opening the file as utf8 (Notepad's "Unicode" choice means UTF-16, not utf8): if the data is not really utf8, then the letters with diacritical marks should look wrong (or be missing).

If MySQL's engine is as blind as PHP, you can do the same round-trip check in SQL. The problem, as I see it, is that the engine already believes the data is utf8, so the comparison might never report a difference.

SELECT COUNT(*) FROM table1 WHERE CONVERT(field1 USING utf8) != field1;
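
Another way to see what is really stored, independent of whatever program displays it, is to look at the raw bytes, for example with MySQL's HEX() function (SELECT HEX(field1) FROM table1). The Python sketch below is an illustration under assumptions: it takes such a hex string pasted in by hand, and the helper name and the three sample values are hypothetical rather than taken from the database in question.

```python
from typing import Optional


def decode_hex_as_utf8(hex_string: str) -> Optional[str]:
    """Decode a hex dump (e.g. from MySQL's HEX()) as UTF-8; None if it is not valid UTF-8."""
    raw = bytes.fromhex(hex_string)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


print(decode_hex_as_utf8("C3A9"))      # é    (stored correctly as UTF-8)
print(decode_hex_as_utf8("C383C2A9"))  # Ã©   (valid UTF-8, but encoded twice)
print(decode_hex_as_utf8("E9"))        # None (a single CP1252/Latin-1 byte, not UTF-8)
```

The middle case is the one a pure validity test cannot catch: data that was converted to utf8 twice is still well-formed utf8, so it has to be spotted by the doubled byte pattern or by reading the decoded text.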

Regards,

860.674.8796 / FAX: 860.674.8341

-----Original Message-----
From: pyt@hope.cz [mailto:pyt@hope.cz]
Sent: Monday, October 30, 2006 1:05 PM
To: Jerry Schwartz
Subject: RE: utf8 importing problem

Jerry, I checked the imported data (the SQL file) and the data are in utf8 encoding. Is there a way to check the imported data in the table itself, to rule out a problem in the application?

Thank you for your reply.
L.

Most likely the UTF8 is still in the database, but whatever program you are using to view it is not displaying UTF8 properly. MySQL's command-line program will not, for example, even if you SET NAMES "utf8".

Regards,

860.674.8796 / FAX: 860.674.8341

-----Original Message-----
From: pyt@hope.cz [mailto:pyt@hope.cz]
Sent: Saturday, October 28, 2006 2:22 AM
To: mys@lists.mysql.com
Subject: utf8 importing problem

I use a MySQL database with the utf8 character set and utf8_czech_ci collation. It works well on a Linux server, but when I export the data and import it into the same database running on an XP machine, the utf8 is gone. Instead of the proper encoding there are some strange characters.

On the Linux machine I exported the data to the file /home/Result.sql with

mysqldump --default-character-set=utf8 mimi >/home/Result.sql

Then I downloaded the file to my XP machine and imported the data there with

mysql --default-character-set=utf8 mimi < Result.sql

Is this correct?

Any help would be appreciated

------- End of forwarded message -------