I need to import files extracted created on a Windows platform into a MySQL 5.5 database created and running on a Centos server. I develop the scripts on my OSX 10.9 laptop running a MySQL 5.5 database server. The files contain Dutch texts, so they can have accented letters in the names of people.
I need a reliable way to confirm the import is correct on all platforms.
First attempt:
- bash shell script (loadFiles.sh) created that drops database, recreates the db with the provided schema dump and loads the files using
LOAD DATA LOCAL INFILE
- db schema is created using latin1 encoding on OSX
- files are imported as provided
- accented names look good in Sequel Pro on OSX
- running the loadFiles.sh script on the Centos server -> names are garbled
- dumping the OSX version of the database and importing on the Centos server -> names are good
Second attempt:
- db schema is changed to utf8
- loadFiles.sh is expanded to test and convert the files using:
`
file $FILE | grep ISO-8859 >> /dev/null
NEEDCONVERSION=$(echo $1)
if [ "$NEEDCONVERSION == "0"]; then
iconv -f LATIN1 -t UTF8 $FILE > $CONVERTEDFILE
fi
- converted files are imported
- running the script on the Centos server -> all names look good
- inspecting the files on the commandline using less in Centos -> names look fine
- running the script on OSX -> names are garbled
- inspecting the files on the commandline using less in OSX -> names look fine
I've tried:
- different encoding with
iconv
(UTF8-MAC vs UTF8) -> the only difference is that the way the names are garbled is different - I followed the advise in this question (adding
--default-character-set=utf8
to the MySQL import statement and making sure the character collation is correct), but it didn't solve my problem.
Update: when I change the way Sequel Pro views the data on OSX (basically it says the commands below), the names look fine again. I don't need to do this when viewing the data on the Centos server using Sequel Pro.
set character_set_result=latin1;set character_set_client=latin1;
Any help would be greatly appreciated.