-1

I have one problem. I have excel file saved as CSV and I need to read that file with PHP and insert into mysql but problem is with char set specifically čćšđž. I tried utf8_encode() and almost everything I could think of.

Examle:

It inserts "Petroviæ" but it should be "Petrović"

EDIT:

<?php

mysql_connect("localhost", "user", "pw");  
mysql_select_db("database");  


$fajl = "Prodajna mreza.csv"; 
$handle = @fopen($fajl, "r");
if ($handle) {
    $size = filesize($fajl);

    if(!$size) {
        echo "File is empty.\n";
        exit;
    }

    $csvcontent = fread($handle,$size);
    $red = 1;

    foreach(explode("\n",$csvcontent) as $line) {
        if(strlen($line) <= 20)
        {
            $red++;
            continue;
        }
        if($red == 1)
        {
            $red++;
            continue;
        }
        $nesto = explode(",", $line);

        if($nesto[0] == '')
            continue;

        mysql_query("INSERT INTO table(val1, val2, val3, val4, val5, val6, val7, val8) VALUES ('".$nesto[0]."','".$nesto[1]."','".$nesto[2]."','".$nesto[3]."','".$nesto[4]."','".$nesto[5]."','".$nesto[6]."','".$nesto[7]."')");  

        $red++;
    }
    fclose($handle);
}
mysql_close();
?>
  • What is your problem right now, the encoding? Did you check the collation of the database tables and the encoding of the file? – Diego Agulló Jul 19 '12 at 13:28
  • @DiegoAgulló tried changing collation but nothing happens. – Rade Josipovic Jul 19 '12 at 13:31
  • There are two aspects here: reading the file and storing its content in the database. Either might cause problems for you. So to avoid any issues, you might want to generate a fixed value in PHP, e.g. as `"Petrovi\xc4\x87"` (assuming an UTF-8 database connection), and see whether you can store that successfully. This should help you narrow things down. Depending on whether that works or not, we'll either need details on how you read the file, or on how you connect to your database. – MvG Jul 20 '12 at 10:54

3 Answers3

1

First off: Using this mysql extension is discouraged. So you might want to switch to something else. Also notice that the way you compose your query by simply pasting strings makes it vulnerable to SQL injection attacks. You should only do this if you are really really sure that there won't be any ugly surprises in the content of the files you read.

It appears that neither your file reading nor the client-side mysql code does anything related to charset conversion, so I'd assume that those simply pass on bytes, without caring about their interpretation. So you only have to make sure that the server interprets those bytes correctly.

Judging from the example you gave, where a ć got turned into an æ, I'd say your file is in ISO-8859-2 but the database is reading it differently, most probably as ISO-8859-1. You should ensure that your database actually can accept all ISO-8859-2 characters for its columns. Read the MySQL manual on character set support and set some suitable default characterset (probably best on the database level) either to utf8 (preferred) or latin2. You might have to recreate your tables for this change to apply.

Next, you should set the character set of the connection to match that of the file. So utf8 is definitely wrong here, and latin2 the way to go.

  • Using your current API, mysql_set_charset("latin2") can be used to accomplish that.
  • That page also describes equivalent approaches for use with other frontends.
  • As an alternative, you can use a query to set this: mysql_query("SET NAMES 'latin2';");

After all this is done, you should also ensure that things are set up correctly for any script which reads from the database. In other words, the charset of the generated HTML must match the character_set_results of the MySQL session. Otherwise it might well be that things are stored correctly in your database but still appear broken when displayed to the user. If you have the choice, I'd say use utf8 in that case, as doing so makes it easier to include different data whenever the need arises.

If some problems remain, you should pinpoint whether they are while reading from file into php, while exchanging data with mysql, or while presenting the result in HTML. The string "Petrovi\xc4\x87" is the utf8 representation of your example, and "Petrovi\xe6" is the latin2 form. You can use these strings to explicitely pass on data with a known encoding, or to check an incoming transferred value against one of these strings.

MvG
  • 57,380
  • 22
  • 148
  • 276
  • Setting mysql_query("SET NAMES 'latin2'"); almost fixed my problem. Now it just removes letters "š" and "ž" from string like they are not in string. Other letters are fine. – Rade Josipovic Jul 22 '12 at 07:31
  • When are they removed, when writing to or when reading from the database? Check that by checking the `LENGTH` of a string with only a single such letter in it. – MvG Jul 22 '12 at 09:15
  • strlen() gives me correct number of letters but it just don't show when I browse db nor when I echo output of db query – Rade Josipovic Jul 22 '12 at 10:55
  • `strlen()` in the PHP generating the HTML result? What `NAMES` did you set in that script? What is the “correct number” you got? If you did `SET NAMES 'utf8'`, then each special character should correspond to two bytes in PHP. You could try `header("Content-Type: text/html; charset=UTF-8")` to ensure that the HTML gets declared UTF-8 as well. If all fails, perhaps open a specific question about this aspect. And don't forget to accept an answer here. – MvG Jul 22 '12 at 17:09
  • Using SET NAMES 'latin2' and header('Content-Type: text/html; charset=latin2'); "Correct number" is number with chars š and ž but it just wont print them. Example: "Triše Kaclerovića 11" and it has len 20 but it prints "Trie Kaclerovića 11" and still gives me length of 20 chars. – Rade Josipovic Jul 22 '12 at 19:16
0

it shouldn't be a problem importing a csv from a csv to a database if both the file and the database collation are utf-8.

<?php
db = @mysql_connect('localhost', 'user', 'pass');
@mysql_select_db('my_database');

$CSVFile = "file.csv";

mysql_query('LOAD DATA LOCAL INFILE "' . $CSVFile . '" INTO TABLE my_table
  FIELDS TERMINATED BY "," LINES TERMINATED BY "\\r\\n";');

mysql_close($db);

?>
kawashita86
  • 1,555
  • 3
  • 18
  • 25
-1

you can add your own. CSV in phpmyadmin...
Import -> format = csv and click on "import"

Or if you don't want use phpmyadmin !

BULK INSERT csv_dump 
    FROM 'c:\file.csv' 
    WITH 
    ( 
        FIELDTERMINATOR = '\t', 
        ROWTERMINATOR = '\n' 
    )
Julien
  • 1,946
  • 3
  • 33
  • 51