3

I have a PHP application whose files are encoded in Greek ISO (ISO-8859-7). I want to convert the files to UTF-8, but simply saving them as UTF-8 isn't enough, since the Greek text gets garbled. Is there an "automatic" way to do this, so that I can convert my app's encoding completely without having to go through each file and rewrite the text?

bikey77

4 Answers

6

On a Linux system, if you are sure all files are currently encoded in ISO-8859-7, you can do this:

bash> find /your/path -name "*.php" -type f \
    -exec iconv -f ISO-8859-7 -t UTF-8 -o "{}.tmp" "{}" \; \
    -exec mv "{}.tmp" "{}" \;

This converts all PHP script files located in /your/path and in all of its sub-directories. Remove -name "*.php" to convert all files.


Since you are on Windows, the easiest option would be a PHP script like this:

<?php
$path = realpath('C:\\your\\path');

$iterator = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($path), 
    RecursiveIteratorIterator::SELF_FIRST
);

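foreach ($iterator as $fileName => $file) {
    // Skip directories; every regular file under $path is converted in place.
    if (!$file->isFile()) {
        continue;
    }
    $converted = iconv('ISO-8859-7', 'UTF-8', file_get_contents($fileName));
    // iconv() returns false on failure; do not overwrite the file in that case.
    if ($converted !== false) {
        file_put_contents($fileName, $converted);
    }
}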
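One caveat with an in-place script like the one above is that running it a second time would convert already-converted files again and garble them. Below is a minimal sketch of a guard, assuming the mbstring extension is available. The helper name `convertGreekToUtf8` is hypothetical and not part of the original answer. It relies on the observation that Greek text in ISO-8859-7 is almost never a valid UTF-8 byte sequence, so input that already validates as UTF-8 is left alone:

```php
<?php
/**
 * Convert ISO-8859-7 bytes to UTF-8 unless they already look like UTF-8.
 * Hypothetical helper, for illustration only.
 */
function convertGreekToUtf8(string $bytes): string
{
    // Pure ASCII and already-converted text are valid UTF-8: leave them alone.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    $converted = iconv('ISO-8859-7', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException('Input is not valid ISO-8859-7');
    }
    return $converted;
}

$iso   = "\xE1\xE2\xE3";             // "αβγ" encoded as ISO-8859-7
$once  = convertGreekToUtf8($iso);   // converted to UTF-8
$twice = convertGreekToUtf8($once);  // second pass changes nothing
```

Inside the loop, the `iconv(...)` call could be swapped for this helper, so that a repeated run leaves converted files untouched.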
RandomSeed
1

Try the iconv function:

$new_string = iconv("ISO-8859-7", "UTF-8", $old_string);
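As a small illustration of what this call does at the byte level, the sketch below converts a three-letter Greek string. The sample bytes are an assumption chosen for the example. `mb_convert_encoding` (from the mbstring extension) is included as an alternative that should give the same result:

```php
<?php
// "αβγ" in ISO-8859-7: one byte per letter.
$old_string = "\xE1\xE2\xE3";

// The same conversion, once with iconv and once with mbstring.
$via_iconv = iconv('ISO-8859-7', 'UTF-8', $old_string);
$via_mb    = mb_convert_encoding($old_string, 'UTF-8', 'ISO-8859-7');

// In UTF-8 each of these letters takes two bytes.
$length_before = strlen($old_string); // 3
$length_after  = strlen($via_iconv);  // 6
```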
cwurtz
  • This will only convert the contents, I would like to entirely convert the files, including the contents. – bikey77 Apr 08 '14 at 17:49
  • Ah, I read your last sentence as asking how to convert the data automatically without having to retype it manually. You are going to have to write your own function to traverse your app and update the encoding of your files. If iconv doesn't work for you, try mb_convert_encoding (http://php.net/manual/en/function.mb-convert-encoding.php). Also, when you say the text gets garbled, is that when viewing the file in a text editor, or when you output the contents of the file from PHP? – cwurtz Apr 09 '14 at 14:01
  • No worries. The 2nd. – bikey77 Apr 09 '14 at 14:15
  • Did you send a UTF8 content type header with the output? As well as set the content type to utf8 in the html? – cwurtz Apr 09 '14 at 15:19
  • Yes. The problem resides in the fact that the original app encoding was iso-8859-7, not only the data from the db but the files as well. – bikey77 Apr 09 '14 at 15:49
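For the checks raised in the comments above, here is a minimal sketch of declaring UTF-8 both in the HTTP header and in the HTML itself. The function names `sendUtf8Header` and `utf8Page`, as well as the page markup, are made up for illustration:

```php
<?php
// Hypothetical helper: declare UTF-8 in the HTTP response header.
function sendUtf8Header(): bool
{
    // header() only works before any output has been sent.
    if (headers_sent()) {
        return false;
    }
    header('Content-Type: text/html; charset=UTF-8');
    return true;
}

// Hypothetical helper: declare the same charset inside the document.
function utf8Page(string $body): string
{
    return '<!DOCTYPE html><html><head><meta charset="UTF-8"></head><body>'
        . $body . '</body></html>';
}
```

A page would call `sendUtf8Header()` first and then echo the result of `utf8Page(...)`. Neither step changes the bytes stored in the source files, so the files themselves still have to be converted.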
1
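<?php
// Write $content (which must already be UTF-8) to $filename,
// prefixed with a UTF-8 byte order mark.
function writeUTF8File($filename, $content) {
    $f = fopen($filename, "w");
    if ($f === false) {
        return false; // could not open the file for writing
    }
    // UTF-8 byte order mark: EF BB BF
    fwrite($f, pack("CCC", 0xef, 0xbb, 0xbf));
    fwrite($f, $content);
    fclose($f);
    return true;
}

?>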
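One thing to watch with a byte order mark: in a PHP source file, any bytes before the opening `<?php` tag are sent as output, so a BOM can trigger "headers already sent" warnings. Below is a sketch of a helper that strips a leading BOM if one is present. The name `stripUtf8Bom` is hypothetical:

```php
<?php
// Remove a leading UTF-8 byte order mark (EF BB BF), if there is one.
function stripUtf8Bom(string $content): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($content, $bom, 3) === 0) {
        // Cast guards against substr() returning false on older PHP versions.
        return (string) substr($content, 3);
    }
    return $content;
}

$with_bom = "\xEF\xBB\xBF<?php echo 'ok';";
$clean    = stripUtf8Bom($with_bom);
```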
vimal1083
0

The following PowerShell script should work for you. Go to Start > Run > powershell and paste the code after modifying the required lines.

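$sourcepath = "d:\temp\old\"
$targetpath = "d:\temp\new\"
foreach ($file in Get-ChildItem $sourcepath -Filter *.php -Recurse) {
  # Path relative to $sourcepath, so files in sub-directories keep their place
  $relative = $file.FullName.Substring($sourcepath.Length)
  $target = Join-Path $targetpath $relative
  New-Item -ItemType Directory -Force -Path (Split-Path $target) | Out-Null
  $content = [System.IO.File]::ReadAllBytes($file.FullName)
  $str = [System.Text.Encoding]::GetEncoding("ISO-8859-7").GetString($content)
  # $str = $str.Replace("ISO-8859-7", "UTF-8")
  [System.IO.File]::WriteAllText($target, $str)
}

You may remove the # character in front of the $str.Replace line to make some replacements before saving.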

dereli