
I'm creating a zip file with /usr/bin/zip from PHP and sending it as a download. The zip file should contain CSV files whose names include non-ASCII (Japanese) characters. The downloaded file is zero bytes and is not a valid archive.

chdir($tmp_dir); // this is the directory where the files are written into

// CSV files that will be included in the zip file.
// assuming that the file already exists in $tmp_dir
$files = array();
$filename = "ショップ" . date("Ymd") . ".csv"; 
$fpath = $tmp_dir. DS . mb_convert_encoding($filename, "SJIS", "UTF-8");
$files[] = $fpath;

// The zip file to be created
$zip_file = "archive_" . date("Ymd").".zip";
$cmd = "/usr/bin/zip $zip_file *.csv"; // use the $zip_file name defined above
exec($cmd);

// Force download
$fpath = $zip_file;
header("Content-Type: application/zip");
header('Content-Disposition: attachment; filename="' . $zip_file . '"');
header('Accept-Ranges: bytes');
if ($this->isIE()) {
  header("Cache-Control:private");
  header("Pragma:private"); 
}
header('Content-Length: ' . filesize($fpath));
readfile($fpath);

I also tried ZipArchive, but the same problem occurs.

chdir($tmp_dir); // this is the directory where the files are written into

// CSV files that will be included in the zip file.
// assuming that the file already exists in $tmp_dir
$files = array();
$filename = "ショップ" . date("Ymd") . ".csv";
$fpath = $tmp_dir. DS . mb_convert_encoding($filename, "SJIS", "UTF-8");
$files[] = $fpath; 

// The zip file to be created    
$zip_file = "archive_" . date("Ymd").".zip";    
$zip = new ZipArchive();
$zip->open($zip_file, ZipArchive::CREATE); // use the $zip_file name defined above
foreach ($files as $v) {
  $zip->addFile(basename($v));
}
$zip->close();

// Force download
$fpath = $zip_file;
header("Content-Type: application/zip");
header('Content-Disposition: attachment; filename="' . $zip_file . '"');
header('Accept-Ranges: bytes');
if ($this->isIE()) {
  header("Cache-Control:private");
  header("Pragma:private"); 
}
header('Content-Length: ' . filesize($fpath));
readfile($fpath);

Is there any workaround for this? When I remove the Japanese characters from the file name, it works.

ndm
Sithu

1 Answer


I solved this problem by using the iconv function to convert the file name to the proper charset.

$filename = iconv('SJIS', 'CP932//TRANSLIT', "ショップ" . date("Ymd") . ".csv");
$fpath = $tmp_dir. DS . $filename;

This converts the file name from the input charset SJIS to the output charset CP932. CP932 is Microsoft's code page for Shift JIS.
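As a minimal sketch of the same idea, the conversion could be wrapped in a small helper that fails loudly when iconv cannot convert the name, instead of passing a broken file name on to zip. The helper name toCp932FileName is hypothetical, and the example assumes the PHP source file is saved as UTF-8 and that the iconv extension supports the CP932 charset on your system; pass a different source charset as the second argument if your strings are already in Shift JIS.

```php
<?php
// Hypothetical helper (not from the original post): convert a file name
// to CP932 (code page 932) and throw if the conversion fails.
function toCp932FileName(string $name, string $fromCharset = 'UTF-8'): string
{
    // iconv() returns false when the string cannot be converted;
    // @ suppresses the accompanying notice so we can handle it ourselves.
    $converted = @iconv($fromCharset, 'CP932//TRANSLIT', $name);
    if ($converted === false) {
        throw new RuntimeException(
            "Cannot convert file name from {$fromCharset} to CP932"
        );
    }
    return $converted;
}

// Assumption: this source file is saved as UTF-8.
$filename = toCp932FileName("ショップ" . date("Ymd") . ".csv");
```

The converted name can then be joined with $tmp_dir exactly as in the snippet above.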

Code page 932 (abbreviated as CP932, also known by the IANA name Windows-31J) is Microsoft's extension of Shift JIS to include NEC special characters (Row 13), NEC selection of IBM extensions (Rows 89 to 92), and IBM extensions (Rows 115 to 119). The coded character sets are JIS X0201:1997, JIS X0208:1997, and these extensions. Windows-31J is often mistaken for Shift JIS: while similar, the distinction is significant for computer programmers wishing to avoid mojibake, and a good reason to use the unambiguous UTF-8 instead. The windows-31J name however is IANA's and not recognized by Microsoft, which historically has used shift_jis instead.

Sithu