I'm trying to use PHP to read a Windows folder that contains files with Spanish names (for example, Español.doc).

However, the filenames print out incorrectly: "Espan??ol.doc" in the above case.

The function mb_detect_encoding($file) returns ASCII but somehow the ñ is not displayed. Is there a quick fix for this?

I am using PHP 5.4.16, Windows 7 Home Premium Edition Service Pack 1, Apache/2.4.4 and (Win32) OpenSSL/0.9.8y.

halfer
iPad Man
  • In relation to the (now deleted) answer from Shankar Damodaran, maybe it did work, but the place you saw that error message does not support UTF-8? Are you in a web or a console environment? – halfer Apr 05 '14 at 13:39
  • I'm using php to read a folder and echo out its contents. Unfortunately some of the filenames (and subfoldernames) contain Spanish characters which appear as question marks. This thread seems to indicate that PHP can't read UTF-8 characters. http://stackoverflow.com/questions/708017/can-a-php-file-name-or-a-dir-in-its-full-path-have-utf-8-characters?lq=1 – iPad Man Apr 05 '14 at 13:45
  • Bear in mind that post is five years old - UTF-8 support will have changed dramatically since then. Try echoing out "Español" in your web page - perhaps UTF-8 filename support is fine, but the aforementioned error message is not rendering correctly because of your page's character set. – halfer Apr 05 '14 at 13:53
  • @halfer Unfortunately, I think it's still a problem. I tried setting header('Content-Type: text/html; charset=UTF-8'); and also tried 3 different browsers, changing the encoding in each, but could not get it to work. I also tried the settings in httpd.conf and php.ini. mb_detect_encoding() reports the returned string as ASCII, but I think it's not a full UTF-8 string. – iPad Man Apr 05 '14 at 13:56
  • echo "Español"; displays Español correctly. Ole! – iPad Man Apr 05 '14 at 14:05
  • [This question](http://stackoverflow.com/questions/1525830/how-do-i-use-filesystem-functions-in-php-using-utf-8-strings) is relevant, but also old. – halfer Apr 05 '14 at 14:18

2 Answers

Try converting the filename to cp1252 like this:

// Convert the UTF-8 name to the Windows-1252 code page before checking the filesystem.
$exists = file_exists(iconv('utf-8', 'cp1252', $utffilename));
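
For the case in the question (reading names from a folder and echoing them), the conversion would run in the opposite direction. The following is a minimal sketch, assuming that PHP 5.4 on Windows returns directory entries in the system ANSI code page and that this code page is Windows-1252; both are assumptions, not something the question confirms. `nameToUtf8` and `listDirUtf8` are hypothetical helper names introduced only for illustration.

```php
<?php
// Assumption: names arrive encoded as Windows-1252; pass another $from for other code pages.
function nameToUtf8($name, $from = 'CP1252')
{
    $converted = iconv($from, 'UTF-8', $name);
    // iconv() returns false on failure; fall back to the raw name in that case.
    return $converted === false ? $name : $converted;
}

// Hypothetical helper: list a directory with every name converted for UTF-8 output.
function listDirUtf8($dir)
{
    $entries = scandir($dir);
    if ($entries === false) {
        return array();
    }
    $names = array();
    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue; // skip the current and parent directory entries
        }
        $names[] = nameToUtf8($entry);
    }
    return $names;
}

// "\xF1" is the single byte for ñ in Windows-1252.
echo nameToUtf8("Espa\xF1ol.doc"), PHP_EOL;
```

If a name already contains literal question marks when PHP receives it, the original characters were probably lost before this point, and no conversion of this kind can restore them.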
ek9

Here is something I've tried on 5.3.x/Ubuntu, in a console environment:

<?php

$file = 'Español.doc';
echo file_get_contents($file);

The file contains the word "Hello", and it prints to the screen fine. Thus, I think it is safe to say that even older versions of PHP support UTF-8 file names.

Could the problem be that PHP on Windows behaves differently? Try this in a console too.

Also, check with your browser to see what rendering mode it is using. For Firefox, use View Page Info and check the Encoding in the General tab.
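
To tell a rendering problem apart from a filesystem problem, it may help to look at the raw bytes PHP actually receives for each name. This is a minimal sketch; `inspectName` is a hypothetical helper, the three sample strings are made-up stand-ins for what `readdir` might return, and the mbstring extension is assumed to be loaded.

```php
<?php
// Hypothetical helper: describe the raw bytes of a filename.
function inspectName($name)
{
    return array(
        'hex'        => bin2hex($name),                       // raw bytes as hex
        'valid_utf8' => mb_check_encoding($name, 'UTF-8'),    // well-formed UTF-8?
        'ascii_only' => !preg_match('/[\x80-\xFF]/', $name),  // no byte above 0x7F?
    );
}

$samples = array(
    "Espa\xC3\xB1ol.doc", // ñ encoded as UTF-8 (two bytes)
    "Espa\xF1ol.doc",     // ñ encoded as Windows-1252 (one byte)
    "Espan??ol.doc",      // literal question marks, as printed in the question
);

foreach ($samples as $sample) {
    print_r(inspectName($sample));
}
```

If the name is reported as ASCII-only and the hex dump shows `3f` where the accented character should be, the string holds literal question marks, which would be consistent with mb_detect_encoding() returning ASCII. In that case the browser encoding is not the cause.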

halfer
  • I think it's a Windows issue with the way filenames are stored in UTF-16 and how PHP file functions operate. I read somewhere that I could get the UTF-8 filenames by accessing the FAT tables. Still researching that. I tried Safari, Firefox and Chrome with UTF-8 (and other) encodings with no joy. – iPad Man Apr 05 '14 at 14:13
  • If you get a filename from Windows (e.g. using `glob` or a directory iterator) then maybe use `iconv` to convert from UTF-16 to UTF-8? – halfer Apr 05 '14 at 14:19
  • I used opendir and readdir to get the folder/file names, then tried iconv as well as both mb_convert_encoding($file,'utf-8','ISO-8859-1'); and mb_convert_encoding($file,'utf-8','utf-16'); no joy :-( – iPad Man Apr 05 '14 at 14:29
  • @iPad: OK, please update your question with your new attempts and new findings - maybe that will attract more answers. – halfer Apr 05 '14 at 14:33
  • Good idea! Also, stating the obvious: can you switch away from Windows? – halfer Apr 05 '14 at 15:25