Is there a way to read a file in a specific character encoding like UTF-16 using PHP's stream wrappers, in the same way I can read a base64-encoded file using php://filter/convert.base64-decode/resource=file.txt?
- Use this: `mb_convert_encoding($content, 'utf-16')` – Faesal Dec 12 '21 at 19:10
- Great, but I want to use it with wrappers like: php://filter/resource=somefile.php – Ali Saleh Dec 12 '21 at 19:52
1 Answer
PHP strings don't know anything about encodings, so PHP file functions essentially treat every file as a binary file.
If you know that a set of bytes should be read as UTF-16, you can convert it to some other encoding of your choice (here using UTF-8 as an example) using any of these (depending on which extensions you have installed):
// Requires ext/iconv; arguments are From, To, String
$utf8_string = iconv('UTF-16', 'UTF-8', $utf16_string);
// Requires ext/mbstring; arguments are String, To, From
$utf8_string = mb_convert_encoding($utf16_string, 'UTF-8', 'UTF-16');
// Requires ext/intl; arguments are String, To, From
$utf8_string = UConverter::transcode($utf16_string, 'UTF-8', 'UTF-16');
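If it isn't known in advance which of the three extensions is loaded, the calls above could be wrapped in a small helper. This is only a sketch: `to_utf8` is a hypothetical name, not a PHP built-in, and the order in which the extensions are tried is an arbitrary choice.

```php
<?php
// Hypothetical helper (not part of PHP): converts $bytes from the encoding
// named in $from to UTF-8, using whichever conversion extension is loaded.
function to_utf8(string $bytes, string $from): string
{
    if (function_exists('iconv')) {
        // ext/iconv; arguments are From, To, String
        $out = iconv($from, 'UTF-8', $bytes);
    } elseif (function_exists('mb_convert_encoding')) {
        // ext/mbstring; arguments are String, To, From
        $out = mb_convert_encoding($bytes, 'UTF-8', $from);
    } elseif (class_exists('UConverter')) {
        // ext/intl; arguments are String, To, From
        $out = UConverter::transcode($bytes, 'UTF-8', $from);
    } else {
        throw new RuntimeException('None of iconv, mbstring or intl is loaded');
    }

    // All three calls can return false when the conversion fails.
    if ($out === false) {
        throw new RuntimeException("Could not convert from $from to UTF-8");
    }
    return $out;
}

// "hi" written out as big-endian UTF-16 bytes
$utf8_string = to_utf8("\x00h\x00i", 'UTF-16BE');
```

The example names the byte order explicitly (UTF-16BE) so that the result does not depend on how a given extension interprets plain 'UTF-16'.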
Conversely, if you know that the string is in some particular encoding (again, using UTF-8 as an example), and want it to be UTF-16, you would put things in the opposite order:
// Requires ext/iconv; arguments are From, To, String
$utf16_string = iconv('UTF-8', 'UTF-16', $utf8_string);
// Requires ext/mbstring; arguments are String, To, From
$utf16_string = mb_convert_encoding($utf8_string, 'UTF-16', 'UTF-8');
// Requires ext/intl; arguments are String, To, From
$utf16_string = UConverter::transcode($utf8_string, 'UTF-16', 'UTF-8');
In both cases, the resulting string is just a different sequence of bytes; other PHP functions still won't "know" what it "means".
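To make the "different sequence of bytes" point concrete, here is a minimal sketch that dumps the bytes before and after conversion. It assumes ext/mbstring is loaded; the sample text and the choice of 'UTF-16BE' are illustrative (with plain 'UTF-16', the byte order and any byte order mark can vary between extensions and platforms).

```php
<?php
// Assumes ext/mbstring is loaded; the sample text "hé" is arbitrary.
$utf8_string  = "h\u{00E9}";  // bytes 68 C3 A9
$utf16_string = mb_convert_encoding($utf8_string, 'UTF-16BE', 'UTF-8');  // bytes 00 68 00 E9

$utf8_hex  = bin2hex($utf8_string);
$utf16_hex = bin2hex($utf16_string);

// strlen() counts bytes, not characters, so the same two characters
// are reported as 3 bytes in UTF-8 and 4 bytes in UTF-16.
$utf8_length  = strlen($utf8_string);
$utf16_length = strlen($utf16_string);

echo "$utf8_hex ($utf8_length bytes), $utf16_hex ($utf16_length bytes)\n";
// prints: 68c3a9 (3 bytes), 006800e9 (4 bytes)
```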
The "iconv" extension also provides a conversion filter which runs the equivalent of the iconv function as a file or stream is being read. So if you have a file which you know should be read as UTF-16, and want its contents as UTF-8, you could write:
$fp = fopen('php://filter/convert.iconv.utf-16.utf-8/resource=/path/to/utf16-file.txt', 'r');
// fread() returns up to 10 bytes; fgets($fp, 10) would stop after 9 bytes or at the first newline
$first_10_bytes_of_utf16_converted_to_utf8 = fread($fp, 10);
fclose($fp);
Or the reverse - a UTF-8 file which you want to read as UTF-16:
$fp = fopen('php://filter/convert.iconv.utf-8.utf-16/resource=/path/to/utf8-file.txt', 'r');
$first_10_bytes_of_utf8_converted_to_utf16 = fread($fp, 10);
fclose($fp);
Again, it's important to remember that PHP is working in bytes, so the fread calls above may result in corrupted text if the 10th byte isn't the last byte of an encoded character.
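One way to avoid cutting a character in half, sketched here under a few assumptions, is to read the whole file through the filter and only then cut by characters instead of bytes. The temporary file, the sample text and the explicit 'UTF-16LE' are illustrative choices; the sketch needs only ext/iconv, and `iconv_substr` is used because it counts characters in the encoding it is given.

```php
<?php
// Hypothetical demo file: write known UTF-16LE bytes to a temporary path.
$path = tempnam(sys_get_temp_dir(), 'u16');
file_put_contents($path, iconv('UTF-8', 'UTF-16LE', "na\u{00EF}ve caf\u{00E9}\n"));

// Read the whole file through the iconv filter, so $text holds UTF-8 bytes.
$text = file_get_contents(
    "php://filter/read=convert.iconv.UTF-16LE.UTF-8/resource=$path"
);

// Cut by characters rather than bytes: the first four characters are "naïv",
// which occupy five bytes in UTF-8.
$first_four_characters = iconv_substr($text, 0, 4, 'UTF-8');

unlink($path);
```

This trades memory for safety, since the whole file is held in a string; for very large files a byte-based read would still need its own handling of characters split across reads.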

- Is there a way to use one of these functions in PHP wrappers like: php://filter/resource=file.php – Ali Saleh Dec 12 '21 at 19:53
- I can use it to convert the contents to upper or lower case: php://filter/convert.base64-encode/resource=file.txt – Ali Saleh Dec 12 '21 at 19:56
- @AliSaleh I hadn't understood from your question that that is what you wanted; I have updated my answer appropriately, and also made some edits to your question to make it easier for future readers to understand. – IMSoP Dec 12 '21 at 20:45