
I'm trying to decode a UTF-16LE file to UTF-8. The problem is that I keep getting back kanji, and I don't know what might be the cause. The code in question looks as follows:

echo("before: ".$line);
$line = iconv('UTF-16LE', 'UTF-8', $line);
// $line = mb_convert_encoding($line, 'UTF-8', 'UTF-16');
echo("after: ".$line);  

where the lines are read from a file and converted individually. Here is one such file:

SŃSKA AZKA
asd ŹĆŻĆĄŚŃ
:61:020102C50,00NTRFNONREF//
PRZEODZĄCY
:86:010<00PRZYCHODZĄCY
<101900200001

When run, the program returns this.

When I replace UTF-16LE with UTF-16, the output looks a little better, but it's still wrong.

I have no idea what might be the cause of this. I am certain the file is in UTF-16LE, as I just created this test file, and when I put it into online UTF-16LE decoders it comes out fine.

  • What does `bin2hex($line)` look like? Please also include all output in text, not images. – miken32 Jun 09 '21 at 16:30
  • And have you tried `$line = mb_convert_encoding($line, 'UTF-8', 'UTF-16LE');` – miken32 Jun 09 '21 at 16:33
  • @miken32 Sorry for the images, I didn't know how to put them here without Stack trying to interpret the character codes. This is the output of bin2hex($line): `5300430153004b004100200041005a004b0041000d000a 006100730064002000790106017b01060104015a0143010d000a 003a00360031003a003000320030003100300032004300350030002c00300030004e005400520046004e004f004e005200450046002f002f000d000a 00500052005a0045004f0044005a000401430059000d000a 003a00380036003a003000310030003c0030003000500052005a005900430048004f0044005a000401430059000d000a 003c00310030003100390030003000320030003000300030003100` – Krzysztof Kaliszuk Jun 10 '21 at 13:20
  • Output of mb_convert_encoding is: `before: SCSKA AZKA after: SŃSKA AZKA before: asd y{ZC after: 愀猀搀 礀؁笁؁Ё威䌁ഁ਀ before: :61:020102C50,00NTRFNONREF// after: 㨀㘀㄀㨀 ㈀ ㄀ ㈀䌀㔀 Ⰰ  一吀刀䘀一伀一刀䔀䘀⼀⼀ഀ਀ before: PRZEODZCY after: 倀刀娀䔀伀䐀娀Ѐ䌁夀ഀ਀ before: :86:010<00PRZYCHODZCY after: 㨀㠀㘀㨀 ㄀ 㰀  倀刀娀夀䌀䠀伀䐀娀Ѐ䌁夀ഀ਀ before: <101900200001 after: 㰀㄀ ㄀㤀  ㈀    ㄀` – Krzysztof Kaliszuk Jun 10 '21 at 13:21
  • You can use the edit link to add this information to your question. Use three backticks to output preformatted code, as you did with your PHP code. – miken32 Jun 10 '21 at 15:38
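
One reading of the `bin2hex` output in the comments: every chunk after the first begins with a stray `00`. That is what you would expect if the lines are read with something like `fgets()` or `file()`, which split on the single byte `0A`. In UTF-16LE a line feed is the two bytes `0A 00`, so the trailing `00` ends up at the front of the next line and every later code unit is shifted by one byte. It would also explain why only the first line decodes correctly. A minimal sketch of one possible workaround follows: decode the whole buffer first, then split into lines. It assumes the file fits in memory. `decode_utf16le_lines` is a hypothetical helper name, not part of PHP, and the sample string is made up for illustration.

```php
<?php
// Hypothetical helper (not a PHP built-in): decode a whole UTF-16LE buffer,
// then split it into lines. Assumes the input fits in memory.
function decode_utf16le_lines(string $raw): array
{
    // Drop a UTF-16LE byte order mark (FF FE) if one is present.
    if (substr($raw, 0, 2) === "\xFF\xFE") {
        $raw = substr($raw, 2);
    }
    // Convert everything at once so no 2-byte code unit is cut in half.
    $utf8 = iconv('UTF-16LE', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException('Input is not valid UTF-16LE');
    }
    // Split only after decoding; accepts CRLF, CR, or LF and drops empty lines.
    return preg_split('/\r\n|\r|\n/', $utf8, -1, PREG_SPLIT_NO_EMPTY);
}

// Made-up sample data standing in for the file contents.
$sample = iconv('UTF-8', 'UTF-16LE', "asd \u{0179}\u{0106}\r\nPRZYCHODZ\u{0104}CY\r\n");

// Splitting the raw bytes on "\n" leaves the high byte of LF (0A 00)
// at the start of the next chunk, shifting every later code unit.
$rawChunks = explode("\n", $sample);
$secondChunkStart = bin2hex(substr($rawChunks[1], 0, 3));

// Decoding first and splitting afterwards keeps the code units aligned.
$lines = decode_utf16le_lines($sample);
```

With a real file this could be called as `decode_utf16le_lines(file_get_contents($path))`, where `$path` stands for whatever file is being read. `$secondChunkStart` shows the same leading `00` that appears in the posted hex dump.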

0 Answers