0

I wrote a PHP script which receives HTTP POST requests from a Windows client application.

The Windows client uses "WinHttpClient" for C++.

WinHttpClient takes the messages I send as wchar_t.

The problem is that I receive the messages in my PHP script, but some characters, such as the "®" sign, are shown as "Â®".

As far as I know, I cannot change the charset of my client to UTF-8 or anything like that. But maybe someone here knows how to convert the wide chars to UTF-8 that PHP can work with, because I have to persist the data into a database which only runs with UTF-8.

I tried the following, but it doesn't change anything :(

function ewchar_to_utf8($matches) {
    $ewchar = $matches[1];
    $binwchar = hexdec($ewchar);
    $wchar = chr(($binwchar >> 8) & 0xFF) . chr(($binwchar) & 0xFF);
    return iconv("unicodebig", "utf-8", $wchar);
}

function special_unicode_to_utf8($str) {
    return preg_replace_callback("/\\\u([[:xdigit:]]{4})/i", "ewchar_to_utf8", $str);
}

Maybe you have some ideas :) Thanks

Laokoon

3 Answers

3

Windows wchar_t is UTF-16LE, so try $u8str = iconv('UTF-16LE', 'UTF-8', $input);

But from what I can see on the WinHttpClient site, it has a _b_str class so you can convert to bytes - it doesn't say if that's via UTF-8, but if all else fails you can use WideCharToMultiByte() with CP_UTF8 codepage to get a suitable byte buffer to POST.
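To make the conversion concrete, here is a minimal, portable C++ sketch of what a UTF-16 to UTF-8 conversion does at the byte level. The function name `utf16_to_utf8` is made up for illustration and is not part of WinHttpClient. It takes a `std::u16string` so that it compiles outside Windows, on the assumption that the client's wchar_t buffer holds the same UTF-16 code units. Replacing unpaired surrogates with U+FFFD is a choice of this sketch, not something the answer prescribes. On Windows itself, WideCharToMultiByte() with CP_UTF8 would normally do this job.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts UTF-16 code units to a UTF-8 byte string.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        const bool high = cp >= 0xD800 && cp <= 0xDBFF;
        if (high && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a surrogate pair into one code point above U+FFFF.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            // Assumption of this sketch: unpaired surrogates become U+FFFD.
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            // ASCII stays a single byte.
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            // Two bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

With this, the "®" sign (U+00AE) becomes the two bytes 0xC2 0xAE, which is the form a UTF-8 database expects.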

Tino Didriksen
1

mb_convert_encoding() is a good PHP function for converting the entire string you get as wchar_t to UTF-8: http://php.net/manual/en/function.mb-convert-encoding.php - Use phpinfo() to make sure your PHP build supports multibyte strings (the mbstring extension).

The mbstring library can also help if you aren't sure of the encoding, via mb_detect_encoding(), or if you want to validate that a string is in a particular encoding, via mb_check_encoding().

Jack
  • My mind was blank for a second, didn't realize it wasn't a multibyte string you were getting! My solution would work within PHP once you converted from Wide Character to Multi Byte like suggested below in cpp. – Jack Jan 10 '13 at 16:52
  • Doesn't matter. Thanks anyway :-) – Laokoon Jan 11 '13 at 08:43
  • I tried it, but it doesn't work very well... wchar_t* jsonString; jsonString = ...; /*will be set*/ WideCharToMultiByte(CP_UTF8, 0, jsonString, -1, NULL, 0, NULL, NULL); Now the string is completely destroyed :-( – Laokoon Jan 11 '13 at 10:25
  • Is your wide character string null terminated? You have the cchWideChar parameter set to -1, but it can only be set to -1 if your string is null terminated. Also, it looks like you aren't writing the output of WideCharToMultiByte anywhere. Set up a buffer to accept the output of the function, then set lpMultiByteStr to the buffer location and cbMultiByte to the size of the buffer. The rest of your parameters in WideCharToMultiByte look good. – Jack Jan 11 '13 at 18:41
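The buffer handling described in the last comment could look roughly like this. It is a Windows-only sketch: the wrapper name `wide_to_utf8` is hypothetical, and it assumes the JSON text is already available as a `std::wstring`. The first call to WideCharToMultiByte passes no output buffer and only asks for the required size; the second call writes into a buffer of that size. Because an explicit length is passed instead of -1, the input does not have to be null terminated.

```cpp
// Windows-only sketch; on other platforms this compiles to nothing.
#ifdef _WIN32
#include <windows.h>
#include <string>

// Hypothetical wrapper around WideCharToMultiByte with CP_UTF8.
std::string wide_to_utf8(const std::wstring& wide) {
    if (wide.empty()) {
        return std::string();
    }
    const int wideLen = static_cast<int>(wide.size());

    // First call: no output buffer, returns the number of bytes needed.
    const int needed = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                                           nullptr, 0, nullptr, nullptr);
    if (needed <= 0) {
        return std::string();  // conversion failed
    }

    // Second call: write the UTF-8 bytes into a buffer of that size.
    std::string utf8(static_cast<std::size_t>(needed), '\0');
    const int written = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLen,
                                            &utf8[0], needed, nullptr, nullptr);
    if (written != needed) {
        return std::string();  // conversion failed
    }
    return utf8;
}
#endif
```

The returned `std::string` holds raw UTF-8 bytes, so it is this buffer, not the original wchar_t string, that would go into the POST body.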
1

The problem is that I receive the messages in my PHP script, but some characters, such as the "®" sign, are shown as "Â®".

That means you already have UTF-8... misinterpreted in ISO-8859-1/Windows-1252.
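Here is a small C++ sketch of how that misinterpretation comes about, assuming the client really does send UTF-8. The helper `misread_as_latin1` is hypothetical: it treats every byte as one ISO-8859-1 character and re-encodes the result as UTF-8, which is roughly what a viewer does when it is told the wrong charset.

```cpp
#include <string>

// Hypothetical helper: interprets each byte as an ISO-8859-1 character
// and returns the resulting text encoded as UTF-8.
std::string misread_as_latin1(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes) {
        if (b < 0x80) {
            // ASCII bytes look the same in both encodings.
            out += static_cast<char>(b);
        } else {
            // A Latin-1 byte 0x80..0xFF is the code point U+0080..U+00FF,
            // which takes two bytes in UTF-8.
            out += static_cast<char>(0xC0 | (b >> 6));
            out += static_cast<char>(0x80 | (b & 0x3F));
        }
    }
    return out;
}
```

Feeding it the UTF-8 bytes of "®" (0xC2 0xAE) gives the bytes for "Â®", the same garbage as in the question. ASCII text passes through unchanged, which is why only some characters look broken.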

If it's like this

<?php

echo $rsymbol; // Comes out as Â®

Then all you need to change is:

<?php
header("Content-Type: text/html; charset=UTF-8");
echo $rsymbol; //Comes out as ®
Esailija