
I am passing a correctly encoded UTF-8 string to a PHP soapCall, and it fails, reporting that the string is not a valid UTF-8 string.

$source = “test”-@– '— " ½ ℗ ↄ⃝ it′s ‘single’
echo mb_detect_encoding( $source, "auto" ); // prints UTF-8

hex = E2 80 9C 74 65 73 74 E2 80 9D 2D 40 E2 80 93 20 27 E2 80 94 20 22 20 C2 BD 20 20 E2 84 97 20 20 E2 86 84 E2 83 9D 20 20 69 74 E2 80 B2 73 20 E2 80 98 73 69 6E 67 6C 65 E2 80 99
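
A strict validity check can complement mb_detect_encoding, which only guesses among candidate encodings. The following is a minimal sketch, not the asker's code: the helper name isValidUtf8 is hypothetical, and it relies on PCRE's /u modifier rejecting malformed UTF-8. The second check shows that a string beginning at the \x9c byte (as in the error message) is no longer valid, which is what a string cut or altered inside a multibyte character would look like.

```php
<?php
// Hypothetical helper: strict UTF-8 validation via PCRE's /u modifier.
// preg_match() returns false (not 0) when the subject is malformed UTF-8.
function isValidUtf8(string $bytes): bool
{
    return preg_match('//u', $bytes) === 1;
}

// Hex dump copied from the question.
$hex = 'E2 80 9C 74 65 73 74 E2 80 9D 2D 40 E2 80 93 20 27 E2 80 94 20 22 20 '
     . 'C2 BD 20 20 E2 84 97 20 20 E2 86 84 E2 83 9D 20 20 69 74 E2 80 B2 73 '
     . '20 E2 80 98 73 69 6E 67 6C 65 E2 80 99';
$source = hex2bin(str_replace(' ', '', $hex));

// The full byte sequence is well-formed UTF-8.
var_dump(isValidUtf8($source));            // bool(true)

// Dropping the first two bytes leaves a string that starts with the
// continuation byte \x9c, which cannot begin a UTF-8 character.
var_dump(isValidUtf8(substr($source, 2))); // bool(false)
```

Running a check like this immediately before the SOAP call, on the exact variable being passed, helps narrow down whether the bytes are changed somewhere between the page output and the call.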

Error: processSalesOrder interupted because of SOAP-ERROR: Encoding: string '\x9c...' is not a valid utf-8 string

returns: �TEST�-@� '� " ½ � �� IT�S �SINGLE�

I added 'encoding' => 'UTF-8' to both the SoapClient init options and the soapCall options.
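
For reference, setting the option on the client could look like the sketch below. It is an illustration under assumptions: the function name makeUtf8SoapClient is hypothetical, the location and namespace are placeholders supplied by the caller, and non-WSDL mode is used only so that the client can be constructed without fetching a WSDL. The 'encoding' option describes the encoding of the strings PHP hands to the client; the request itself is sent as UTF-8.

```php
<?php
// Sketch only: $location and $uri are placeholders chosen by the caller.
// Non-WSDL mode (first argument null) requires both 'location' and 'uri'.
function makeUtf8SoapClient(string $location, string $uri): SoapClient
{
    return new SoapClient(null, [
        'location'   => $location,
        'uri'        => $uri,
        'encoding'   => 'UTF-8', // encoding of the strings passed in by PHP code
        'exceptions' => true,    // throw SoapFault on errors
        'trace'      => true,    // keep the last request for __getLastRequest()
    ]);
}
```

With 'trace' enabled, calling $client->__getLastRequest() after a request shows the XML that was actually sent, which can help confirm whether the payload still contains the expected bytes.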

The MariaDB database, table, and column are all utf8mb4, with collation utf8mb4_unicode_ci.

php.ini default_charset = "UTF-8"

The database connection is set to utf8mb4, and the HTML page is set to charset=utf8.

The string displays correctly on the web page. Encoding detection shows UTF-8 right before the string is sent to soapCall, but soapCall rejects it.

What am I missing?

EDIT: I also tried the suggestion from "PHP DOMDocument::save() saves as ASCII instead of UTF-8", but it didn't help.

$xmlDomDoc->encoding = 'UTF-8';

I finally resolved the issue by changing the [iconv] and [mbstring] settings in php.ini to UTF-8:

iconv.input_encoding = UTF-8
iconv.internal_encoding = UTF-8
iconv.output_encoding = UTF-8
mbstring.internal_encoding = UTF-8
mbstring.http_input = UTF-8
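
To confirm which values PHP actually picked up, the effective settings can be read at runtime. This is a minimal sketch; the variable names are illustrative, and a setting reported as "(not available)" simply means ini_get() returned false because the extension that owns it is not loaded. The web server and the command line often load different php.ini files, so it is worth running this in the same environment that makes the SOAP call.

```php
<?php
// Settings mentioned in the question, plus default_charset.
$names = [
    'default_charset',
    'iconv.input_encoding',
    'iconv.internal_encoding',
    'iconv.output_encoding',
    'mbstring.internal_encoding',
    'mbstring.http_input',
];

$effective = [];
foreach ($names as $name) {
    $value = ini_get($name); // false when the setting does not exist
    $effective[$name] = ($value === false) ? '(not available)' : $value;
}

// Path of the php.ini in use, or false if none was loaded.
$iniFile = php_ini_loaded_file();

echo 'Loaded php.ini: ', ($iniFile === false ? '(none)' : $iniFile), PHP_EOL;
foreach ($effective as $name => $value) {
    echo $name, ' = ', $value, PHP_EOL;
}
```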

stuartz

1 Answer


When trying to use utf8/utf8mb4, if you see black diamonds with question marks (�), one of these cases applies:

Case 1 (original bytes were not utf8):

  • The bytes to be stored are not encoded as utf8. Fix this.
  • The connection (or SET NAMES) for the INSERT and the SELECT were not utf8/utf8mb4. Fix this.
  • Also, check that the column in the database is CHARACTER SET utf8 (or utf8mb4).

Case 2 (original bytes were utf8):

  • The connection (or SET NAMES) for the SELECT was not utf8/utf8mb4. Fix this.
  • Also, check that the column in the database is CHARACTER SET utf8 (or utf8mb4).
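
As one way to make sure the connection asks for utf8mb4, the charset can be named in the PDO DSN. This is a sketch under assumptions: the helper name buildUtf8mb4Dsn, the host, and the database name are placeholders, and the commented usage lines need a reachable server and real credentials.

```php
<?php
// Hypothetical helper: builds a PDO MySQL/MariaDB DSN that requests utf8mb4
// for the connection, which has the same effect as issuing SET NAMES utf8mb4.
function buildUtf8mb4Dsn(string $host, string $dbName): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbName);
}

$dsn = buildUtf8mb4Dsn('localhost', 'shop'); // placeholder host and database
echo $dsn, PHP_EOL;

// Usage (not executed here):
// $pdo = new PDO($dsn, $user, $password, [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// $row = $pdo->query("SHOW VARIABLES LIKE 'character_set_connection'")->fetch();
```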

Black diamonds occur only when the browser is told to treat the page as UTF-8 (for example with <meta charset=UTF-8>) and the bytes it receives are not valid UTF-8.
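
The difference between the two cases can be seen on a single character. The sketch below assumes the non-UTF-8 bytes are ISO-8859-1, which is only one possibility; the variable names are illustrative.

```php
<?php
// "½" as a single ISO-8859-1 byte. On its own it is not valid UTF-8, so a
// browser reading the page as UTF-8 would render it as a black diamond.
$latin1Half = "\xBD";

// Converting from the real source encoding yields the two-byte UTF-8 form.
$utf8Half = iconv('ISO-8859-1', 'UTF-8', $latin1Half);

var_dump(preg_match('//u', $latin1Half) === 1); // bool(false): malformed UTF-8
var_dump(preg_match('//u', $utf8Half) === 1);   // bool(true)
var_dump(bin2hex($utf8Half));                   // string(4) "c2bd"
```

The converted bytes C2 BD match the ones shown for ½ in the question's hex dump.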

Rick James