
I'm consuming an XML document (I get it from a REST service). At one point I decode a CDATA field like this:

$version_doc = $this->getSimpleXml($uri);

if ($version_doc != false) {
    $equipment = utf8_decode((string) $version_doc->equipment);
}

This is an example of the XML field "equipment":

<![CDATA[ABARTH: ABS, llantas de aleación de 16'', eléctrico,llantas de aleación de 17" color antracita.]]>

After setting the $equipment variable, if I save it to MySQL (a table with the latin1 charset and latin1_spanish_ci collation, via Doctrine 1.2), the value stored in the column for that row is:

ABS, llantas de aleación de 16'', eléctrico,llantas de aleación de 17?? color antracita.

Why do I keep getting the ?? symbols?

I'm on PHP 5.3 and MySQL 5, running the server in a MAMP environment (Mac).

R01010010
  • Well... UTF-8 can encode every Unicode character (well over 100,000). ISO-8859-1 can encode only a couple of hundred. – Álvaro González Mar 27 '13 at 17:35
  • So a solution could be to change the \uwhatever code to a latin1 character before the decode? – R01010010 Mar 27 '13 at 17:39
  • You didn't understand me. Imagine your XML contains the `€` symbol. It cannot be encoded as [Latin 1](http://en.wikipedia.org/wiki/ISO/IEC_8859-1#Codepage_layout). You have to drop it, there's no other solution, unless you're willing to switch your database and application to UTF-8 (which is indeed the best long-term solution). – Álvaro González Mar 27 '13 at 17:42
  • I know. What I meant is that I could replace a € with an e or something similar, not identical. I'm trying this and getting good results for now. I also thought about changing everything to UTF-8, which would be nicer... – R01010010 Mar 27 '13 at 17:51
  • Álvaro, you were right, that was the problem... I did the str_replace calls before the utf8_decode and now it works fine! Stack Overflow should add a feature to buy a beer for people like you. Thank you very much! – R01010010 Mar 27 '13 at 17:56
  • @ÁlvaroG.Vicario [MySQL Latin1 means Windows-1252](http://dev.mysql.com/doc/refman/5.0/en/charset-mysql.html), just like ISO-8859-1 means Windows-1252 in browsers – Esailija Mar 27 '13 at 19:00
  • @Esailija - Funny. I knew they were commonly confused but I had never thought it could slip into MySQL internals (though errors of that kind are to be expected when we talk about MySQL, where even UTF-8 is not actually UTF-8). – Álvaro González Mar 29 '13 at 11:23
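
The two workarounds discussed in the comments can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the helper names `toLatin1` and `toCp1252` are hypothetical, the replacement map is deliberately incomplete, and it assumes the character after "17" in the feed is really a typographic quote (U+201D) rather than a plain ASCII `"`. It uses `iconv()` instead of `utf8_decode()` so that a character with no equivalent makes the conversion fail visibly instead of silently turning into `?`.

```php
<?php
// Hypothetical helper (not from the original post): fold characters that
// ISO-8859-1 cannot represent into close ASCII equivalents, then convert.
// Returns false if iconv still meets a character it cannot map.
function toLatin1($utf8)
{
    // Illustrative, incomplete map; extend it with whatever the feed contains.
    $map = array(
        "\xE2\x80\x9C" => '"',   // U+201C left double quotation mark
        "\xE2\x80\x9D" => '"',   // U+201D right double quotation mark
        "\xE2\x80\x98" => "'",   // U+2018 left single quotation mark
        "\xE2\x80\x99" => "'",   // U+2019 right single quotation mark
        "\xE2\x82\xAC" => 'EUR', // U+20AC euro sign
    );
    $folded = strtr($utf8, $map);

    return iconv('UTF-8', 'ISO-8859-1', $folded);
}

// Hypothetical alternative: MySQL's latin1 is effectively Windows-1252, which
// does contain the curly quotes and the euro sign, so nothing has to be folded.
function toCp1252($utf8)
{
    return iconv('UTF-8', 'CP1252', $utf8);
}

// Sample value with the assumed typographic quote after "17".
$equipment = "llantas de aleaci\xC3\xB3n de 17\xE2\x80\x9D color antracita.";

echo bin2hex(toLatin1($equipment)), "\n";
echo bin2hex(toCp1252($equipment)), "\n";
```

The second variant only helps if the MySQL connection charset is latin1 as well, so that the bytes are read as Windows-1252 on the way in. Switching the table, the connection and the application to UTF-8 remains the cleaner long-term fix mentioned above.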
