INSERT INTO words (word, meaning) VALUES ('Цветок', 'Flower')

When I execute this SQL statement from an SQL client (DbVisualizer), the values in the table look exactly like you see them here in the statement, i.e. Цветок is not encoded.

When I execute it from PHP on Tomcat 8 / Java version "1.8.0_101" (Quercus 4.0.39), the values in the table are encoded as if I had run PHP's urlencode() on them. So Flower is unchanged, but Цветок is not: viewed through the SQL client, it now appears in the table as ЦвеÑок
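
For comparison, here is a minimal Java sketch of two transformations that could garble the word. It assumes (this is a guess, not something confirmed in this setup) that the stored value is the UTF-8 bytes of the word re-read as ISO-8859-1, one character per byte. The class and method names are hypothetical and exist only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {
    // Assumed failure mode: UTF-8 bytes re-read as ISO-8859-1 (one char per byte).
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses misdecode(): recovers the original text from the garbled form.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // Percent-encodes the UTF-8 bytes, comparable to what PHP's urlencode() produces.
    static String percentEncode(String s) throws UnsupportedEncodingException {
        return URLEncoder.encode(s, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // "Цветок" written with Unicode escapes so the source file encoding does not matter.
        String word = "\u0426\u0432\u0435\u0442\u043e\u043a";
        String garbled = misdecode(word);

        System.out.println(garbled.length());             // 12: two chars per Cyrillic letter
        System.out.println(repair(garbled).equals(word)); // true
        System.out.println(percentEncode(word));          // %D0%A6%D0%B2%D0%B5%D1%82%D0%BE%D0%BA
        System.out.println(percentEncode("Flower"));      // Flower
    }
}
```

If the table held percent-encoded text, it would contain literal `%D0%A6...` sequences. A value made of accented Latin characters, twice as long as the original word, is more consistent with the mis-decoding case, and if that assumption holds, `repair` would recover the original word from it.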

Why is there a difference between running the SQL statement from the client and from PHP?

How can I insert this data from PHP so that it will be saved in the database unchanged?

UPDATE

Here is my Tomcat context.xml HSQLDB Resource:

  <Resource name="jdbc/mydb" auth="Container" type="javax.sql.DataSource"
               maxActive="100" maxIdle="30" maxWait="10000"
               username="me" password="kjsfhsjhfsd" driverClassName="org.hsqldb.jdbc.JDBCDriver"
               url="jdbc:hsqldb:hsql://localhost:9001/mydb?characterEncoding=UTF-8"/>

The characterEncoding=UTF-8 parameter is actually MySQL-specific (so including it breaks this resource!). What is the HSQLDB equivalent? I could not find one.

I also tried to set the php.ini with unicode.semantics=on as explained here:

http://www.caucho.com/resin-3.1/doc/quercus.xtp#php.ini

http://www.caucho.com/resin-3.1/doc/quercus.xtp#Internationalization-16-bitunicode

http://www.caucho.com/resin-3.1/doc/quercus.xtp#encoding

However, it made no difference.

rapt
  • What is the collation of the field? – Peter May 17 '17 at 18:42
  • @Peter I have not changed the default settings. – rapt May 17 '17 at 18:57
  • Look up the code page of your PHP source file. If it is Unicode, then what you see is double-byte characters displayed as single-byte characters, which would explain the doubled length of the word as it appears in the table – Siyon DP May 17 '17 at 20:06
  • @TSion.D.P `$ file -bi script.php` gives `text/x-php; charset=utf-8`, and I read that HSQLDB also uses this encoding by default http://stackoverflow.com/a/8696754/784980 so this looks like the wrong direction – rapt May 17 '17 at 21:11
  • Hmmm.. I just tried this on the web site I'm working on, typing Cyrillic text in a form, and it works. Here is my setup: MariaDB 10.1, fields in utf8mb4_unicode_ci. My PHP pages have . I run PHP 7.0.19. I used two HTML fields: one input, the other textarea. Note that in the destination page, before the INSERT, I perform a mysqli_real_escape_string on the values. When I look in the db, the data is in Cyrillic and the display is OK (no processing before displaying) – Peter May 17 '17 at 21:25
  • I forgot one point: after my mysqli_connect, I do a @mysqli_set_charset($handle,'utf8'); Without this I lose the Portuguese special characters I need for my web site, and the Cyrillic doesn't work. – Peter May 17 '17 at 21:46
  • @Peter my situation is different. I run PHP over Tomcat (with Quercus). And I use HSQLDB. It looks like the problem is how Tomcat speaks to HSQLDB, since directly from the client it works. – rapt May 17 '17 at 23:07
