
Why does a Unicode string lose its encoding when it is sent to an ODBC interface via the DBI gem?

We wrote a Sinatra application that connects via DBI to an SAP HANA database. The database itself shouldn't be the problem, since I can execute UTF-8 statements manually within SAP HANA Studio without issues.

Simplified, my code looks like this:

# encoding: UTF-8
require 'dbi'

target = "Sch\u00f6nefeld"  # sample of sanitized user input. Definitely UTF-8.
# printed version: target='Schönefeld'

dbh = DBI.connect 'DBI:ODBC:DB_NAME', 'username', 'password'
sth = dbh.prepare 'INSERT INTO tab VALUES (0, ?)'

sth.execute(target)
# [...]
sth.finish
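
For completeness, the string really is UTF-8 on the Ruby side; this can be checked with plain Ruby (nothing DBI-specific), e.g. in IRB:

target.encoding          # => #<Encoding:UTF-8>
target.valid_encoding?   # => true
target.bytes             # the "ö" is the two UTF-8 bytes 0xC3 0xB6, i.e. 195, 182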

Now when I check the database, the new entry contains two random garbage characters in place of the ö:

| 0 | Sch$#nefeld |

If I print the statement to my command line before executing and then run it manually, the database contains exactly what I want:

| 0 | Schönefeld |

Is there a way to ensure that the string sent to the DB stays encoded in UTF-8?
Should I change the encoding (e.g. to UTF-16)?
Could this be an issue of the DBI gem or the ODBC driver?
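
For reference, the re-encoding I have in mind would look roughly like the lines below. This is only a sketch of the idea; I don't know which encoding (if any) the ODBC layer actually expects, so none of this is a confirmed fix:

utf16_target = target.encode('UTF-16LE')  # plain String#encode before binding the parameter
sth.execute(utf16_target)                 # untested guess: maybe the driver expects wide characters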

friedrich
  • Did you solve this? I have a similar problem when I get values from the DB. I can't figure out how to change the way DBI handles the encoding. – izaban Jul 08 '15 at 14:57
  • Sorry, I didn't find a good/general way. My solution was to work with IDs only. That way, I deferred the Unicode part until the actual insert (the INSERT statement contained a SELECT), roughly as sketched below. This is only possible if you know all possible inputs. – friedrich Jul 09 '15 at 18:38
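
A rough sketch of the ID-based workaround from the last comment, assuming a hypothetical lookup table places(id, name) that already holds the correctly encoded names (inserted once via HANA Studio); the Ruby side then only sends an integer across the DBI/ODBC boundary:

# 'places' and its columns are made-up names for illustration.
place_id = 42  # resolved from the sanitized user input beforehand
sth = dbh.prepare 'INSERT INTO tab SELECT 0, name FROM places WHERE id = ?'
sth.execute(place_id)  # only the numeric id is bound; no Unicode crosses DBI/ODBC
sth.finish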

0 Answers