0

I am trying to insert two byte strings into a HANA table with VARBINARY columns, but I keep getting a syntax error, e.g.

SAP DBTech JDBC: [257]: sql syntax error: incorrect syntax near "G\xa2ac\xa0av\xf6": line 1 col 98 (at pos 98)

My two byte strings look like this:

STRING1 = b'G\xa2ac\xa0av\xf6'  
type(STRING1) == <class 'bytes'>
STRING2 = b'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6' 
type(STRING2) == <class 'bytes'>

My query to insert the values looks like this:

INSERT INTO testTable VALUES(
CAST(b'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6' AS VARBINARY),
CAST(b'G\xa2ac\xa0av\xf6' AS VARBINARY));

I've also tried to do a query how the documentation suggests:

INSERT INTO testTable VALUES(
CAST(x'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6' AS VARBINARY),
CAST(x'G\xa2ac\xa0av\xf6' AS VARBINARY));

As well as:

INSERT INTO testTable VALUES(
b'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6',
b'G\xa2ac\xa0av\xf6');

But all of these give me some syntax error. Any help would be greatly appreciated. Thanks!

Sandra Rossi
  • 11,934
  • 5
  • 22
  • 48
AOyung
  • 13
  • 4
  • What is the actual error? – Dai Jul 26 '17 at 22:29
  • @Dai The error is like: SAP DBTech JDBC: [257]: sql syntax error: incorrect syntax near "G\xa2ac\xa0av\xf6": line 1 col 98 (at pos 98) – AOyung Jul 26 '17 at 23:18
  • I'm not an SAP HANA expect, but I think you need to escape the backslashes in your filename string. – Dai Jul 27 '17 at 01:49

2 Answers2

0

The problem here lies with your STRING1 value ( b'G\xa2ac\xa0av\xf6' ). It is not a valid hexadecimal string that can represent a binary value in SAP HANA. That's why any type casting will fail here. Instead, it seems that it is actually a string and some of the characters are represented hexadecimal values (UNICODE codepoints maybe?).
At least that's what I make of the \x escpace sequence in the string.

So, you can do different things now.

  1. you can store the string as-is with the escape sequences in the VARBINARY column. To do that, you can use to_binary('G\xa2ac\xa0av\xf6') in the insert statement.
  2. you can convert this string into a valid UNICODE string in your application code and store the data in an NVARCHAR column instead.
Lars Br.
  • 9,949
  • 2
  • 15
  • 29
  • Hey thanks! That helped. If I could ask a follow up ... When I now try to query my data using PyHDB and this select statement: "SELECT col2 FROM testTable WHERE col1=x'708ca7fbb701799bb387f2e50deaca402e8502abe229f705693d2d4f350e1ad6'; " The data I fetch comes out in this form: b'\\xa2ac\\xa0av\\xf6'. Note that the backslashes are escaped. The data type is bytes and I'd like to keep that data type but is it possible to remove the double backslash and just keep it as 1 backslash? Thanks! – AOyung Jul 27 '17 at 17:02
0

As far as I am aware HANA does not understand byte encode like python so I think there is the mix up if you use that representation within the sql console. So in python when printing b'G\xa2ac\xa0av\xf6' a byte that is non presentable in ascii (your local encoding?) is prefixed with \x.

If you want to do that you might first want to convert that to a hex representation in python

>>> import binascii
>>> binascii.hexlify(b'\xa2ac\xa0av\xf6')
b'47a26163a06176f6'

This will give you a uniform representation of your bytearray in hex which you can now use in your SQL console (as HANA Studio and the likes):

INSERT INTO TestTable VALUES(x'47a26163a06176f6');
-- OR
INSERT INTO TestTable VALUES(HEXTOBIN('47a26163a06176f6'));

Note that the prefix b changes to x in the first case to indicate HANA that it should consider this as binary data in hexadecimal representation.

To insert the value from Python 2 as prepared statement:

>>> cursor.execute("INSERT INTO TestTable Values(?)", \
        parameters=[binascii.hexlify(b'G\xa2ac\xa0av\xf6')])

PyHDB seems to expect a string to cope correctly, but in Python 3 hexlify will yield a byte array so you need to turn the result into a string again

>>> param = str(binascii.hexlify(b'G\xa2ac\xa0av\xf6'), 'ascii')
>>> cursor.execute("INSERT INTO TestTable Values(?)", parameters=[param])

I guess this could be considered a bug in PyHDB or at least an inconsistency. Just for completeness sake, in SAP's dbapi client there is a Binary class to wrap bytearrays for this purpose.

Now query that with your client

>>> import pyhdb
>>> con = pyhdb.connect(....)
>>> cursor = con.cursor()
>>> cursor.execute('SELECT * FROM TestTable')
>>> cursor.fetchall()
[(b'G\xa2ac\xa0av\xf6',)]

To sum the entire thing up: b'G\xa2ac\xa0av\xf6' is not a representation HANA understands as such when using it in a SQL statement. We need to find a common ground, for that we converted the bytearray to a hex representation (hexlify) and told HANA to handle it as such (x-prefix / HEXTOBIN).

As Lars Br. mentioned, if those are indeed unicode literals you might want to consider NVARCHAR as datatype.

Goldfishslayer
  • 430
  • 4
  • 10