0

When I select bytea fields filled with binary data in my Postgres database using the HDBC-postgreSQL driver (version 2.3.2.3), they come out as:

SqlByteString "\x<hex representation of binary data>"

That is, it returns a ByteString which contains a string containing \x followed by the hex representation of my binary data. This is inconvenient, dreadfully inefficient and basically makes no sense to me.

Is there any reason why it doesn't return a SqlByteString containing a byte string with the actual binary data inside it? Is there something I am missing, or how do I configure the driver to do that?

thanks

Thomas
  • 3,321
  • 1
  • 21
  • 44
  • 1
    What you are seeing is an actual byte string, raw bytes and all. It is simply the case that when a bytestring is printed, it is printed as a "string" and non-printing and non-ascii characters are escaped. In short, it is the result of calling `show` on the `ByteString`. If the `ByteString` actually had a quoted string inside it, it would look like `"\"\\x1234\""`. – user2407038 Mar 17 '16 at 10:54
  • 1
    @user2407038 I suspected that at first as well but actually `fromSql` returns me a bytestring of length 34 (2 + 16 * 2) with that literal content :( – Thomas Mar 18 '16 at 03:57

2 Answers2

1

This is a longtime known issue with this library. See this bug for example.

The broader problem is that getting raw bytes requires a fair amount of cleverness that the postgres api doesn't make obvious. You have to call your entire query with binary rather than text output (which is arguably better anyway, but would require a rewrite of that portion of the binding layer).

You can see where pqexecparams is called and note it is called with a last parameter of 0, which by the postgres docs means everything comes back in text. And for postgres that means this funny hex representation you see.

If that argument were swapped to 1, then things could be done more efficiently (including getting raw binary for bytea fields) but Statement.hsc would have to be pervasively rewritten to deserialize those binary values.

This is one of those things where its slightly irritating to lots of people, but nobody has yet been sufficiently motivated to go rewrite and debug the whole thing. But, of course, somebody really should! :-)

sclv
  • 38,665
  • 7
  • 99
  • 204
  • Thanks for the answer... a bit disappointing, but it's okay, I can see the difficulty if the entire query needs to be either text or binary. I guess I will just make a utility function to convert as needed and deal with the small performance hit until this is fixed. – Thomas Mar 18 '16 at 07:42
1

I've solved this going to postgresql.conf and adding:

bytea_output = 'escape' #by default, it's 'hex'

What this dose is that it can now retrieve data exactly the way you inserted it because the output is no longer encoded.

I know it's late, I'm also new to Stack Overflow, but I taught that this information is important.

Patrix
  • 61
  • 1
  • 6