I have a database storing big blobs, some of them over 1 GB, behind pg_bouncer with transaction pooling. I frequently get query_wait_timeout errors from pg_bouncer, and I've found the root cause to be the time it takes to read large blobs. Simply put, all connections are busy reading blobs, which takes significantly longer than query_wait_timeout, so other transactions are kept waiting in the meantime.
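For context, the relevant part of my pgbouncer configuration looks roughly like this (values illustrative, not my exact settings):

```ini
[pgbouncer]
; server connections are released back to the pool at transaction end
pool_mode = transaction
; queries waiting longer than this for a server connection are cancelled
query_wait_timeout = 120
```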
I came up with this solution of reading by chunks in order to give other (shorter) transactions a chance to execute and resume reading the BLOB later. The code is really simple:
final ByteArrayOutputStream baos = new ByteArrayOutputStream();
long pos = 1;   // JDBC Blob offsets are 1-based
byte[] chunk;
do {
    tx = beginTransaction();
    chunk = blob.getBytes(pos, blobReadChunkLength);
    if (chunk.length > 0)
        baos.write(chunk);
    tx.commit();
    pos += chunk.length;
} while (chunk.length > 0);
(Yes, there's Hibernate in the middle, but Hibernate simply returns a JDBC Blob instance, so it's as if it weren't there.)
The problem is that if the loop above runs more than once, so that a transaction is committed (or rolled back) and a new one is opened, then getBytes() throws an exception:
Caused by: org.postgresql.util.PSQLException: ERROR: invalid large-object descriptor: 0
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2510)
at org.postgresql.core.v3.QueryExecutorImpl.receiveFastpathResult(QueryExecutorImpl.java:805)
at org.postgresql.core.v3.QueryExecutorImpl.fastpathCall(QueryExecutorImpl.java:592)
at org.postgresql.fastpath.Fastpath.fastpath(Fastpath.java:105)
at org.postgresql.fastpath.Fastpath.fastpath(Fastpath.java:146)
at org.postgresql.largeobject.LargeObject.seek(LargeObject.java:258)
at org.postgresql.jdbc.AbstractBlobClob.getBytes(AbstractBlobClob.java:108)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.hibernate.engine.jdbc.SerializableBlobProxy.invoke(SerializableBlobProxy.java:72)
at com.sun.proxy.$Proxy27.getBytes(Unknown Source)
This exception doesn't happen if the blob is small enough to be read in a single chunk (or if blobReadChunkLength is large enough, but that defeats the point of this implementation).
I'd like to know: what's the correct way of doing this? Calling free() on the blob before committing the transaction only made things worse.
Is it really impossible to use the same Blob instance in several transactions?
Alternatively, I think I could easily implement this on top of PostgreSQL functions such as lo_read(), at the expense of portability. But I need to get the loid for that. Is it possible to get the loid in JDBC or in Hibernate?
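Along those lines, here's a sketch of what I have in mind (untested; it assumes PostgreSQL 9.4+ for the server-side lo_get() function, autocommit on the connection so each call runs in its own short transaction, and placeholder table/column names):

```java
import java.io.ByteArrayOutputStream;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ChunkedLoRead {

    /** Source of chunks: returns up to len bytes starting at pos (empty at end). */
    public interface ChunkSource {
        byte[] read(long pos, int len) throws Exception;
    }

    /** Accumulates chunks until a short or empty read signals end-of-object. */
    public static byte[] readAll(ChunkSource source, int chunkSize) throws Exception {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        long pos = 0;
        while (true) {
            byte[] chunk = source.read(pos, chunkSize);
            baos.write(chunk, 0, chunk.length);
            pos += chunk.length;
            if (chunk.length < chunkSize) {
                return baos.toByteArray();   // short read: nothing left
            }
        }
    }

    /**
     * Reads the large object identified by loid using server-side lo_get()
     * (available since PostgreSQL 9.4). Each chunk is fetched by a standalone
     * statement, so with autocommit no large-object descriptor is ever held
     * across transactions, and pgbouncer can interleave other work between
     * chunks. The loid itself would be the value of the oid column, e.g.
     * "SELECT data FROM my_table WHERE id = ?" read with rs.getLong(1)
     * ("my_table"/"data" are placeholder names).
     */
    public static byte[] readLargeObject(Connection conn, long loid, int chunkSize)
            throws Exception {
        return readAll((pos, len) -> {
            try (PreparedStatement ps = conn.prepareStatement("SELECT lo_get(?, ?, ?)")) {
                ps.setLong(1, loid);
                ps.setLong(2, pos);
                ps.setInt(3, len);
                try (ResultSet rs = ps.executeQuery()) {
                    rs.next();
                    byte[] chunk = rs.getBytes(1);
                    return chunk != null ? chunk : new byte[0];
                }
            }
        }, chunkSize);
    }
}
```

The chunk-accumulation loop is separated from the database access so the termination logic (stop on a short read) can be exercised on its own; whether lo_get() is acceptable portability-wise is exactly the trade-off I mentioned above.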