8

I want to save large attachments (500Mb, maybe, if possible, even > 2Gb) in a database. I know the pros and cons of doing so are discussed very frequently, but that's not the focus of my question.

The conventional way of storing Blob fields in EJB3 with JPA is to use a code as following:

@Lob
private byte[] data;

This becomes a problem when handling huge data sources, because the whole byte array is saved in memory.

I tried to change it to Blob:

@Lob
private Blob data;

But this results in the same problem. When calling

// session: Hibernate session.
// Actually, I'd like to stay with JPA's abstraction and not
// reference Hibernate directly, if possible
Blob data = session.createBlob(inputStream, lengthOfStream);

the createBlob method creates a byteArray from the inputStream.

Because of the ORM mapping, I also wonder how to handle the insertion of the data. One idea was to make an entity variable

@Lob
private byte[] data;

That I will never use. This way, the database schema get's build. But because the @Lob annotation is lazy, it will not bloat my memory.

And then write

entityManager.persist(dataObject);
// The following lines are _completely_ imaginatory!!
query.prepare("update DataObject d set d.data = :dataInputStream where d = :dataObject");
query.setProperty("dataObject", dataObject);
query.setProperty("dataInputStream", someDataInputStream);

One solution I stumbled across looks nice, but doesn't use JPA: Grooviest way to store a BLOB in a database?

Does anybody have an advice on how to do this?

Community
  • 1
  • 1
Thomas Walther
  • 526
  • 1
  • 6
  • 15
  • Is there any reason for you to use a blob in stead of a file path? – siebz0r Jun 06 '12 at 11:51
  • Yes, portability. The idea is to store the data in the database and cache it on the filesystem. This way, only the DB and the jar have to be copied to a new system - the files will get cached from the DB to the filesystem automatically. – Thomas Walther Jun 06 '12 at 15:37
  • I don't think this is an efficient way of storage. If the path of storage is configurable the path in the database only needs to be a postfix of the initial path. The downside is that you need to move the db, jar and files. But I think the pros outweigh the cons here. – siebz0r Jun 06 '12 at 15:52
  • did you ever find a way for your byte[] field? i'm having the same problem. i want to have JPA generate the schema, but not use the field ever. and iirc even with the @Lob annotation JPA retrieved the data from the field – tobi.b Jun 15 '12 at 09:52
  • Already thought of custom types? You could create a type that accepts an inputstream for writing into the db and an outputstream for reading from database. You can write yourself such a converter based on the specs of JPA 2.1 and then a simple wrapper for the orm implementation so you can register that converter. – Christian Beikov Mar 03 '13 at 16:18

1 Answers1

1

@Lob annotation can be applied to a Serializable object. You could then declare :

@Lob
private MySmartLOB data;

and

public class MySmartLOB implements Serializable {
    private void writeObject(java.io.ObjectOutputStream out)
            throws IOException {
        // tranfer data from local storage to 'out'
    }
    private void readObject(java.io.ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        // tranfer data from 'in' to local storage
    }
}

This could work, hoping that the underlying JPA implementation is smart enough to supply an ObjectOutputStream directly piped to the JDBC stream and not some sort of ugly ByteArrayOutputStream.

Kalle Richter
  • 8,008
  • 26
  • 77
  • 177
Carlo Pellegrini
  • 5,656
  • 40
  • 45