UCanAccess seems unable to read OLE Object column using getBytes()

Question

I have a quite large .mdb access database I want to convert to SQLite3 to use it under Linux.

I am unable to transfer any of the BLOBs (mostly containing images) I have. Here is a sample test program:

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class prova {
        public static void main(String[] args) {
                String url = "jdbc:ucanaccess://data/BookDB-201810.mdb";
                try {
                        Connection c = DriverManager.getConnection(url);
                        PreparedStatement ps;
                        ResultSet rs;
                        String q = "SELECT * FROM PersonImage";
                        ps = c.prepareStatement(q);
                        rs = ps.executeQuery();
                        while (rs.next()) {
                                byte[] i = rs.getBytes("Image");
                                String fn = String.format("data/img/i%05d.%d.jpg", rs.getInt("PersonId"), rs.getInt("Index"));
                                try (FileOutputStream fos = new FileOutputStream(fn)) {
                                        fos.write(i);
                                } catch (FileNotFoundException e) {
                                        e.printStackTrace();
                                } catch (IOException e) {
                                        e.printStackTrace();
                                }
                        }
                } catch (SQLException e) {
                        e.printStackTrace();
                }
        }
}

Program runs without errors, but the produced files are "strange" (surely not images):

$ ls -l i00072.1.jpg 
-rw-r--r-- 1 mcon mcon 369 Nov 23 11:38 i00072.1.jpg
$ file i00072.1.jpg 
i00072.1.jpg: Java serialization data, version 5

Looking int them I find this:

....sr..net.ucanaccess.jdbc.BlobKey...........L.
columnNamet..Ljava/lang/String;L..keyt..Ljava/util/HashMap;L.   tableNameq.~..xpt..Imagesr..java.util.HashMap......`....F.
loadFactorI.    thresholdxp?@......w.........t..PersonIDsr..java.lang.Integer.⠤...8...I..valuexr..java.lang.Number...........xp...
Ht..Indexsr..java.lang.ShorthM7.4`.R...S..valuexq.~.  ..xt..PersonImage

What am I doing wrong?

Update: Since my aim is to convert a book database (maintained on the net as .mdb using the obsolete program BookCAT for historical reasons) I found AccessConverter which seems to fit the bill; unfortunately there are essentially two problems with that:

In my database some columns are named "Index", which is a reserved word for SQLite (and others, but apparently not for MS-Access); this is trivially fixed inserting a filter "Index" -> "Idx".
There is a translation switch (type) { case xxx: ... to convert the various MS-Access types to more conventional SQLite3 types; breakpointing on the default: label I discovered (as @ErikvonAsmuth correctly guessed) there are unhandled OLE types. I have no idea about how to handle those.

In the program (BookCAT) those fields contain two kind of data:

images
formatted text

The latter is less important because there's always a duplicate "plaintext" version (it would be nice to retrieve also the formatted version, but...).

I would really like to be able to extract the images, though.

In image data there is a companion "ImageType" column invariably set to "2" which (if I'm not mistaken) means they should be .jpeg images.

What can I do to retrieve OLE data in a usable format?

Note: AccessConverter does not use ucanaccess, it uses directly the underlying com.healthmarketscience.jackcess lib instead.

Note2: it seems BookCAT is built using Delphi, if that is relevant.

There's no such thing as a `VARBINARY` in Access. There are OLE objects, but those come in many types, some just containing a file as binary data, others containing specific headers, and others compressing files and putting multiple files in a single row. Unless you specify exactly what you're trying to import, we likely won't be able to help you. — Erik A, Nov 23 '18 at 23:53
@ErikvonAsmuth: just a few minutes ago I discovered You are absolutely right. I will update Question to reflect my findings and refine question (short summary: those fields contain either images or "formatted text"). — ZioByte, Nov 24 '18 at 01:03

score 2 · Accepted Answer · answered Nov 25 '18 at 17:08

It turns out that, in my specific case, all "OLE" fields are actually BLOBs, unrecognizable as OLE2 objects.

In this condition jackcess returns a Enum OleBlob.ContentType.UNKNOWN Type and, in this case, it will refuse to access BLOB content (OleBlob.content.getBytes() returns null).

In order to access stored data is necessary to use directly Column.getBytes(name) (completely bypassing OLE subsystem).

Why, in this condition, ucanaccess returns an invalid value, instead of failure, is beyond me (probably I should file a bug report; comments welcome).

Image data is plain jpeg-formatted file, while the "formatted text" seems some custom serialization of Delphi TRichText widget I don't know how to parse, but that is another issue.

Using jackcess Column.getBytes(name) I was able to retrieve the data I needed.

Gord Thompson · Answer 2 · 2018-11-26T16:14:54.980

With UCanAccess you need to use ResultSet#getBlob:

String q = "SELECT * FROM PersonImage";
ps = conn.prepareStatement(q);
rs = ps.executeQuery();
while (rs.next()) {
    java.sql.Blob image = rs.getBlob("Image");
    String fn = String.format("C:/Users/Gord/Pictures/i%05d.%d.jpg", rs.getInt("PersonId"), rs.getInt("Index"));
    try (FileOutputStream fos = new FileOutputStream(fn)) {
            fos.write(image.getBytes(1, (int) image.length()));
    } catch (FileNotFoundException e) {
            e.printStackTrace();
    } catch (IOException e) {
            e.printStackTrace();
    }
}

score 0 · Answer 3 · answered Nov 23 '21 at 14:28

We were able to access the content using UCanaccess + accdb file type:

String columnLabel = "FIELD1"; 
Object obj = rs.getObject(2);
net.ucanaccess.complex.Attachment[] attachments = (net.ucanaccess.complex.Attachment[])obj;
net.ucanaccess.complex.Attachment att = attachments[0];
System.err.println("Attachment Name: " + att.getName());
byte[] bytea = att.getData();

As you can see the attachments is an array and it can hold more than 1 attachment.

UCanAccess seems unable to read OLE Object column using getBytes()

3 Answers3