
I have a command line app that uses Xodus as an embedded database, and the program has a bizarre problem.

When run using mvn exec:java or the IntelliJ run command, the program works fine and can read and write to the Xodus database. However, when it is packaged as a jar or a native image, neither build works.

So here's what I did: since the packaged jar and the native image fail to read/write properly, I copied the database file generated by the mvn exec:java and IntelliJ runs, then ran the "export database" command that is available in the command line app.

From there, I compared the CSV output of both programs:

  • The CSV exported by the Maven or IntelliJ run is clean.
  • The CSV exported by the jar or native image run is messy in both cases.

I can't really tell how this is happening. I wonder how it is even possible when the premise of Java is portability; I'm not sure whether the issue is that I use macOS on an M2 chip. Is there a difference in the runtime when running through Maven versus running with java -jar? (By the way, the other parts of the program that do not use the database work fine, which is why I am assuming that the issue is with the use of the JetBrains Xodus database.)

Any theories are welcome.

quarks
  • Well when you run your jar, you've deployed it, so how did you deploy the embedded db? Obviously it can't be *in* the jar as it would not be writable – g00se Jun 27 '23 at 21:36
  • @g00se Xodus is an embedded database, which means that adding the Xodus dependencies to the project lets your program write to disk, and Xodus manages that directory to store database objects. – quarks Jun 28 '23 at 06:28
  • Yes I know (now). They missed a trick by not providing a task to show their sample code in action. Check their docs about deployment (which I've not read but hope and assume exist). If you want *one* deployment unit only, you will need at least their library, which would have to be `jpackage`d or in a fat jar – g00se Jun 28 '23 at 06:42
  • You could use Maven Shade for the latter. – g00se Jun 28 '23 at 06:49

1 Answer


The problem was not with Xodus. This database is fast (in my testing it saved 100k records in 72 ms) and works well with the JVM.

So it's not the problem.

Upon further investigation of the code, I found the culprit to be the ClassScanner:

private List<FieldMarshaller<?>> getFieldMarshallers() {
    List<FieldMarshaller<?>> fieldMarshallers = new ArrayList<>();
    String packageName = "com.mycompany.myapp.database.marshallers.fields";
    try {
        List<Class<?>> marshallerClasses = ClassScanner.findClasses(packageName, FieldMarshaller.class);
        for (Class<?> marshallerClass : marshallerClasses) {
            FieldMarshaller<?> marshaller = (FieldMarshaller<?>) marshallerClass.getDeclaredConstructor().newInstance();
            fieldMarshallers.add(marshaller);
        }
    } catch (Exception e) {
        throw new MarshallerException("Error initializing field marshallers", e);
    }
    return fieldMarshallers;
}

The ClassScanner apparently does not behave the same in packaged jars as it does when running with Maven.

In a typical development setup, the classes are compiled into .class files in a directory structure that reflects the package structure of your code, and classpath scanning works by traversing this directory structure. However, when you package your program into a JAR file, the .class files are bundled into a single file, and the directory structure is no longer directly accessible in the same way.
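To make the difference concrete, here is a minimal sketch of what a class loader reports for the same package in a directory layout versus a jar. The internals of my ClassScanner are not shown here, so the "naive" file listing below is only an assumption about how hand-written scanners commonly work; the class name ResourceProtocolDemo, the package path demo/pkg/ and the Marker.txt file are made up for the illustration.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

public class ResourceProtocolDemo {

    /** Creates a temp directory laid out like target/classes: demo/pkg/Marker.txt. */
    public static Path buildDirectoryLayout() throws IOException {
        Path root = Files.createTempDirectory("classes");
        Path pkg = Files.createDirectories(root.resolve("demo/pkg"));
        Files.write(pkg.resolve("Marker.txt"), new byte[] {1});
        return root;
    }

    /** Creates a temp jar containing the same demo/pkg/Marker.txt entry. */
    public static Path buildJarLayout() throws IOException {
        Path jar = Files.createTempFile("app", ".jar");
        try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
            out.putNextEntry(new JarEntry("demo/pkg/")); // explicit directory entry
            out.closeEntry();
            out.putNextEntry(new JarEntry("demo/pkg/Marker.txt"));
            out.write(1);
            out.closeEntry();
        }
        return jar;
    }

    /** Returns the URL protocol reported for a package path, or null if not found. */
    public static String protocolOf(URL classpathRoot, String packagePath) throws IOException {
        try (URLClassLoader loader = new URLClassLoader(new URL[] {classpathRoot}, null)) {
            URL url = loader.getResource(packagePath);
            return url == null ? null : url.getProtocol();
        }
    }

    /**
     * Mimics a naive scanner (an assumption, not the real ClassScanner):
     * treat the resource URL as a plain directory and list its files.
     */
    public static int naiveFileCount(URL classpathRoot, String packagePath) throws IOException {
        try (URLClassLoader loader = new URLClassLoader(new URL[] {classpathRoot}, null)) {
            URL url = loader.getResource(packagePath);
            if (url == null) {
                return 0;
            }
            File[] files = new File(url.getPath()).listFiles();
            // listFiles() returns null when the path is not a real directory,
            // which is the case for a jar: URL, so nothing is "found".
            return files == null ? 0 : files.length;
        }
    }

    public static void main(String[] args) throws IOException {
        URL dirRoot = buildDirectoryLayout().toUri().toURL();
        URL jarRoot = buildJarLayout().toUri().toURL();
        System.out.println("directory protocol: " + protocolOf(dirRoot, "demo/pkg/"));
        System.out.println("jar protocol:       " + protocolOf(jarRoot, "demo/pkg/"));
        System.out.println("directory entries:  " + naiveFileCount(dirRoot, "demo/pkg/"));
        System.out.println("jar entries:        " + naiveFileCount(jarRoot, "demo/pkg/"));
    }
}
```

In the directory case the resource URL uses the file protocol and can be listed with java.io.File; in the jar case it uses the jar protocol, and a scanner that only walks the file system silently finds nothing.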

This is why the program behaved unpredictably and the database dump looked the way it did: the FieldMarshaller implementations, which are responsible for marshalling/unmarshalling Xodus entities to/from typed objects, were not being picked up correctly in the packaged builds.

Anyhow, for anyone who may be experiencing classpath scanning issues in packaged builds, I used:

    <dependency>
      <groupId>io.github.classgraph</groupId>
      <artifactId>classgraph</artifactId>
      <version>4.8.104</version>
    </dependency>

With that, class scanning now works both when running in debug and in packaged builds.

Also, since a ClassGraph scan is expensive, I made it statically initialized so the result can be reused by other parts of the program:

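import io.github.classgraph.ClassGraph;
import io.github.classgraph.ScanResult;
import java.util.List;

public class FieldMarshallers {
    // Scanned once when the class is loaded, because a ClassGraph scan is expensive.
    private static final List<Class<?>> marshallerClasses;

    static {
        // try-with-resources closes the ScanResult once the classes are loaded.
        try (ScanResult scanResult = new ClassGraph()
                .whitelistPackages("com.mycompany.myapp.database.marshallers.fields")
                .scan()) {
            // loadClasses(true) skips classes that fail to load instead of throwing.
            marshallerClasses = scanResult.getClassesImplementing(FieldMarshaller.class.getName()).loadClasses(true);
        }
    }

    public static List<Class<?>> getMarshallerClasses() {
        return marshallerClasses;
    }
}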
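
To tie this back to getFieldMarshallers(), here is a minimal sketch of turning a cached list of classes into marshaller instances. The FieldMarshaller interface shape and the two implementations are hypothetical stand-ins (the real ones are not shown in this answer), and the class list is passed in by hand instead of coming from ClassGraph so the example runs without extra dependencies.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MarshallerInstantiationSketch {

    // Hypothetical stand-in for the application's FieldMarshaller interface.
    public interface FieldMarshaller<T> {
        Class<T> fieldType();
    }

    // Hypothetical implementations, only here to make the sketch runnable.
    public static class StringMarshaller implements FieldMarshaller<String> {
        public Class<String> fieldType() { return String.class; }
    }

    public static class LongMarshaller implements FieldMarshaller<Long> {
        public Class<Long> fieldType() { return Long.class; }
    }

    /**
     * Instantiates every concrete FieldMarshaller in the given list through its
     * no-argument constructor, skipping interfaces, abstract classes and
     * unrelated types.
     */
    public static List<FieldMarshaller<?>> instantiate(List<Class<?>> classes)
            throws ReflectiveOperationException {
        List<FieldMarshaller<?>> result = new ArrayList<>();
        for (Class<?> candidate : classes) {
            boolean concreteMarshaller = FieldMarshaller.class.isAssignableFrom(candidate)
                    && !candidate.isInterface()
                    && !Modifier.isAbstract(candidate.getModifiers());
            if (!concreteMarshaller) {
                continue;
            }
            result.add((FieldMarshaller<?>) candidate.getDeclaredConstructor().newInstance());
        }
        return result;
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // In the real program this list would come from FieldMarshallers.getMarshallerClasses().
        List<Class<?>> cached = Arrays.<Class<?>>asList(StringMarshaller.class, LongMarshaller.class);
        for (FieldMarshaller<?> marshaller : instantiate(cached)) {
            System.out.println(marshaller.getClass().getSimpleName() + " handles " + marshaller.fieldType().getSimpleName());
        }
    }
}
```

The filter for interfaces and abstract classes is a defensive choice on my part: a scan for implementing classes may also return types that cannot be instantiated directly.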
quarks