I need to copy all of the contents of a stream of VectorSchemaRoots into a single object:

Stream<VectorSchemaRoot> data = fetchStream();

VectorSchemaRoot finalResult = VectorSchemaRoot.create(schema, allocator);
VectorLoader loader = new VectorLoader(finalResult);

data.forEach(current -> {
    VectorUnloader unloader = new VectorUnloader(current);
    ArrowRecordBatch batch = unloader.getRecordBatch();
    loader.load(batch);
    current.close();
});

However, I am getting the following error:

java.lang.IllegalStateException: Memory was leaked by query. Memory was leaked.

I am also getting this further down the stack trace:

Could not load buffers for field date: Timestamp(MILLISECOND, null) not null. error message: A buffer can only be associated between two allocators that share the same root

I use the same allocator for everything. Does anyone know why I am getting this error?

Pablo

2 Answers

The "leak" is probably just a side effect of the exception, because the code as written is not exception-safe. Use try-with-resources to manage the ArrowRecordBatch instead of manually calling close():

try (ArrowRecordBatch batch = unloader.getRecordBatch()) {
    loader.load(batch);
}

(though, depending on what load does, this may not be enough).
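
To illustrate the exception-safety point in plain Java, without Arrow on the classpath, here is a minimal sketch. `TrackedResource`, `manualClose`, and `tryWithResources` are hypothetical names invented for this illustration; `TrackedResource` only stands in for a closeable object such as an ArrowRecordBatch and does not model Arrow's allocator accounting.

```java
import java.util.ArrayList;
import java.util.List;

public class TryWithResourcesSketch {

    /** Hypothetical stand-in for a closeable resource such as a record batch. */
    static final class TrackedResource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        TrackedResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Manual close: the close() call is skipped when the body throws. */
    static List<String> manualClose(boolean fail) {
        List<String> log = new ArrayList<>();
        try {
            TrackedResource batch = new TrackedResource("batch", log);
            if (fail) {
                throw new IllegalStateException("load failed");
            }
            batch.close(); // never reached on failure
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    /** try-with-resources: close() runs before the catch block handles the exception. */
    static List<String> tryWithResources(boolean fail) {
        List<String> log = new ArrayList<>();
        try (TrackedResource batch = new TrackedResource("batch", log)) {
            if (fail) {
                throw new IllegalStateException("load failed");
            }
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(manualClose(true));
        System.out.println(tryWithResources(true));
    }
}
```

With `fail` set to `true`, the manual version never records a close, while the try-with-resources version records the close before the exception is handled. That is why a failing `load` call can surface as a reported leak on top of the original error.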

I can't say much else about why you're getting the exception without seeing more code and the full stack trace.

li.davidm
  • I am still getting the same errors posted above. When I comment out the loader.load line, I get no errors. – Pablo Aug 22 '22 at 14:09

Could you try with something like this:

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorLoader;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.VectorUnloader;
import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
import org.apache.arrow.vector.types.pojo.ArrowType;
import org.apache.arrow.vector.types.pojo.Field;
import org.apache.arrow.vector.types.pojo.FieldType;
import org.apache.arrow.vector.types.pojo.Schema;

import java.util.Arrays;
import java.util.Collections;
import java.util.stream.Stream;

public class StackOverFlowSolved {
    public static void main(String[] args) {
        try (BufferAllocator allocator = new RootAllocator()) {
            // Build sample input: a stream holding one VectorSchemaRoot with a single int column
            IntVector ageColumn = new IntVector("age", allocator);
            ageColumn.allocateNew();
            ageColumn.set(0, 1);
            ageColumn.set(1, 2);
            ageColumn.set(2, 3);
            ageColumn.setValueCount(3);
            Stream<VectorSchemaRoot> streamOfVSR = Collections.singletonList(VectorSchemaRoot.of(ageColumn)).stream();

            // transfer data
            streamOfVSR.forEach(current -> {
                Field ageLoad = new Field("age",
                        FieldType.nullable(new ArrowType.Int(32, true)), null);
                Schema schema = new Schema(Arrays.asList(ageLoad));
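                // Resources close in reverse order: the VectorSchemaRoot first, then the child allocator.
                try (BufferAllocator child = allocator.newChildAllocator("loaddata", 0, Integer.MAX_VALUE);
                     VectorSchemaRoot root = VectorSchemaRoot.create(schema, child)) {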
                    VectorUnloader unload = new VectorUnloader(current);
                    try (ArrowRecordBatch recordBatch = unload.getRecordBatch()) {
                        VectorLoader loader = new VectorLoader(root);
                        loader.load(recordBatch);
                    }
                    System.out.println(root.contentToTSVString());
                }
                current.close();
            });
        }
    }
}

This prints:

age
1
2
3