I'm using file systems (local, Google Cloud Storage, and maybe S3) to exchange data between the web front end (JS) and back end (Python).

After writing data in the Arrow IPC file format to the file system from the Python back end, like below:

import pyarrow.ipc as ipc

# self.file_system is a pyarrow.fs.FileSystem (local, GCS, S3, ...)
with self.file_system.open_output_stream(f'{self.bucket}/{file_name}') as sink:
    with ipc.new_file(sink, schema=table.schema,
                      options=ipc.IpcWriteOptions(compression='lz4')) as writer:
        writer.write(table)
    sink.flush()

the front-end JS package apache-arrow fails to read the LZ4- or ZSTD-compressed files:

import { readFileSync } from 'fs';
import { tableFromIPC } from 'apache-arrow';

const arrow = readFileSync('xx.arrow');
const table = tableFromIPC(arrow);

console.table(table.toArray());

I got this error:

Uncaught (in promise) Error: Record batch compression not implemented
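Reading the same file back with pyarrow works fine, so the file itself seems valid; a minimal sketch of that check (assuming a local copy named xx.arrow):

import pyarrow.ipc as ipc

# round-trip check: pyarrow decompresses the LZ4 record batches transparently
with open('xx.arrow', 'rb') as source:
    table = ipc.open_file(source).read_all()
print(table.num_rows)  # the data is all there; only the JS reader rejects it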

I couldn't find anything in the JS package documentation about which format it reads by default or how to customize the read options. Does anyone have an idea?

Yan Yang
  • It looks like arrow js doesn't support compression for IPC: https://github.com/apache/arrow/blob/f5166fe21969d19adff23fc840ed1d7511348bad/js/src/ipc/metadata/message.ts#L299 Your best option is to disable it when writing (by changing `IpcWriteOptions`) – 0x26res Nov 21 '22 at 13:41
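Following that suggestion, a minimal sketch of the write path with compression disabled (compression=None is also pyarrow's default, so the options argument could be dropped entirely):

import pyarrow.ipc as ipc

# write uncompressed IPC so the apache-arrow JS reader can consume it
with self.file_system.open_output_stream(f'{self.bucket}/{file_name}') as sink:
    with ipc.new_file(sink, schema=table.schema,
                      options=ipc.IpcWriteOptions(compression=None)) as writer:
        writer.write(table)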

0 Answers