Yes, a variant of this is possible using the RecordBatchReader and RecordBatchWriter IPC primitives in both pyarrow and ArrowJS.
On the Python side, you can serialize a Table to a buffer like this:
import pyarrow as pa

# Write the table to an in-memory Arrow IPC stream and return the raw bytes
def serialize_table(table):
    sink = pa.BufferOutputStream()
    writer = pa.RecordBatchStreamWriter(sink, table.schema)
    writer.write_table(table)
    writer.close()
    return sink.getvalue().to_pybytes()

# ...later, in your route handler:
body = serialize_table(create_your_arrow_table())
Then you can send those bytes in the response body. If you have multiple tables, you can concatenate the serialized buffers from each table into one large payload.
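For example, here's a minimal sketch of returning the serialized bytes from an HTTP endpoint, assuming a Flask app (any web framework works the same way); create_your_arrow_table is just a stand-in for however you build your table:

from flask import Flask, Response

app = Flask(__name__)

@app.route('/table')
def table_route():
    # serialize_table is the helper defined above
    body = serialize_table(create_your_arrow_table())
    # application/vnd.apache.arrow.stream is the media type registered
    # for the Arrow IPC streaming format
    return Response(body, mimetype='application/vnd.apache.arrow.stream')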
I'm not sure what functionality exists to write multipart/form-data responses in Python, but that's probably the best way to craft the response if you want the tables to be sent with their names (or any other metadata you wish to include); a rough sketch follows.
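Here's a rough sketch of assembling such a multipart body by hand (the multipart_body helper and named_tables argument are hypothetical names, and each value is the output of serialize_table above):

import uuid

def multipart_body(named_tables):
    # named_tables: dict mapping a table name to its serialized Arrow bytes
    boundary = uuid.uuid4().hex
    parts = []
    for name, payload in named_tables.items():
        parts.append(
            (f'--{boundary}\r\n'
             f'Content-Disposition: form-data; name="{name}"\r\n'
             f'Content-Type: application/vnd.apache.arrow.stream\r\n'
             f'\r\n').encode()
            + payload + b'\r\n'
        )
    parts.append(f'--{boundary}--\r\n'.encode())
    # send the joined bytes with the response header:
    # Content-Type: multipart/form-data; boundary=<boundary>
    return b''.join(parts), boundary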
On the JavaScript side, you can read the response either with Table.from() (if you have just one table), or with the RecordBatchReader if you have more than one, or if you want to read each RecordBatch in a streaming fashion:
import { Table, RecordBatchReader } from 'apache-arrow'

// easy if you want to read the first (or only) table in the response
const table = await Table.from(fetch('/table'))

// or for multiple tables on the same stream, or to read in a streaming fashion:
for await (const reader of RecordBatchReader.readAll(fetch('/table'))) {
    // Buffer all batches into a table
    const table = await Table.from(reader)
    // Or process each batch as it's downloaded
    for await (const batch of reader) {
        // consume each RecordBatch here as it arrives
    }
}
You can see more examples of this in our tests for ArrowJS here:
https://github.com/apache/arrow/blob/3eb07b7ed173e2ecf41d689b0780dd103df63a00/js/test/unit/ipc/writer/stream-writer-tests.ts#L40
You can also see some examples in a little fastify plugin I wrote for consuming and producing Arrow payloads in node: https://github.com/trxcllnt/fastify-arrow