Is it possible to perform distributed concurrent writes to parquet format?

And is it possible to read parquet files while they are being written?

If there are methods for concurrent reads and writes, I'd be interested to learn about them.

Seanny123
Loic

1 Answer

I eventually got an answer from the Parquet developers. The answer is no to both questions:

Parquet writers are not thread-safe and files cannot be read or written by different readers or writers concurrently. Parquet doesn't expose flush/sync operations to the user (for good reason) so there isn't a way to reliably do this anyway.
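
Given that constraint, one pattern that might still help is to never share a file at all: each writer produces its own complete file under a temporary name and renames it into place only once it is finished, so readers only ever list finished files. The sketch below is not from the Parquet developers' reply. The names `publish_atomically`, `visible_parts`, and `write_fn`, as well as the `.tmp-` and `part-` naming scheme, are hypothetical, and it assumes a local POSIX-style filesystem where a rename within one directory is atomic.

```python
import os
import tempfile
import uuid


def publish_atomically(directory, write_fn, suffix=".parquet"):
    """Write one complete file via write_fn, then rename it into place.

    write_fn is a caller-supplied function (an assumption of this sketch)
    that takes a path and writes a whole file there.
    """
    os.makedirs(directory, exist_ok=True)
    # Keep the temp file in the target directory so the rename
    # never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-", suffix=suffix)
    os.close(fd)
    try:
        write_fn(tmp_path)  # this writer is the only user of tmp_path
        final_path = os.path.join(directory, f"part-{uuid.uuid4().hex}{suffix}")
        os.replace(tmp_path, final_path)  # the file becomes visible in one step
    except BaseException:
        # Do not leave a half-written temp file behind on failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


def visible_parts(directory, suffix=".parquet"):
    """List only finished files; in-progress files start with '.tmp-'."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(suffix) and not name.startswith(".tmp-")
    )
```

With pyarrow, for example, `write_fn` could be `lambda path: pyarrow.parquet.write_table(table, path)`, and a reader would open each path returned by `visible_parts` separately. Each writer still needs its own writer object and its own file, and object stores or network filesystems may not give the same rename guarantee.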

Loic