Is it possible to add zip support to COPY INTO statements? Zip is still the most popular compression format, it gave the best results in my tests, and many vendors use it as their default.

Alex O

1 Answer

It would probably not be too hard to add a very limited form of .zip support to MonetDB. However, the ZIP file format has many features that the currently supported formats don't have, and taking advantage of those would require more work.

In particular, .zip files can contain multiple member files while the existing formats such as .gz, .lz4 and .xz always contain a single file. So, we would need to come up with a way to deal with .zip files that happen to hold multiple member files. For example,

  1. reject them, only allow single-member zip files, or
  2. always pick the first member, or
  3. always pick the last member, or
  4. add syntax to designate a specific member, for example ../path/to/archive.zip#member.csv.
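
To make the four options concrete, here is a minimal sketch of their semantics using Python's standard `zipfile` module. It is an illustration only, not MonetDB code: the names `pick_member`, `split_fragment` and `demo_archive` and the policy strings are hypothetical, and the `#member` form is simply the syntax floated in option 4.

```python
import io
import zipfile


def split_fragment(path):
    """Split 'archive.zip#member.csv' into (archive, member or None)."""
    archive, sep, member = path.partition("#")
    return archive, (member if sep else None)


def pick_member(zf, policy="single", fragment=None):
    """Return the name of the member to load under a hypothetical policy."""
    # Directory entries carry no data, so they never count as members.
    names = [info.filename for info in zf.infolist() if not info.is_dir()]
    if not names:
        raise ValueError("archive contains no files")
    if policy == "single":  # option 1: reject multi-member archives
        if len(names) != 1:
            raise ValueError(f"expected 1 member, found {len(names)}")
        return names[0]
    if policy == "first":   # option 2
        return names[0]
    if policy == "last":    # option 3
        return names[-1]
    if policy == "named":   # option 4: member given after '#'
        if fragment not in names:
            raise KeyError(f"no member named {fragment!r}")
        return fragment
    raise ValueError(f"unknown policy {policy!r}")


def demo_archive():
    """Build a two-member archive in memory for experimentation."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.csv", "1,2\n")
        zf.writestr("b.csv", "3,4\n")
    return zipfile.ZipFile(buf)
```

With the two-member demo archive, the `single` policy raises an error while `first`, `last` and `named` each resolve to one member, which is the behavioural difference a COPY INTO implementation would have to choose between.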

Also, ZIP files can use many compression algorithms, though in practice they typically use DEFLATE.
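
If you want to check whether your own archives would fit a DEFLATE-only restriction, a short script along these lines should do it. It is a sketch using Python's standard `zipfile` module, not anything MonetDB provides; `non_deflate_members` and `demo_mixed_archive` are hypothetical helper names.

```python
import io
import zipfile


def non_deflate_members(zf):
    """Return names of file members not compressed with DEFLATE."""
    return [
        info.filename
        for info in zf.infolist()
        if not info.is_dir() and info.compress_type != zipfile.ZIP_DEFLATED
    ]


def demo_mixed_archive():
    """Build an in-memory archive with one DEFLATE and one stored member."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("packed.csv", "1,2\n" * 100,
                    compress_type=zipfile.ZIP_DEFLATED)
        # ZIP_STORED means the member is kept uncompressed.
        zf.writestr("raw.csv", "3,4\n", compress_type=zipfile.ZIP_STORED)
    return zipfile.ZipFile(buf)
```

An empty result means every member uses DEFLATE. Note that this sketch also reports stored (uncompressed) members, which a real loader might well choose to accept.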

Do you think support for single-member DEFLATE-based .zip files would be sufficient for your use case? If so, we'll consider adding it in a future release.

  • I think that for loading from a compressed file, single-member zip files are the right case. My use case is loading exported table content that may consist of multiple chunks of compressed data, but each zip file holds only one csv. – Alex O Aug 06 '20 at 01:03
  • And there's another complication for ZIP files that were created on a Mac. They usually contain a hidden folder `__MACOSX` which needs to be ignored :) – Crouching Kitten Aug 25 '20 at 22:32
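
A rough sketch of how the `__MACOSX` complication could be handled before applying a single-member rule, again with Python's standard `zipfile` module rather than anything in MonetDB. The helper names `data_members` and `demo_mac_archive` are hypothetical, and the sketch assumes that only entries under the top-level `__MACOSX/` folder need to be skipped.

```python
import io
import zipfile


def data_members(zf):
    """Return file members, skipping directories and the __MACOSX folder."""
    return [
        info.filename
        for info in zf.infolist()
        if not info.is_dir() and not info.filename.startswith("__MACOSX/")
    ]


def demo_mac_archive():
    """Build an in-memory archive shaped like one created on a Mac."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("data.csv", "1,2\n")
        # Metadata entry of the kind found inside the __MACOSX folder.
        zf.writestr("__MACOSX/._data.csv", b"\x00")
    return zipfile.ZipFile(buf)
```

After filtering, the demo archive has exactly one data member, so it would still pass a single-member check.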