Reading a Delta Table with no Manifest File using Redshift

Question

My goal is to read a Delta table on AWS S3 from Redshift. The Redshift Spectrum to Delta Lake Integration guide says to generate a manifest with Apache Spark, using either:

GENERATE symlink_format_manifest FOR TABLE delta.`<path-to-delta-table>`

or

DeltaTable deltaTable = DeltaTable.forPath(<path-to-delta-table>);
deltaTable.generate("symlink_format_manifest");

However, neither Apache Flink nor the Delta Standalone library it uses appears to support generating these manifest files, and Flink is what writes the data to my Delta table.

How can I get around this limitation?

score 0 · Answer 1 · answered Dec 26 '22 at 08:16

This now appears to be supported natively on AWS. From the announcement:

With today’s launch, Glue crawler is adding support for creating AWS Glue Data Catalog tables for native Delta Lake tables and does not require generating manifest files. This improves customer experience because now you don’t have to regenerate manifest files whenever a new partition becomes available or a table’s metadata changes.

https://aws.amazon.com/blogs/big-data/introducing-native-delta-lake-table-support-with-aws-glue-crawlers/
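For illustration, here is a minimal sketch of how such a crawler might be defined with boto3. It is not taken from the blog post. The crawler name, IAM role ARN, database name, and S3 path are hypothetical placeholders, and the `DeltaTargets` fields (`DeltaTables`, `WriteManifest`, `CreateNativeDeltaTable`) are the ones I understand the Glue `CreateCrawler` API to accept, so check them against the current API reference. Whether Redshift can query the resulting table type is also something to verify for your setup.

```python
from typing import Any, Dict, List


def build_delta_crawler_request(
    crawler_name: str,
    role_arn: str,
    database_name: str,
    delta_table_paths: List[str],
    write_manifest: bool = False,
    create_native_delta_table: bool = True,
) -> Dict[str, Any]:
    """Build the keyword arguments for a Glue CreateCrawler call.

    Defaults follow the quoted announcement: a native Delta Lake table
    in the Data Catalog, with no manifest files written.
    """
    return {
        "Name": crawler_name,
        "Role": role_arn,
        "DatabaseName": database_name,
        "Targets": {
            "DeltaTargets": [
                {
                    # S3 locations of the Delta tables (folders containing _delta_log)
                    "DeltaTables": list(delta_table_paths),
                    "WriteManifest": write_manifest,
                    "CreateNativeDeltaTable": create_native_delta_table,
                }
            ]
        },
    }


def create_and_start_crawler(request: Dict[str, Any]) -> None:
    """Create the crawler and start one run (needs AWS credentials)."""
    # Imported here so the request can be built and inspected without boto3.
    import boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Hypothetical values, for illustration only:
# request = build_delta_crawler_request(
#     crawler_name="delta-table-crawler",
#     role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
#     database_name="delta_db",
#     delta_table_paths=["s3://my-bucket/path/to/delta-table/"],
# )
# create_and_start_crawler(request)
```

Building the request separately from the AWS call lets you inspect or log it before anything is created. If the engine you query from still needs symlink manifests, the same sketch can be called with `write_manifest=True` and `create_native_delta_table=False`. As far as I know that makes the crawler write the manifest files itself, but confirm this in the Glue documentation before relying on it.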
