0

I have a dataset stored in Amazon S3 that I want to ingest into Snowflake using IICS (Informatica). The dataset is a CSV file whose columns correspond to the tables I want to create or update in Snowflake. The complication is that the structure of these CSV files may evolve over time, with new columns being introduced. What strategy can I use to handle these schema changes seamlessly while ingesting the data into Snowflake?

Nesmo07
  • 15
  • 3

2 Answers

0

You could use Snowpark or Python (via an orchestrator such as Airflow) to read the schema of the CSV file and recreate or alter the table whenever it changes.
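A minimal sketch of that idea, using only the standard library: read the CSV header, diff it against the columns the table is known to have, and emit `ALTER TABLE ... ADD COLUMN` statements for anything new. The table and column names here are hypothetical, and defaulting new columns to `VARCHAR` is an assumption; you would run the generated DDL through Snowpark or the Snowflake connector.

```python
import csv
import io

def schema_evolution_ddl(csv_text, table, existing_columns):
    """Compare a CSV header against the known table columns and
    return ALTER TABLE statements for any columns that are new.
    Table and column names are illustrative placeholders."""
    header = next(csv.reader(io.StringIO(csv_text)))
    known = {c.upper() for c in existing_columns}
    new_cols = [c for c in header if c.upper() not in known]
    # Assumption: land new columns as VARCHAR and cast later in Snowflake.
    return [f'ALTER TABLE {table} ADD COLUMN "{c.upper()}" VARCHAR'
            for c in new_cols]

# Example: the file gained a DISCOUNT column since the table was created.
ddl = schema_evolution_ddl("id,amount,discount\n1,9.99,0.5\n",
                           "SALES", ["ID", "AMOUNT"])
# ddl -> ['ALTER TABLE SALES ADD COLUMN "DISCOUNT" VARCHAR']
```

The same diff logic also tells you when a full table recreate is needed (e.g. a column was dropped or renamed), in which case you would generate `CREATE OR REPLACE TABLE` instead.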

0

I would recommend using a COPY command to load into a staging table in Snowflake. If you really want to use IICS as a scheduler/orchestrator, then invoke the command through it.

https://docs.snowflake.com/en/user-guide/data-load-s3-copy
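As a sketch of what that COPY command might look like, here is a small helper that builds the statement as a string; the stage, table, and file-format options are placeholder assumptions, and in practice you would execute it through the Snowflake Python connector or an IICS task:

```python
def build_copy_statement(table, stage_path, skip_header=1):
    """Build a Snowflake COPY INTO statement that loads CSV files
    from an external S3 stage into a staging table.
    Stage and table names are placeholders."""
    return (
        f"COPY INTO {table} "
        f"FROM @{stage_path} "
        f"FILE_FORMAT = (TYPE = CSV SKIP_HEADER = {skip_header} "
        f"FIELD_OPTIONALLY_ENCLOSED_BY = '\"') "
        f"ON_ERROR = 'ABORT_STATEMENT'"
    )

sql = build_copy_statement("STG_SALES", "my_s3_stage/daily/")
# With a snowflake.connector connection `conn`, you would then run:
# conn.cursor().execute(sql)
```

Loading everything into a wide staging table first keeps the COPY itself simple, and the schema-change handling becomes a downstream SQL problem inside Snowflake.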

Handle your CSV changes within Snowflake itself. Using IICS to handle flat-file changes over time is not worth it, IMO.

If you can work around IICS, you could also build a seamless pipeline with S3, SQS, and Snowpipe, and handle the transformation downstream in Snowflake.

Jatin Morar
  • 164
  • 1
  • 7