
There is a large ETL process on the Snowflake side that updates a table. I need to stream the changes made to that table to consumers/processors outside of Snowflake, not by querying the table for updates but by streaming them (a push pattern, not pull).

I see there is a Kafka connector and examples for streaming data into Snowflake. Is there a way to stream data out of it, maybe using CDC + Streams + Tasks to push changes to some queue?

Andrey Borisko
  • Hi Andrey, there isn't any external streaming capability that I am aware of (yet), but perhaps you could create a STREAM on the Snowflake table in question, have a task that runs (every minute?) to flush these changes out to JSON/CSV on S3 (or equivalent), and have the external app/service consume from there. Info on streams here: https://docs.snowflake.net/manuals/user-guide/streams.html – Mike Donovan Feb 19 '20 at 22:31
  • Hey @MikeDonovan, thanks for this idea. So, probably the [Amazon S3 Source Connector](https://docs.confluent.io/current/connect/kafka-connect-s3-source/index.html#quick-start) or the [FileSystem Connector](https://kafka-connect-fs.readthedocs.io/en/latest/index.html) could help after saving to S3. I'll check those in more detail. – Andrey Borisko Feb 19 '20 at 22:40
  • No problem, and yes, those connectors look like excellent options for reading the changes back into Kafka topic(s). The Snowflake docs are excellent on these topics, but you may also find this blog post helpful: https://community.snowflake.com/s/article/ELT-Data-Pipelining-in-Snowflake-Data-Warehouse-using-Streams-and-Tasks – Mike Donovan Feb 19 '20 at 23:00
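
Based on the Streams + Tasks suggestion in the comments above, here is a rough sketch of what that could look like. All object names (the source table `etl_target`, the landing table `etl_target_changes`, the warehouse `my_wh`, and the external stage `@changes_stage`) are placeholders, and the external stage is assumed to already point at the target S3 bucket:

```sql
-- 1. Track changes (CDC) on the table the ETL process updates.
CREATE OR REPLACE STREAM etl_target_stream ON TABLE etl_target;

-- Landing table for consumed changes. Querying a stream also returns the
-- METADATA$ACTION / METADATA$ISUPDATE / METADATA$ROW_ID columns, so the
-- landing table gets columns for those too.
CREATE OR REPLACE TABLE etl_target_changes AS
SELECT * FROM etl_target_stream WHERE FALSE;

-- 2. Every minute, consume the stream into the landing table.
--    DML against the stream is what advances its offset.
CREATE OR REPLACE TASK consume_changes
  WAREHOUSE = my_wh
  SCHEDULE  = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('ETL_TARGET_STREAM')
AS
  INSERT INTO etl_target_changes
  SELECT * FROM etl_target_stream;

-- 3. After each consume run, unload changes to the external stage as JSON
--    for the outside consumer (or an S3 source connector) to pick up.
--    For brevity this unloads the whole landing table; a real pipeline
--    would export only the latest batch (e.g. filter on a load timestamp).
CREATE OR REPLACE TASK unload_changes
  WAREHOUSE = my_wh
  AFTER consume_changes
AS
  COPY INTO @changes_stage/etl_target/
  FROM (SELECT OBJECT_CONSTRUCT(*) FROM etl_target_changes)
  FILE_FORMAT = (TYPE = JSON);

-- Resume child tasks before the root task so the tree actually runs.
ALTER TASK unload_changes RESUME;
ALTER TASK consume_changes RESUME;
```

From there, either of the connectors mentioned in the comments (the Amazon S3 Source Connector or kafka-connect-fs) could read the unloaded JSON files back into a Kafka topic for the downstream processors.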

0 Answers