
When we run a "COPY INTO" command from an AWS S3 location, do the data files physically get copied from S3 to the EC2 VM storage (SSD/RAM)? Or does the data still reside on S3 and just get converted to Snowflake format?

And if I run COPY INTO and then suspend the warehouse, would I lose data on resumption?

Please let me know if you need any other information.

2 Answers


The data is loaded into Snowflake tables from an external location such as S3. The files remain on S3 after the load; if you need to remove them once the copy completes, you can add the "PURGE=TRUE" parameter to the "COPY INTO" command.

The files themselves stay in the S3 location; the values in them are copied into the tables in Snowflake.
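As a rough illustration of the PURGE option described above, here is a minimal sketch. The names `my_table`, `my_s3_stage`, and the `path/` prefix are hypothetical, and the CSV file format is only an assumption about what the files look like; the stage is assumed to already exist and to point at the S3 location.

```sql
-- Hypothetical names: my_table, my_s3_stage, path/.
-- Assumes an external stage already points at the S3 location.

-- Load the staged files into the table. With PURGE = TRUE, Snowflake
-- attempts to delete the source files from the stage after a successful load.
COPY INTO my_table
  FROM @my_s3_stage/path/
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
  PURGE = TRUE;

-- Check which files are still present in the stage afterwards.
LIST @my_s3_stage/path/;
```

Without `PURGE = TRUE` (the default is FALSE), the same `LIST` command should still show the source files after the load.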

Statements that are already running are not affected when the warehouse is suspended; they are allowed to complete. So there is no data loss in that event.

Srinath Menon
  • Thanks for your response, Srinath. I am still not clear on where the data gets copied to. Does the data get physically copied to the internal storage of the EC2 machines that make up the cluster? Or does the data physically remain in another S3 location and only get converted to Snowflake format? – Anish Kumar Dec 23 '21 at 10:13
  • The data gets physically copied to the Snowflake side, and the file is retained on the external stage. To purge the file from external storage, as mentioned earlier, you can use the PURGE=TRUE parameter with the COPY INTO command. – Srinath Menon Dec 23 '21 at 11:22

When we run a "COPY INTO" command from an AWS S3 location, Snowflake copies the data from the files in your S3 location into Snowflake's own S3 storage. That Snowflake-side storage is only accessible by querying the table into which you loaded the data.

When you suspend a warehouse, Snowflake immediately shuts down all idle compute resources for the warehouse, but allows any compute resources that are executing statements to continue until the statements complete, at which time the resources are shut down and the status of the warehouse changes to “Suspended”. Compute resources waiting to shut down are considered to be in “quiesce” mode.

More details: https://docs.snowflake.com/en/user-guide/warehouses-tasks.html#suspending-a-warehouse
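One way to convince yourself that suspending the warehouse does not touch loaded data is a sketch along these lines. The names `my_wh` and `my_table` are hypothetical, and it assumes the COPY INTO has already finished.

```sql
-- Hypothetical names: my_wh, my_table.
-- Assumes the COPY INTO into my_table has already completed.

-- Suspend the warehouse; this only releases compute resources.
ALTER WAREHOUSE my_wh SUSPEND;

-- Later, bring the warehouse back if it is not already running.
ALTER WAREHOUSE my_wh RESUME IF SUSPENDED;

-- The rows loaded earlier should still be there, because table data
-- is kept in Snowflake's storage rather than on the warehouse itself.
SELECT COUNT(*) FROM my_table;
```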

Details on the loading mechanism you are using are in docs: https://docs.snowflake.com/en/user-guide/data-load-s3.html#bulk-loading-from-amazon-s3

FKayani