I want to load the last 10 days of data from Amazon S3 into Snowflake tables using SnowSQL. Say today is 10th May; I want all the data that has changed in the past 10 days, and I want to load it in one operation instead of querying for each day individually.
How are your directories/files structured in S3? How do you know what data has changed in the last 10 days? You can use the FILES or PATTERN parameter of the COPY INTO command to control which files you are loading – NickW May 11 '23 at 11:27
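For reference, the PATTERN option mentioned in the comment filters staged files by a regular expression over their paths. A minimal sketch, assuming a stage named @S3_STAGE and a table named test_table (both placeholders):

-- Load only files under a given prefix whose paths match the regex (names and pattern are assumptions)
COPY INTO test_table
FROM @S3_STAGE
PATTERN = '.*2023/05/.*[.]csv';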
1 Answer
You can leverage the stage metadata that Snowflake exposes for staged files; see the Snowflake documentation on querying metadata for staged files.
Assume that you have already configured the storage integration and a stage with a file format, and that you have a target table with all required columns (a sketch of such a setup follows the example below).
For simplicity, this example assumes CSV files with two columns. Your query would then look like:
INSERT INTO test_table SELECT $1, $2 FROM @S3_STAGE WHERE METADATA$FILE_LAST_MODIFIED > DATEADD(day, -10, CURRENT_DATE());
Here, -10 is the number of days you want to go back.
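For completeness, here is a minimal sketch of the prerequisite objects assumed above; the integration name, stage URL, file format, and column names are placeholders, not part of the original answer:

-- Hypothetical CSV file format for the two-column files
CREATE FILE FORMAT my_csv_format TYPE = CSV FIELD_DELIMITER = ',';

-- Hypothetical external stage over the S3 bucket, using an existing storage integration
CREATE STAGE S3_STAGE
  URL = 's3://my-bucket/my-prefix/'
  STORAGE_INTEGRATION = my_s3_integration
  FILE_FORMAT = my_csv_format;

-- Target table with the two columns selected in the query above
CREATE TABLE test_table (col1 STRING, col2 STRING);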
IMPORTANT: this approach does not check for file duplicates, so it is possible to load data from the same file multiple times. To prevent that, you can first build a list of the files with a command like:
SELECT METADATA$FILENAME FROM @S3_STAGE WHERE METADATA$FILE_LAST_MODIFIED > DATEADD(day, -10, CURRENT_DATE());
and then pass those file paths to a COPY INTO command, e.g.:
COPY INTO test_table FROM @S3_STAGE
FILES=('test1.csv', 'test2.csv');
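If you want both steps in a single operation, a Snowflake Scripting block can build the FILES list from the metadata query and run the COPY dynamically. This is only a sketch under the same assumptions (stage S3_STAGE with a file format attached, table test_table); it is not part of the original answer:

EXECUTE IMMEDIATE $$
DECLARE
  file_list STRING DEFAULT '';
  -- Files in the stage modified within the last 10 days
  c1 CURSOR FOR
    SELECT DISTINCT METADATA$FILENAME AS fname
    FROM @S3_STAGE
    WHERE METADATA$FILE_LAST_MODIFIED > DATEADD(day, -10, CURRENT_DATE());
BEGIN
  FOR rec IN c1 DO
    file_list := file_list || '''' || rec.fname || ''',';
  END FOR;
  IF (file_list <> '') THEN
    -- Drop the trailing comma and run the COPY with an explicit FILES list
    file_list := RTRIM(file_list, ',');
    EXECUTE IMMEDIATE 'COPY INTO test_table FROM @S3_STAGE FILES=(' || file_list || ')';
  END IF;
  RETURN file_list;
END;
$$;

Note also that COPY INTO keeps load metadata (64 days by default) and skips files it has already loaded unless FORCE = TRUE, which makes it more resistant to duplicates than the plain INSERT ... SELECT.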

aek