0

I am reading from S3 folder in snowflake via stage. The bucket in s3 have multiple folder (or object if we want to call it). The folder is on date basis in the bucket

date=2020-06-01
date=2020-06-02
date=2020-06-03
date=2020-06-04
date=2020-06-05

I am using the below query to read all the folder at once. which is just working fine.

select raw.$1:name name,
       raw.$1:id ID
from
@My_Bucket/student_date/
(FILE_FORMAT => PARQUET, 
 PATTERN =>'.*date=.*\gz.parquet') raw
;   

Now i want to select the folder name as well in my query, Is there a way to do it.

like the output to contain

name | id | date..

pleas suggest

danD
  • 666
  • 1
  • 7
  • 29

1 Answers1

4

Snowflake has a built-in metadata field that provides the full filename, including the path. You should be able to run the following query:

select raw.$1:name name,
       raw.$1:id ID,
       METADATA$FILENAME 
from
@My_Bucket/student_date/
(FILE_FORMAT => PARQUET, 
 PATTERN =>'.*date=.*\gz.parquet') raw
; 

I know you are after the date portion only, but once you have the filename, you can use the SPLIT_PART function to get the date part from the filename. e.g.

SPLIT_PART(METADATA$FILENAME, '/', 4)

Hope this helps.

Marius
  • 208
  • 5
  • 14
  • who is he, who is he?.... he is the angel sent by god himself.... thanks a lot my friend – danD Jun 19 '20 at 16:58