0

I have keys in the following format:

s3://bucket/source/2019/01/01/xyz_20190101.csv s3://bucket/source/2019/01/01/mno_20190101.csv s3://bucket/source/2019/01/02/xyz_20190102.csv s3://bucket/source/2019/01/02/mno_20190102.csv

But when i add the partition using the following command both xyz_20190101.csv and mno_20190101.csv gets loaded into the external table but I only want xyz to be loaded:

alter table database_x.schema_y.table_z
add partition(year='2019', month='01', day='01')
location 's3://bucket/source/2019/01/01/';

Is there any way to let redshift spectrum to only load the desired key with prefix and not all the keys?

Update:

My create query is:

create external table database_x.schema_y.table_z(
col1 BIGINT,
col2 BIGINT,
col3 VARCHAR(80),
col4 DATE
)
partitioned by (year char(4), month char(2), day char(2))
row format delimited
fields terminated by '|'
location 's3://bucket/source/'
table properties ('skip.header.line.count'='1', 'has_encrypted_data'='false');

And then I run the aforementioned alter table to add the partition.

Gagan
  • 1,775
  • 5
  • 31
  • 59
  • You you tried using `location 's3://bucket/source/2019/01/01/xyz_20190101.csv`? It is just a prefix, which would probably allow the full name. Or even `location 's3://bucket/source/2019/01/01/xyz`. – John Rotenstein Jul 17 '19 at 07:55
  • @JohnRotenstein When i use full filename It gives an error saying: Parsed manifest is not a valid JSON object. And when I used 's3://bucket/source/2019/01/01/xyz' it just doesn't load the data into the table. I am posting my create table query in the question to you to have a look. – Gagan Jul 18 '19 at 00:02

0 Answers0