Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.
Questions tagged [amazon-redshift-spectrum]
291 questions
5
votes
1 answer
Unload multiple files from Redshift to S3
Hi, I am trying to unload multiple tables from Redshift to a particular S3 bucket, but I am getting the error below:
psycopg2.InternalError: Specified unload destination on S3 is not empty. Consider using a different bucket / prefix, manually removing the target…

Chandana Puppy
- 133
- 1
- 9
5
votes
2 answers
Skipping header rows in AWS Redshift External Tables
I have a file in S3 with the following data:
name,age,gender
jill,30,f
jack,32,m
And a redshift external table to query that data using spectrum:
create external table spectrum.customers (
"name" varchar(50),
"age" int,
"gender" varchar(1))
row…

fez
- 1,726
- 3
- 21
- 31
4
votes
2 answers
Getting Spectrum Scan Error code 15007 on select query on redshift external table
I have created an external table in Redshift Spectrum. Upon running select * from table_name, I am getting the following error:
SQL Error [XX000]: ERROR: Spectrum Scan Error
Detail:
-----------------------------------------------
error: …

nat
- 557
- 2
- 11
- 25
4
votes
1 answer
Grant only access to View in Redshift Spectrum
I created a simple view over an external table on Redshift Spectrum:
CREATE VIEW test_view AS (
SELECT *
FROM my_external_schema.my_table
WHERE my_field='x'
) WITH NO SCHEMA BINDING;
Reading the documentation, I see that it is not possible to give…

Hyruma92
- 836
- 8
- 24
4
votes
2 answers
[XX000][500310] [Amazon](500310) Invalid operation: Parsed manifest is not a valid JSON object
I'm running a crawler over a folder containing several files with different schemas, so I expect to find a table for each file.
What happens is that in the Glue Catalogue I can actually see a table for each file, with its own schema. But when I try…

Vzzarr
- 4,600
- 2
- 43
- 80
4
votes
1 answer
Error trying to access Amazon Redshift external table
I have Avro files in S3 which I want to be able to query via Redshift. I have used external tables with success in the past, but only in Parquet/JSON format, so I am wondering whether I'm missing something with the data being in Avro format.
I set up…

nalwadi
- 41
- 1
- 2
4
votes
1 answer
Redshift spectrum : how to import only certain files
When using Redshift Spectrum, it seems you can only import data by providing a location down to a folder, and it imports all the files inside the folder.
Is there a way to import only one file from inside a folder with many files? When providing…

Kushal Singh
- 65
- 5
4
votes
0 answers
AWS Glue skipping folder
I have a process that stores data to S3, transforms the data and converts the data to Parquet, to be queried through Redshift Spectrum. I have a Glue crawler that crawls my dataset, and I use three partitions: year, month, day. All my files are…

Jørgen Frøland
- 364
- 3
- 13
4
votes
2 answers
How to generate 12 digit unique number in redshift?
I have 3 columns in a table i.e. email_id, rid, final_id.
Rules for rid and final_id:
If the email_id has a corresponding rid, use rid as the final_id.
If the email_id does not have a corresponding rid (i.e. rid is null), generate a unique 12 digit…
user8147906
4
votes
2 answers
Remove double quotes " while loading data to Amazon Redshift Spectrum
I want to load data into an Amazon Redshift external table. The data is in CSV format and has quotes.
Do we have something like REMOVEQUOTES, which we have in the COPY command, for Redshift external tables? Also, what are the different options to load fixed length…

SauravT
- 43
- 1
- 1
- 3
4
votes
1 answer
AWS Redshift Spectrum - how to get the s3 filenames in the external table
I have external tables created in AWS Spectrum to query the S3 data; however, I am not able to identify the filename each record belongs to (I have thousands of files under a bucket).
In AWS Athena we have a pseudo column "$PATH" which will…

Rajeev
- 1,031
- 2
- 13
- 25
3
votes
1 answer
AWS Redshift Spectrum when accessing files in S3 Glacier deep archive
We have set up AWS Redshift external table accessing S3 using Spectrum. Due to the huge data amount, we decided to change S3 storage class for files older than 30 days to storage class S3 Glacier Deep Archive using Lifecycle policy.
I couldn't find…

Edgars T.
- 947
- 8
- 14
3
votes
1 answer
Redshift-Postgres RDS federated query: Authentication method 10 not supported
VPC is configured, secret is in Secrets Manager with correct policy attached to Redshift cluster.
Created external schema using
CREATE EXTERNAL SCHEMA schema_ext
FROM POSTGRES
DATABASE 'db' SCHEMA 'schema'
URI…

Yauheni Khvainitski
- 111
- 10
3
votes
2 answers
How to query an array field (AWS Glue)?
I have a table in AWS Glue, and the crawler has defined one field as array.
The content is in S3 files that have a JSON format.
The table is TableA, and the field is members.
There are a lot of other fields such as strings, booleans, doubles, and…

mrc
- 2,845
- 8
- 39
- 73
3
votes
2 answers
"Spectrum nested query error" Redshift error
When I run this query in Redshift:
select sd.device_id
from devices.s_devices sd
left join devices.c_devices cd
on sd.device_id = cd.device_id
I get an error like this:
ERROR: Spectrum nested query error
DETAIL:
…

del
- 6,341
- 10
- 42
- 45