Questions tagged [amazon-redshift-spectrum]

Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries use massive parallelism to run quickly against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.
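As a rough illustration of what "querying without loading" looks like in practice, the sketch below renders the two DDL statements that typically precede a Spectrum query: an external schema backed by the data catalog, and an external table pointing at Parquet files in S3. The helper functions, the schema and table names, the bucket path, and the role ARN are all hypothetical placeholders, not values from any question on this page.

```python
from typing import Dict


def external_schema_ddl(schema: str, catalog_db: str, iam_role_arn: str) -> str:
    """Render a CREATE EXTERNAL SCHEMA statement that uses the data catalog.

    Values are interpolated without escaping, so pass trusted identifiers only.
    """
    return "\n".join([
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}",
        "FROM DATA CATALOG",
        f"DATABASE '{catalog_db}'",
        f"IAM_ROLE '{iam_role_arn}'",
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;",
    ])


def external_table_ddl(schema: str, table: str, columns: Dict[str, str],
                       s3_location: str) -> str:
    """Render a CREATE EXTERNAL TABLE statement for Parquet files in S3."""
    cols = ",\n".join(f"  {name} {sql_type}" for name, sql_type in columns.items())
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} (\n{cols}\n)\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{s3_location}';"
    )


if __name__ == "__main__":
    # All names below are placeholders chosen for this example.
    print(external_schema_ddl(
        "spectrum_demo", "demo_db",
        "arn:aws:iam::123456789012:role/ExampleSpectrumRole"))
    print(external_table_ddl(
        "spectrum_demo", "events",
        {"id": "INT", "name": "VARCHAR(100)"},
        "s3://example-bucket/events/"))
```

Once both statements have been run on a cluster, a plain SELECT against spectrum_demo.events would read the files in place from S3. The role is assumed to already have read access to the bucket and the catalog.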

291 questions
2
votes
1 answer

cannot create a view in redshift spectrum external schema

I am facing an issue creating a view in an external schema on a Spectrum external table. Below is the script I am using to create the view: create or replace view external_schema.test_view as select id, name from external_schema.external_table…
2
votes
1 answer

How to copy AWS Glue table structure to AWS Redshift

I created a new database and table structure using AWS Glue without using a crawler, and I can also create the same table structure using a crawler. That's not the problem; what I want is to create the same table structure in AWS Redshift…
2
votes
1 answer

AWS Redshift cross account access

Different teams own different datasets. The goal I want to accomplish is to be able to query different sources owned by different teams (AWS accounts). Account A: Redshift. Account B: S3. Account C: my account. From my account I would like…
2
votes
1 answer

How to check Redshift COPY command performance from AWS S3?

I'm working on an application wherein I'll be loading data into Redshift. I want to upload the files to S3 and use the COPY command to load the data into multiple tables. For every such iteration, I need to load the data into around 20 tables. I'm…
Underoos
2
votes
2 answers

How does unloading an empty table from Redshift to S3 behave?

If an empty table is unloaded from Redshift to S3 using the UNLOAD command, does it create an empty file on S3 or does it not do anything? Earlier (a few days back) I unloaded using the UNLOAD command, and it placed a 0-byte file on S3. But today it is…
tricoder
2
votes
1 answer

Amazon EMR vs Amazon Redshift

For the majority of use cases, Spark transformations can be done on streaming data or bounded data (say, from Amazon S3) using Amazon EMR, and the transformed data can then be written back to S3. The transformations can also be achieved in…
2
votes
3 answers

AWS Redshift to S3 Parquet Files Using AWS Glue

We have a use case where we are processing the data in Redshift. But I want to create a backup of these tables in S3, so that I can query them using Spectrum. For moving the tables from Redshift to S3 I am using a Glue ETL. I have created a…
Vimarsh
2
votes
1 answer

Redshift Spectrum Performance vs Athena

I have a bucket in S3 with Parquet files partitioned by date, and the following query: select count(1) from logs.logs_prod where partition_1 = '2019' and partition_2 = '03' Running that query in Athena directly, it executes in less than…
Leandro Barreto
2
votes
1 answer

How to alter external table in Redshift Spectrum?

I want to add a partition of data to my external table, but I'm receiving the error: ALTER EXTERNAL TABLE cannot run inside a transaction block. I removed the BEGIN/END transaction, but the same error still persists. I read on some forums that adding…
Plarent Haxhidauti
2
votes
2 answers

Using Redshift Spectrum to read the data in external table in AWS Redshift

I did the below in an AWS Redshift cluster to read the Parquet file from S3. create external schema s3_external_schema from data catalog database 'dev' iam_role 'arn:aws:iam:::role/' create external database if not…
2
votes
2 answers

Drop all partitions from Redshift for an external table

I am trying to drop all the partitions on an external table in a Redshift cluster. I am unable to find an easy way to do it. I am currently doing this by running a dynamic query to select the dates from the table and concatenating them with the drop…
2
votes
4 answers

Redshift showing 0 rows for external table, though data is viewable in Athena

I created an external table in Redshift and then added some data to the specified S3 folder. I can view all the data perfectly in Athena, but I can't seem to query it from Redshift. What's weird is that select count(*) works, so that means it can…
2
votes
1 answer

Check "CONNECTION LIMIT" of user in Redshift

I have a user in Redshift with the username "redshift_x" and want to know the CONNECTION LIMIT that is currently set for this user. I have tried querying it using the below query: select * from pg_user where usename = 'redshift_x'; But this query…
2
votes
2 answers

S3 Query Exception (Fetch)

I have uploaded data from Redshift to S3 in Parquet format and created the data catalog in Glue. I have been able to query the table from Athena, but when I created the external schema on Redshift and tried to query the table I got the below…
DJo
2
votes
1 answer

Spectrum ERROR: Failed to incorporate external table

Redshift Spectrum is giving the below error while executing SELECT statements against the external table I created. ERROR: Failed to incorporate external table "schmaname"."tablename" into local catalog. The external table has a limited number of…
DJo