Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries use massive parallelism to run quickly against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.
Questions tagged [amazon-redshift-spectrum]
291 questions
0
votes
1 answer
AWS Spectrum giving blank result for parquet files generated by AWS Glue
We are building an ETL with AWS Glue. To optimise query performance, we are storing the data in Apache Parquet. Once the data is saved on S3 in Parquet format, we use AWS Spectrum to query it.
We successfully tested the entire…

jimy
- 4,848
- 3
- 35
- 52
0
votes
2 answers
How to convert a varchar data type field to a timestamp with time zone type field in redshift?
I have a table where the timestamp is stored as a varchar. I need to convert it to timestamp with time zone, but every time I get an "Invalid Operation" error.
The format of the field is:
2017-10-30 10:12:34:154 +1100
I tried the following:
'2017-10-30…

Isha Garg
- 331
- 1
- 3
- 12
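The sample value in the question above, 2017-10-30 10:12:34:154 +1100, has a colon (not a dot) before the milliseconds, which is one thing a format string has to account for. The following is a minimal client-side sketch in Python, not a Redshift SQL solution: it only shows a format string that matches the sample. The names SAMPLE, FORMAT, and parse_sample are illustrative and do not come from the question.

```python
from datetime import datetime, timezone

# Sample value copied from the question excerpt.
SAMPLE = "2017-10-30 10:12:34:154 +1100"

# Note the ':' before %f (fractional seconds) and the trailing %z (UTC offset).
FORMAT = "%Y-%m-%d %H:%M:%S:%f %z"


def parse_sample(text: str) -> datetime:
    """Parse a string in the question's format into a timezone-aware datetime."""
    return datetime.strptime(text, FORMAT)


parsed = parse_sample(SAMPLE)
as_utc = parsed.astimezone(timezone.utc)

print(parsed.isoformat())  # 2017-10-30T10:12:34.154000+11:00
print(as_utc.isoformat())  # 2017-10-29T23:12:34.154000+00:00
```

Normalising the strings this way before loading is only one possible approach; whether it suits a given pipeline depends on where the conversion needs to happen.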
0
votes
2 answers
How can I use Psycopg2 to add a partition in Redshift Spectrum?
We have a Redshift Spectrum table built on top of S3 data, and we are trying to automate adding partitions to this table. I can run the following ALTER statement in a Redshift client or psql shell:
ALTER TABLE analytics_spectrum.page_view ADD…

Hussain Bohra
- 985
- 9
- 15
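For the partition question above, one commonly cited cause is that Redshift does not allow ALTER TABLE on an external table inside a transaction block, while psycopg2 opens a transaction by default. The following is a minimal sketch that enables autocommit before running the statement. The question's own statement is truncated, so the partition column `dt`, its value, the S3 location, and the connection string used here are hypothetical placeholders, as are the helper names build_add_partition_sql and add_partition.

```python
import re

# Conservative check for unquoted SQL identifiers (illustrative, not exhaustive).
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_add_partition_sql(table, column, value, location):
    """Build an ALTER TABLE ... ADD PARTITION statement for an external table.

    All arguments are assumptions for illustration; the question's real
    partition column and location are not shown in the excerpt.
    """
    for part in table.split(".") + [column]:
        if not _IDENT.match(part):
            raise ValueError(f"unexpected identifier: {part!r}")
    if "'" in value or "'" in location:
        raise ValueError("single quotes are not allowed in value or location")
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({column}='{value}') LOCATION '{location}'"
    )


def add_partition(dsn, sql):
    """Run the statement with autocommit on, so it is not wrapped in a transaction."""
    import psycopg2  # imported here so the SQL builder works without the driver

    conn = psycopg2.connect(dsn)
    try:
        conn.autocommit = True
        with conn.cursor() as cur:
            cur.execute(sql)
    finally:
        conn.close()
```

A call might look like `add_partition(dsn, build_add_partition_sql("analytics_spectrum.page_view", "dt", "2017-10-30", "s3://example-bucket/page_view/dt=2017-10-30/"))`, where `dsn` and the bucket path are placeholders.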
0
votes
1 answer
Execute COPY command on Redshift database from a Linux server outside AWS cluster
I want to load data into a Redshift database from Amazon S3 using the 'COPY' command. But I want to execute it from a shell/perl script on a Linux machine outside the AWS cluster. I wanted to know if there is any Redshift client that can be…

indranil
- 73
- 6
-1
votes
1 answer
AWS Redshift Spectrum not working with apache parquet files
The following is my sample CSV file.
id,name,gender
1,isuru,male
2,perera,male
3,kasun,male
4,ann,female
I converted the above CSV file into Apache Parquet using the pandas library. The following is my code.
import pandas as pd
df =…

WAEX
- 115
- 1
- 9
-1
votes
1 answer
Are there ways to find amount of data queried per lambda statement for AWS redshift?
I am trying to find the amount of data queried per statement from AWS Lambda on Redshift, but all I can find is the amount of data queried per query ID. I am running multiple Lambdas, but I can't seem to relate the Lambdas to the query…

charlieMike
- 21
- 3