Questions tagged [amazon-redshift-spectrum]

Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.

291 questions
0
votes
2 answers

How to automate Redshift snapshot creation and resume cluster from snapshot at a particular time?

I want a solution where a CloudWatch rule triggers a Lambda function that takes a snapshot and shuts down the cluster at a given time, then resumes the cluster from the created snapshot at another time. This way a lot of money can be saved. As of…
0
votes
1 answer

DatetimeParseError in Redshift query using Node.js

I'd been trying to get this API to work. I'm successfully able to pass values for companyID, but the same method is not working for dateID. I have tried declaring it as both: var dateID = 20190610; var dateID = "20190610"; And in the query I have…
0
votes
1 answer

How to join 2 tables in order to receive all the data needed

I have 2 queries: /*+ ETLM { depend:{ replace:[ { name:"table_1" } ] } } */ SELECT case_id, x, x, x, x, x FROM table.1 WHERE resolved_date between TO_DATE ('2020/01/01', 'YYYY/MM/DD') and…
Catalin
  • 282
  • 1
  • 14
0
votes
1 answer

Spectrum table Manifest file when S3 file size is in decimal

I am reading an S3 file by creating a Spectrum external table and pointing it to a manifest file that contains information about the source S3 file. The problem arises when my S3 file size is a decimal value, e.g. 37.5 MB or 100.2 KB. As per the…
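
As a rough illustration of the manifest described in this question, here is a minimal Python sketch. It assumes the manifest records each file's size under `meta.content_length` as a whole number of bytes, so a size displayed as 37.5 MB is still an integer once expressed in bytes. The bucket name, key, and the `build_manifest` helper are hypothetical, and the byte count assumes 1 MB means 1,048,576 bytes.

```python
import json


def build_manifest(objects):
    """Build a manifest dict from (s3_url, size_in_bytes) pairs."""
    entries = []
    for url, size_in_bytes in objects:
        # Sizes are byte counts, so they must be non-negative integers.
        if isinstance(size_in_bytes, bool) or not isinstance(size_in_bytes, int):
            raise ValueError("size must be an integer number of bytes")
        if size_in_bytes < 0:
            raise ValueError("size must not be negative")
        entries.append(
            {
                "url": url,
                "mandatory": True,
                "meta": {"content_length": size_in_bytes},
            }
        )
    return {"entries": entries}


# Hypothetical object: 37.5 MB taken as 37.5 * 1024 * 1024 bytes.
manifest = build_manifest(
    [("s3://example-bucket/data/part-0000.csv", 39321600)]
)
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

In practice the byte count would come from the object's actual size in S3 rather than from converting a rounded display value.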
0
votes
2 answers

Is there a way to check multiple columns using "IN" condition in Redshift Spectrum?

I have a Redshift Spectrum table named customer_details_table where the column id is not unique. Another column, hierarchy, determines which record should be given priority when records share the same id. Here's an example: Here, if we…
0
votes
1 answer

How to update table's column value in Redshift based on a join?

How can I update this table with this value in Redshift: UPDATE t1 SET col1 = 'new_value_here' FROM t1 LEFT JOIN t2 on t1.col2 = t2.col2 WHERE t1.country IN ('USA', 'JAPAN') AND t1.col1 = 'old_value_here' AND t2.col2 IS NULL; I…
ZelelB
  • 1,836
  • 7
  • 45
  • 71
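
The statement in this question updates rows of t1 that have no match in t2. One common way to express that kind of anti-join without repeating the target table in a FROM clause is a correlated NOT EXISTS. The sketch below uses Python's built-in sqlite3 purely as a stand-in engine to show the logic; it is not Redshift, the sample rows are made up, and whether this form suits the asker's situation is an assumption, since the excerpt is cut off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Made-up sample data; column names follow the question.
conn.executescript(
    """
    CREATE TABLE t1 (col1 TEXT, col2 INTEGER, country TEXT);
    CREATE TABLE t2 (col2 INTEGER);
    INSERT INTO t1 VALUES
        ('old_value_here', 1, 'USA'),
        ('old_value_here', 2, 'JAPAN'),
        ('old_value_here', 3, 'FRANCE'),
        ('other', 4, 'USA');
    INSERT INTO t2 VALUES (1);
    """
)

# Update only the t1 rows that have no matching col2 in t2.
conn.execute(
    """
    UPDATE t1
    SET col1 = 'new_value_here'
    WHERE country IN ('USA', 'JAPAN')
      AND col1 = 'old_value_here'
      AND NOT EXISTS (SELECT 1 FROM t2 WHERE t2.col2 = t1.col2)
    """
)

rows = conn.execute("SELECT col2, col1 FROM t1 ORDER BY col2").fetchall()
print(rows)
```

Only the row with col2 = 2 changes: it is in an allowed country, holds the old value, and has no counterpart in t2.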
0
votes
1 answer

Minimum permission required to access Redshift External table

As per the AWS documentation, to run a Redshift Spectrum query you need the following permissions: usage permission on the schema, and permission to create temporary tables in the current database. I have an External database, schema and a table…
SwapSays
  • 407
  • 7
  • 18
0
votes
1 answer

Is there a threshold or use case that would push an implementation from AWS Athena to Redshift Spectrum?

I've seen lots of blogs and posts comparing AWS Athena and Redshift Spectrum. The unanimous consensus seems to be that if you don't already have a Redshift implementation, just go with Athena. Are there any scenarios or thresholds where Redshift…
Josh Russo
  • 3,080
  • 2
  • 41
  • 62
0
votes
1 answer

Redshift: Convert text to timestamp

I have a score column with a JSON value - {"Choices":null, "timestamp":"1579650266955", "scaledScore":null} I am using the SQL below to retrieve the timestamp value -- select json_extract_path_text(score, 'timestamp') from schema.table limit 10; Now…
Anand
  • 145
  • 1
  • 3
  • 10
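
To make the arithmetic behind this conversion concrete, here is a small Python sketch. It assumes the 13-digit text value is a count of milliseconds since the Unix epoch, which the excerpt does not state outright. The helper name `epoch_millis_to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(text):
    """Convert a string of epoch milliseconds to a UTC datetime."""
    millis = int(text)
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, remainder = divmod(millis, 1000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=remainder * 1000)


ts = epoch_millis_to_utc("1579650266955")
print(ts.isoformat())  # 2020-01-21T23:44:26.955000+00:00
```

The same idea carries over to SQL: cast the extracted text to an integer, divide by 1000, and add the result as seconds to the epoch.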
0
votes
1 answer

AWS Quicksight, Redshift "A subquery that refers to a nested table cannot contain WINDOW operation"

The error message is: sourceErrorCode: 500310 sourceErrorMessage: [Amazon](500310) Invalid operation: Spectrum nested query error Details: ----------------------------------------------- error: Spectrum nested query error code: 8001…
0
votes
2 answers

How do you connect to an external schema/table on Redshift Spectrum through AWS Quicksight?

I have spun up a Redshift cluster and added my S3 external schema by running CREATE EXTERNAL SCHEMA s3 FROM DATA CATALOG DATABASE '' IAM_ROLE ''; to access the AWS Glue Data Catalog. Everything is fine on…
0
votes
0 answers

Getting Invalid digit, Value '_', Pos 2, Type: Integer while running select query

I have an external Spectrum table in Redshift that contains almost 600 columns. When I try to run a SELECT query on the Spectrum table, it gives me the error "Invalid digit, Value '_', Pos 2, Type: Integer". I don't know how to solve it. If…
Dusky Dood
  • 197
  • 3
  • 13
0
votes
1 answer

What is the data format for a file to be read by Redshift Spectrum?

I've been reading up on Redshift Spectrum and there are a few things I just don't understand. I understand that Redshift Spectrum will read data from files stored in S3, but what is the actual file I need to store in S3? Is it some SQL…
0
votes
0 answers

AWS Spectrum vs Athena proper JSON format for multi row data

Hey, I am trying to ingest/query some JSON data using AWS Spectrum. I have created a JSON file whose format looks like this (every row on a single line): {"name": "name1", "attr":"someval"}, {"name": "name2", "attr":"someval2"} It is not a valid JSON format,…
flowoo
  • 367
  • 3
  • 13
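
For the multi-row layout discussed in this question, a minimal Python sketch of writing records as newline-delimited JSON is shown below. It assumes the reader expects one complete JSON object per line, with no commas between rows and no enclosing array, which is how line-oriented JSON readers generally treat input. Whether that matches the asker's table definition is not confirmed by the excerpt, and the `to_json_lines` helper is hypothetical.

```python
import json

# Sample records taken from the question's example rows.
records = [
    {"name": "name1", "attr": "someval"},
    {"name": "name2", "attr": "someval2"},
]


def to_json_lines(rows):
    """Serialize each row as one JSON object per line, with no separators."""
    return "\n".join(json.dumps(row) for row in rows) + "\n"


payload = to_json_lines(records)
print(payload, end="")
```

Each line parses on its own as valid JSON, even though the file as a whole is not a single JSON document.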
0
votes
1 answer

Exclude column in redshift spectrum sql queries

My table has columns col1, col2, ..., coln. I want to select all columns except col1. Instead of writing select col2, col3, ..., coln from, can I specify select * from except col1? Select all columns excluding one column