Questions tagged [amazon-redshift-spectrum]

Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.

291 questions

votes

0 answers

Unloading & reloading data between S3 and Redshift with schema changes

I'm interested in setting up some automated jobs that will periodically export data from our Redshift instance and store it on S3, where ideally it will then be bubbled back up into Redshift via an external table running in Redshift Spectrum. One…

asked Nov 07 '18 at 18:56

Cody Kestigian

votes

1 answer

Redshift Spectrum Query - Request ran out of memory in the S3 query layer

I am trying to execute a query with grouping on 26 columns. The data is stored in S3 in Parquet format, partitioned by day. The Redshift Spectrum query returns the error below. I am not able to find any relevant documentation from AWS regarding this. Request…

sql amazon-web-services amazon-redshift amazon-redshift-spectrum

asked Oct 18 '18 at 06:01

conetfun

1,605
4
17
38

votes

1 answer

How to identify a person or id if it contains more than one row for a different column in SQL

I have a table in which a person contains same values multiple times in another column. For example: person product portal count indicator ----------------------------------------------- 1 10 5 2 y …

sql-server oracle amazon-redshift-spectrum

asked Aug 05 '18 at 09:37

Shivam Tyagi

votes

1 answer

AWS Glue: How to ETL non-scalar JSON with varying schemas

Objective I have an S3 folder full of json files with varying schemas, including arrays (a dynamodb backup, as it happens). However, while the schemas vary, all files contain some common elements, such as 'id' or 'name', as well as nested arrays of…

amazon-web-services amazon-s3 amazon-dynamodb aws-glue amazon-redshift-spectrum

asked Jun 26 '18 at 07:12

spinnn

votes

1 answer

Spectrum Same External Table Shows in Multiple Schemas (svv_external_tables)

It's a really simple test actually. I create a couple external schemas and create an external table in one of the schemas and then querying svv_external_tables shows the table exists in ALL schemas!! What am I missing? create external schema…

external-tables amazon-redshift-spectrum

asked Jun 14 '18 at 16:05

Robin Tanner

votes

1 answer

how to view data catalog table in S3 using redshift spectrum

I created an external schema for my database in AWS Glue. I can see the list of tables, but I cannot look into the JSON data. Redshift throws this error: [Amazon](500310) Invalid operation: S3 Query Exception (Fetch) Details: …

amazon-redshift aws-glue amazon-redshift-spectrum

asked Jun 05 '18 at 05:10

beni

votes

0 answers

Column names containing dots in Spectrum

I created a customers table with columns such as account_id.cust_id, account_id.ord_id and so on. My create external table query was as follows: CREATE EXTERNAL TABLE spectrum.customers ( "account_id.cust_id" numeric, "account_id.ord_id" numeric ) row…

amazon-web-services amazon-redshift amazon-redshift-spectrum

asked Feb 23 '18 at 15:31

Prajakta Yerpude

votes

2 answers

Presto equivalent for Redshift's PERCENTILE_DISC

Given a query below in Redshift: select distinct cast(joinstart_ev_timestamp as date) as session_date, PERCENTILE_DISC(0.02) WITHIN GROUP (ORDER BY join_time) over(partition by trunc(joinstart_ev_timestamp))/1000 as mini, median(join_time)…

mysql amazon-redshift presto amazon-redshift-spectrum

asked Feb 13 '18 at 06:54

Bhuvi007

votes

1 answer

Can I convert CSV files sitting on Amazon S3 to Parquet format using Athena and without using Amazon EMR

I would like to convert the csv data files that are right now sitting on Amazon S3 into Parquet format using Amazon Athena and push them back to Amazon S3 without taking any help from Amazon EMR. Is this possible to do it? Has anyone experienced…

amazon-web-services amazon-s3 amazon-redshift amazon-emr amazon-redshift-spectrum

asked Feb 08 '18 at 21:16

Teja

13,214
36
93
155

votes

1 answer

How to create an external table for nested Parquet type in redshift spectrum

I know Redshift and Redshift Spectrum don't support nested types, but I want to know whether there is any trick to bypass that limitation and query our nested data in S3 with Redshift Spectrum. In this post the guy shows how we can do it for JSON…

amazon-web-services parquet amazon-redshift-spectrum

asked Feb 06 '18 at 15:02

Am1rr3zA

7,115
18
83
125

votes

1 answer

How to load CDC into Redshift database?

Can anyone tell me about CDC/incremental load methods in Redshift using SQL? I know one method, upsert, but are there other methods, such as insert followed by delete?

amazon-redshift amazon-redshift-spectrum

asked Jan 19 '18 at 10:08

mounika nerella

votes

1 answer

Cannot connect to aws redshift

I created a Redshift cluster in the AWS console. Then I went to the cluster I created and, based on the information I got in the console, I used it in SQL Workbench/J. To set up SQL Workbench/J I used the…

amazon-web-services amazon-redshift sql-workbench-j amazon-redshift-spectrum

asked Jan 18 '18 at 19:20

Hamed Minaee

2,480
4
35
63

votes

2 answers

Spectrum in us-west-1 and Glue in us-west-2 is it possible?

I am using a Redshift cluster in us-west-1 (NCAL). The S3 file location is in us-west-1 (NCAL). The Glue data catalog is in us-west-2 (Oregon). When I try to query the table with select count(*) from spectrum_schema.table_name; I get the error below. [Code:…

amazon-s3 amazon-redshift aws-glue amazon-redshift-spectrum

asked Jan 10 '18 at 23:00

Kunal

votes

0 answers

How to specify row delimiter for Redshift Spectrum

I'm trying to mount csv files that have a CRLF as a row terminator, into Redshift Spectrum. However, it seems like I can only specify a single character as a row terminator. Does anyone know how to get around this?

amazon-redshift delimiter amazon-redshift-spectrum

asked Dec 18 '17 at 20:27

user1316437

votes

0 answers

data distribution in redshift for star schema model?

I have a big fact table with 2 billion rows and 19 dimensions (the product dimension is big, at 450 million rows; another two dimensions are 100 million rows each; the rest are small dimension tables). Can someone help me with data distribution for this scenario?

amazon-web-services amazon-s3 amazon-redshift amazon-redshift-spectrum

asked Dec 18 '17 at 09:55

diptiranjan pradhan
