Questions tagged [amazon-redshift]

Amazon Redshift is a petabyte-scale data warehousing service that works with existing business intelligence tools for analyzing data. Redshift is a column-oriented MPP database based on ParAccel, which was itself based on PostgreSQL.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more. Redshift is a column-oriented database based on PostgreSQL 8.0.2.

Source: Amazon Redshift

Although Redshift is to some extent based on PostgreSQL, the two are substantially different. Do not add the postgresql tag to questions involving Amazon Redshift.

8534 questions
2
votes
1 answer

Are there any benefits to storing data in DynamoDB vs S3 for use with Redshift?

My particular scenario: Expecting to amass TBs or even PBs of JSON data entries which track price history for many items. New data will be written to the data store hundreds or even thousands of times per day. This data will be analyzed by…
2
votes
3 answers

Copy data from DynamoDB into Redshift across two different AWS accounts?

For reasons beyond my control, I have the following: A table CustomerPhoneNumber in DynamoDB under one AWS account. A Redshift cluster under a different AWS account (same geographic region; EU) Is there any way to run the COPY command to move data…
Ray
  • 3,137
  • 8
  • 32
  • 59
2
votes
1 answer

"Error: invalid input syntax for integer:" when inserting NULL values for a SMALLINT column in a Redshift table?

I have this locally defined python function that works fine when inserting data into a redshift table: def _insert_data(table_name, values_list): insert_base_sql = f"INSERT INTO {table_name} VALUES" insert_sql = insert_base_sql +…
Scott Borden
  • 177
  • 1
  • 2
  • 14
2
votes
1 answer

How to make projections with future dates using Redshift

I currently have a table called quantities with the following data: +------+----------+----------+ | item | end_date | quantity | +------+----------+----------+ | 1 | 26/11/17 | 100 | +------+----------+----------+ | 2 | 28/11/17 | 300 …
Henry
  • 697
  • 1
  • 6
  • 16
2
votes
0 answers

Redshift - Load data which has newline in field

I am trying to load the data that includes a new line within a field: 001|myname|fav\ movie | myaddress| myphone| There is a blank line between fav\movie. I am loading the data with this command: COPY catdemo FROM 's3://tickit/catego.csv' IAM_ROLE…
Nirmal Prabhu
  • 23
  • 1
  • 5
2
votes
2 answers

Group contiguous blocks for aggregation in SQL (Redshift)

I've got a table like this: id time activity 1: 1 1 a 2: 1 2 a 3: 1 3 b 4: 1 4 b 5: 1 5 a 6: 2 1 a 7: 2 2 b 8: 2 3 b 9: 2 4 b 10: 2 5…
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
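One common way to approach a "contiguous blocks" question like the one above is the difference-of-row-numbers technique (often called gaps and islands). The sketch below is only an illustration, not the asker's or an answerer's solution: it runs the query against an in-memory SQLite database (which needs SQLite 3.25+ for window functions) rather than Redshift, the table name `events` is hypothetical, and only the nine rows fully visible in the excerpt are loaded. The SQL itself uses only `ROW_NUMBER()` and `GROUP BY`, so it should carry over to Redshift with little or no change.

```python
import sqlite3

# In-memory SQLite stands in for Redshift here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, time INTEGER, activity TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, 1, "a"), (1, 2, "a"), (1, 3, "b"), (1, 4, "b"), (1, 5, "a"),
        (2, 1, "a"), (2, 2, "b"), (2, 3, "b"), (2, 4, "b"),
    ],
)

# Rows in the same contiguous run share the same difference between
# "row number within id" and "row number within (id, activity)".
blocks_sql = """
WITH numbered AS (
    SELECT id, time, activity,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY time)
         - ROW_NUMBER() OVER (PARTITION BY id, activity ORDER BY time) AS block_id
    FROM events
)
SELECT id, activity,
       MIN(time) AS start_time,
       MAX(time) AS end_time,
       COUNT(*)  AS n_rows
FROM numbered
GROUP BY id, activity, block_id
ORDER BY id, start_time
"""

rows = conn.execute(blocks_sql).fetchall()
for row in rows:
    print(row)
```

Each output row is one contiguous block of the same activity for an id; any aggregate can replace `COUNT(*)` in the final `SELECT`.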
2
votes
0 answers

Redshift DELETE using slow Hash Join while equivalent SELECT uses Merge Join

We are using the recommended method defined here for performing "upserts": http://docs.aws.amazon.com/redshift/latest/dg/merge-replacing-existing-rows.html It is taking almost two minutes to load a file of just 150 rows. Almost all of this time is…
Jared Gommels
  • 421
  • 6
  • 13
2
votes
0 answers

Failed to run connection test on Redshift endpoint in AWS DMS service

I have created a Redshift endpoint in AWS DMS service. When I run the test connection I get the following error message: Error Details: [errType=CALL_SERVER_ERROR, status=0, errMessage=Failed executing command on Replication Server,…
Cyrus
  • 912
  • 2
  • 11
  • 21
2
votes
2 answers

using regular expressions in redshift

This query works in mysql but I am not sure how to write the same query in redshift / postgresql. update customer_Details set customer_No = NULL WHERE customer_No NOT REGEXP '^[[:digit:]]{12}$'
shantanuo
  • 31,689
  • 78
  • 245
  • 403
2
votes
2 answers

Filter data based on a condition in Redshift

I came across one more issue while resolving the previous problem: So, I have this data: For each route -> I want to get only those rows where ob exists in rb. Hence, this output: I know this also needs to be worked through a temp table. Earlier I…
2
votes
2 answers

Extracting Time from Timestamp in SQL

I am using Redshift and am looking to extract the time from the timestamp. Here is the timestamp: 2017-10-31 23:30:00 and I would just like to get the time as 23:30:00. Is that possible?
user8659376
  • 369
  • 4
  • 8
  • 19
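For the timestamp question above, one option that should work in Redshift is formatting the value with `TO_CHAR` and an `HH24:MI:SS` pattern. The sketch below keeps that statement as a string (it is not executed here, since no Redshift connection is assumed) and mirrors the same formatting in plain Python so the expected result can be checked locally. The variable names are hypothetical.

```python
from datetime import datetime

# Redshift-side expression, kept as text only; running it requires a cluster.
REDSHIFT_SQL = "SELECT TO_CHAR(TIMESTAMP '2017-10-31 23:30:00', 'HH24:MI:SS');"

# Local mirror of the same formatting, using the timestamp from the question.
ts = datetime.strptime("2017-10-31 23:30:00", "%Y-%m-%d %H:%M:%S")
time_part = ts.strftime("%H:%M:%S")

print(REDSHIFT_SQL)
print(time_part)  # 23:30:00
```

Note that `TO_CHAR` returns a string; whether a string or a native time value is preferable depends on what the result is used for.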
2
votes
1 answer

Getting 0 rows while querying external table in redshift

We created the schema as follows: create external schema spectrum from data catalog database 'test' iam_role 'arn:aws:iam::20XXXXXXXXXXX:role/athenaaccess' create external database if not exists; and table as follows: create external table…
2
votes
3 answers

Signature of the Redshift internal "identity" function

While working on a legacy Redshift database I discovered an unfamiliar pattern for default identity values for an autoincrement column. E.g.: create table sometable (row_id bigint default "identity"(24078855, 0, '1,1'::text), ... And surprisingly I…
Boris Uvarov
  • 58
  • 1
  • 9
2
votes
2 answers

Redshift large 'in' clause best practices

We have a query in which a list of parameter values is provided in "IN" clause of the query. Some time back this query failed to execute as the size of data in "IN" clause got quite large and hence the resulting query exceeded the 16 MB limit of the…
2
votes
1 answer

Amazon Redshift Sum window function on group

Is there a way to use the sum window function to get the following results in green? I can get the total by using the following, but it's giving a running total; I am looking for a group total: sum(groupvolume) OVER ( PARTITION BY geo_group ORDER…
warrior_z
  • 23
  • 1
  • 5
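The running-total versus group-total distinction in the question above usually comes down to the `ORDER BY` inside `OVER (...)`: with an ordering (and a frame), `SUM` accumulates row by row, while with only `PARTITION BY` it returns the same total for every row of the group. The sketch below demonstrates this on an in-memory SQLite database (3.25+ for window functions) as a stand-in for Redshift. The column names `geo_group` and `groupvolume` come from the excerpt, while the table name `volumes` and the sample values are made up for illustration.

```python
import sqlite3

# SQLite stands in for Redshift; table name and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE volumes (geo_group TEXT, groupvolume INTEGER)")
conn.executemany(
    "INSERT INTO volumes VALUES (?, ?)",
    [("east", 10), ("east", 20), ("west", 5), ("west", 7)],
)

# No ORDER BY inside OVER(): every row gets the total of its partition.
group_total_sql = """
SELECT geo_group,
       groupvolume,
       SUM(groupvolume) OVER (PARTITION BY geo_group) AS group_total
FROM volumes
ORDER BY geo_group, groupvolume
"""

rows = conn.execute(group_total_sql).fetchall()
for row in rows:
    print(row)
```

Here both `east` rows carry a `group_total` of 30 and both `west` rows carry 12, instead of an accumulating sum.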