Questions tagged [amazon-redshift]

Amazon Redshift is a petabyte-scale data warehousing service that works with existing business intelligence tools to analyze data. Redshift is a column-oriented MPP database based on ParAccel, which was itself based on PostgreSQL.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more. Redshift is a column-oriented database based on PostgreSQL 8.0.2.

Source: Amazon Redshift

Although Redshift is based to some extent on PostgreSQL, the two are substantially different. Do not add the postgresql tag to questions involving Amazon Redshift.

8534 questions
39 votes, 9 answers

How to add a sort key to an existing table in AWS Redshift

In AWS Redshift, I want to add a sort key to a table that is already created. Is there any command which can add a column and use it as sort key?
jpdave • 408 • 1 • 4 • 7

38 votes, 7 answers

How to write data to Redshift that is a result of a dataframe created in Python?

I have a dataframe in Python. Can I write this data to Redshift as a new table? I have successfully created a DB connection to Redshift and am able to execute simple SQL queries. Now I need to write a dataframe to it.
Sahil • 413 • 2 • 5 • 8

38 votes, 4 answers

JOIN (SELECT ... ) ue ON 1=1?

I am reading an SQL query in Redshift and can't understand the last part: ... LEFT JOIN (SELECT MIN(modified) AS first_modified FROM user) ue ON 1=1 What does ON 1=1 mean here?
kee • 10,969 • 24 • 107 • 168
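For readers puzzling over the same construct, here is a minimal sketch of what a join condition that is always true does. It runs against an in-memory SQLite database purely as a local stand-in, not against Redshift, and the `events` and `users` tables and their rows are hypothetical. Because the subquery returns exactly one row and `ON 1=1` matches everything, that single value is attached to every row on the left side.

```python
import sqlite3


def first_modified_per_row():
    """Attach one aggregate value to every row via LEFT JOIN ... ON 1=1."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data, invented for illustration only.
    conn.executescript("""
        CREATE TABLE events (id INTEGER);
        CREATE TABLE users (modified TEXT);
        INSERT INTO events VALUES (1), (2), (3);
        INSERT INTO users VALUES ('2020-03-01'), ('2020-01-15'), ('2020-02-10');
    """)
    rows = conn.execute("""
        SELECT e.id, ue.first_modified
        FROM events e
        LEFT JOIN (SELECT MIN(modified) AS first_modified FROM users) ue
          ON 1=1  -- always true, so the single subquery row pairs with every event
        ORDER BY e.id
    """).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in first_modified_per_row():
        print(row)
```

Each of the three `events` rows comes back paired with the same minimum `modified` value, which is the usual reason for writing a join this way.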

36 votes, 1 answer

Difference between S3 and Redshift (AWS)

I am studying Amazon Web Services for the first time. I want to know the difference or relationship between Amazon S3 and Amazon Redshift.
shivesh verma • 381 • 1 • 3 • 8

33 votes, 2 answers

How to list all tables and their creators (or owners) in Redshift

I thought it would be straightforward, but I couldn't find a way to list all tables and their creators (or owners) in Redshift. Any help/insight is welcome.
kee • 10,969 • 24 • 107 • 168

32 votes, 3 answers

How to create an Index in Amazon Redshift

I'm trying to create indexes in Amazon Redshift but I received an error: create index on session_log(UserId); UserId is an integer field.
user3600910 • 2,839 • 4 • 22 • 36

31 votes, 3 answers

Redshift: How to list all users in a group

Getting the list of users belonging to a group in Redshift seems to be a fairly common task but I don't know how to interpret BLOB in grolist field. I am literally getting "BLOB" in grolist field from TeamSQL. Not so sure this is specific to TeamSQL…
kee • 10,969 • 24 • 107 • 168

31 votes, 5 answers

Athena vs Redshift Spectrum

I am evaluating Athena and Redshift Spectrum. Both serve the same purpose; Spectrum needs a Redshift cluster in place, whereas Athena is purely serverless. Athena uses Presto and Spectrum uses Redshift's engine. Are there any specific…

31 votes, 1 answer

Load CSV into Redshift, with header?

Is there an option to load a CSV into Redshift with a header? I see the documentation for CSV but it says nothing about a header. Ideally it could use the header to determine the columns to load.
Some Guy • 12,768 • 22 • 58 • 86

31 votes, 2 answers

How to GROUP BY and CONCATENATE fields in Redshift

How can I GROUP BY and concatenate fields in Redshift? For example, if I have this table:

ID  COMPANY_ID  EMPLOYEE
1   1           Anna
2   1           Bill
3   2           Carol
4   2           Dave

how can I get a result like this?

COMPANY_ID  EMPLOYEE
1 …
spats • 805 • 1 • 10 • 12
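As a rough illustration of the grouping this question describes, the sketch below collapses the sample rows into one row per company. Several things here are assumptions: the desired output is truncated in the excerpt, so one concatenated list per `COMPANY_ID` is a guess at the intent; the table name `staff` is hypothetical; and the code runs on an in-memory SQLite database with its `group_concat` aggregate as a local stand-in. On Redshift itself the documented aggregate for this job is `LISTAGG`, with `WITHIN GROUP (ORDER BY ...)` to control the order.

```python
import sqlite3


def employees_by_company():
    """Concatenate employee names per company (SQLite stand-in for LISTAGG)."""
    conn = sqlite3.connect(":memory:")
    # Sample rows taken from the question; the table name is hypothetical.
    conn.executescript("""
        CREATE TABLE staff (id INTEGER, company_id INTEGER, employee TEXT);
        INSERT INTO staff VALUES
            (1, 1, 'Anna'), (2, 1, 'Bill'), (3, 2, 'Carol'), (4, 2, 'Dave');
    """)
    rows = conn.execute("""
        SELECT company_id, group_concat(employee, ', ') AS employees
        FROM staff
        GROUP BY company_id
        ORDER BY company_id
    """).fetchall()
    conn.close()
    # group_concat does not promise an element order, so sort the names
    # in Python before returning them.
    return {company_id: sorted(names.split(", ")) for company_id, names in rows}


if __name__ == "__main__":
    for company_id, names in employees_by_company().items():
        print(company_id, ", ".join(names))
```

The names are sorted in Python only because SQLite's `group_concat` leaves the order unspecified; with `LISTAGG` the ordering would be stated in the query instead.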

30 votes, 8 answers

Deleting duplicate rows from Redshift

I am trying to delete some duplicate data in my Redshift table. Below is my query: With duplicates As (Select *, ROW_NUMBER() Over (PARTITION by record_indicator Order by record_indicator) as Duplicate From table_name) delete from duplicates Where…
Neil • 1,715 • 6 • 30 • 45

30 votes, 1 answer

Why is "||" used as string concatenation in PostgreSQL/Redshift?

I find this really weird. If we look at the major programming languages, they all use "||" as the logical "or" operator. Is there any (maybe historical) reason why "||" lives in PostgreSQL alongside the CONCAT() function?
Arius • 1,387 • 1 • 11 • 24

29 votes, 9 answers

Unloading from Redshift to S3 with headers

I already know how to unload a file from Redshift into S3 as one file. I need to know how to unload with the column headers. Can anyone please help or give me a clue? I don't want to have to do it manually in shell or Python.
Tokunbo Hiamang • 429 • 2 • 6 • 11

29 votes, 3 answers

What does it mean to have multiple sortkey columns?

Redshift allows designating multiple columns as SORTKEY columns, but most of the best-practices documentation is written as if there were only a single SORTKEY. If I create a table with SORTKEY (COL1, COL2), does that mean that all columns are…
Lorrin • 1,799 • 1 • 16 • 21

27 votes, 9 answers

Using psycopg2 with Lambda to Update Redshift (Python)

I am attempting to update Redshift from a Lambda function using Python. To do this, I am attempting to combine two code fragments. Both fragments are functional when I run them separately. Updating Redshift from PyDev for Eclipse import…