Questions tagged [amazon-redshift]

Amazon Redshift is a petabyte-scale data warehousing service that lets you analyze data with existing business intelligence tools. Redshift is a column-oriented MPP database based on ParAccel, which was itself based on PostgreSQL.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more. Redshift is a column-oriented database based on PostgreSQL 8.0.2.

Source: Amazon Redshift

Although Redshift is to some extent based on PostgreSQL, the two are substantially different. Do not add the postgresql tag to questions involving Amazon Redshift.

8534 questions
2
votes
2 answers

SQL - How to combine 2 dates from different tables with joins without error

In my company, there are 2 tables that carry the campaign information and the webshop information. Basic information: the campaign table carries information as follows: CAMPAIGN_NAME CREATION_DATE NUM_DELIVERED NUM_ERRORS Promotion…
Pak Hang Leung
  • 389
  • 5
  • 15
2
votes
1 answer

Use schema name in a JOIN in Redshift

Our database is set up so that each of our clients is hosted in a separate schema (the organizational level above a table in Postgres/Redshift, not the database structure definition). We have a table in the public schema that has metadata about our…
R. Jutras
  • 331
  • 1
  • 4
  • 14
2
votes
1 answer

Redshift select random records but avoid duplicate

I have a table in Redshift where I have the following records for a sample ID 71082: id trm_num start_time 71082 PCMAMGA759551 2012-05-02 09:41:54 71082 PCMAMGA759551 2015-06-02 13:23:39 71082 PCMAMGA759551 2015-09-03…
AKSHAY SHINGOTE
  • 407
  • 1
  • 8
  • 22
2
votes
1 answer

Copy data from Amazon s3 to redshift

I'm trying to copy data from an S3 bucket to a Redshift database using Airflow. Here is my code: from airflow.hooks import PostgresHook path = 's3://my_bucket/my_file.csv' redshift_hook = PostgresHook(postgres_conn_id='table_name') access_key='abcd'…
kab
  • 195
  • 2
  • 5
  • 12
2
votes
0 answers

Kafka Connect from RDS to RedShift not starting

I was able to implement Kafka Connect on a much smaller table but am trying to implement it on a larger database. My source and sink configuration are as…
Minh
  • 2,180
  • 5
  • 23
  • 50
2
votes
2 answers

How to upload my csv file into Redshift/SQL?

I have a large CSV file that I need to get into Redshift. It has ~5 million rows. A couple issues: 1) The file's first 10 lines are gibberish that I want deleted/excluded 2) Whenever I try to upload csv files, I always get this weird glitch where it…
jc315
  • 51
  • 6
2
votes
0 answers

Error: _get_column_info() When uploading Pandas DataFrame to Redshift

I'm trying to upload a pandas DataFrame directly to Redshift using the to_sql function. connstr = 'redshift+psycopg2://%s:%s@%s.redshift.amazonaws.com:%s/%s' % (username, password, cluster, port, db_name) def send_data(df,…
Bill
  • 698
  • 1
  • 5
  • 22
2
votes
1 answer

Run Redshift Queries Periodically

I have started researching Redshift. It is defined as a "Database" service in AWS. From what I have learnt so far, we can create tables and ingest data from S3 or from external sources like Hive into a Redshift database (cluster). Also, we can…
2
votes
1 answer

aws lambda error loading redshift jdbc driver

I get the error below when trying to load the Redshift JDBC jar from AWS Lambda. java.io.IOException: Unable to load driver: JAR expected but not found. java.sql.SQLException: No suitable driver found for …
Avinash
  • 21
  • 2
2
votes
0 answers

Issue with copying data from s3 to Redshift

I am trying to sync a table from MySQL RDS to Redshift through Data Pipeline. There was no issue in copying data from RDS to S3. But while copying from S3 to Redshift, the following issue is seen: amazonaws.datapipeline.taskrunner.TaskExecutionException:…
2
votes
1 answer

AWS Glue: Redshift Upsert

After doing a bit of research, I see that since Redshift doesn't support merge/upsert some people are using staging tables to update/insert records. Since Redshift also doesn't support procedures (triggers, etc.) does anyone have suggestions for how…
2
votes
1 answer

Redshift LIKE column value with %

I have a column that is a comma-separated string of values. I want to join another table that only has one of the values. On Redshift, how can I do a LIKE operator with '%' injected into the comparison? Ex: TableA: values_col = 'abc, def' TableB:…
cvax
  • 416
  • 1
  • 5
  • 12
2
votes
5 answers

Join on group id, to get missing individual ids in that table

For the following data sets (super simplified): t1 ext_id, tid, aid, aum, actions z1, 1, a, 100, 100 z2, 1, b, 100, 100 x1, 2, d, 200, 200 x2, 2, e, 200, 200 t2 tid, aid, aum, actions 1, a, 100, 100 1, b, …
user8834780
  • 1,620
  • 3
  • 21
  • 48
2
votes
1 answer

What does VARCHAR(20*4) mean?

Referring to this link: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-redshift.html They list some mappings as: VARCHAR(20*4) VARCHAR(size*4) I assume they didn't mean VARCHAR(80) for 20*4 or they would have put…
john
  • 33,520
  • 12
  • 45
  • 62
2
votes
1 answer

Aggregate case when inside non aggregate query

I have a pretty massive query that in its simplest form looks like this: select r.rep_id, u.user_id, u.signup_date, pi.application_date, pi.management_date, aum from table1 r left join table2 u on r.user_id=u.user_id left join table3 pi on…
user8834780
  • 1,620
  • 3
  • 21
  • 48