Questions tagged [amazon-redshift]

Amazon Redshift is a petabyte-scale data warehousing service that lets you analyze data with existing business intelligence tools. Redshift is a column-oriented MPP database based on ParAccel, which was itself based on PostgreSQL.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more. Redshift is a column-oriented database based on PostgreSQL 8.0.2.

Source: Amazon Redshift

Although Redshift is to some extent based on PostgreSQL, the two are substantially different. Do not add the postgresql tag to questions involving Amazon Redshift.

8534 questions
2
votes
1 answer

Tracking the change of status of a user

I’m trying to write the logic for a query that will allow me to classify users' activities: • The problem is a table that contains all users' activities in slots of ~5 min (not all are exactly 5 min; some are 3 min, others 4 min) and records the amount…
Michelle
  • 202
  • 2
  • 14
2
votes
2 answers

Removing the commas between objects in a JSON array

I am working on loading JSON data into Redshift, but for it to work the commas have to be removed between the objects. If I remove the commas then it works fine. Can someone tell me how to remove the commas between objects so that I can load it into…
Shruti Purushan
  • 103
  • 1
  • 2
  • 9
2
votes
1 answer

AWS Redshift columnar storage vs distribution style

I have been reviewing the AWS documentation, and cannot seem to understand how the distribution style works and how that data is stored on Redshift. I understand what a columnar storage database is, but when I read the documentation on the…
stochasticcrap
  • 350
  • 3
  • 16
2
votes
1 answer

Using Redshift's native 'AT TIME ZONE' through a Django query?

So I have this query which goes like query = MyModel.objects.filter(some_filter).filter(eventTime__date__gte=start_date).filter(eventTime__date__lte=end_date).... The Redshift table I am connecting to has eventTime as UTC. It offers me to query in…
Sumit Sinha
  • 666
  • 3
  • 9
  • 18
2
votes
3 answers

"Invalid credentials" error when accessing Redshift from Python

I am trying to write a Python script to access Amazon Redshift to create a table in Redshift and copy data from S3 to the Redshift table. My code is: import psycopg2 import os #import pandas as pd import…
2
votes
2 answers

Oracle index to AWS Redshift Sortkey

I am new to Redshift and am migrating Oracle to Redshift. One of the Oracle tables has around 60 indexes. AWS recommends, as a good practice, having around 6 compound sort keys. How would these 60 Oracle indexes translate to Redshift sort keys? I…
pChidambaram
  • 33
  • 1
  • 5
2
votes
0 answers

How Can I Connect SSRS to an Amazon Redshift Database

Hi, I have a Redshift database and I would like to use SSRS to develop my reports, but I'm a little lost with the connection.
2
votes
1 answer

How do I reload a table without deleting existing views?

I have a dataset that spans many tables by date: table_name_YYYY_MM_DD. There are many views created across these date-range tables. However, whenever I need to reload a table, I have to delete all these views to remove dependency…
ForeverConfused
  • 1,607
  • 3
  • 26
  • 41
2
votes
1 answer

Redshift regexp_substr

I want to replicate this regex pattern in regexp_substr. I want to capture the second group: '(\?)(.*?)(&|$)'. I have tried this: regexp(my_url, '\\?.*?&|$'), and some similar variations of the above, but I have been getting the error: ERROR: XX000:…
Hound
  • 932
  • 2
  • 17
  • 26
2
votes
1 answer

Does Redshift store timestamp of last accessed?

I'd like to know how frequently certain records in a table in Redshift are being accessed. My hunch is that a large number of records in my table are queried less than once a month. If this is the case, then perhaps I can remove these records to…
ilanman
  • 818
  • 7
  • 20
2
votes
1 answer

aws localstack redshift connection issue using psycopg2

I am using LocalStack to mock AWS services. I was trying to connect to a local Redshift instance using psycopg2, but the connection times out. Connecting with boto3 succeeds. client = boto3.client('redshift',…
shrishinde
  • 3,219
  • 1
  • 20
  • 31
2
votes
1 answer

Issues with postgres_operator in Airflow dag

I am currently using Airflow 1.8.2 to schedule some EMR tasks and then execute some long running queries on our Redshift cluster. For that purpose I am using the postgres_operator. The queries take about 30 minutes to run. However, once they are…
shomo
  • 23
  • 4
2
votes
4 answers

Create table as in Redshift defining primary key

I am trying to replicate a table using a CTAS clause in Redshift while additionally specifying a primary key for the table. I tried the syntax below, but no luck. However, I was able to specify DISTKEY/SORTKEY using the same syntax: create table date_dim PRIMARY…
Abhi
  • 1,153
  • 1
  • 23
  • 38
2
votes
0 answers

Reading AVRO or JSON records in redshift

I am serializing and deserializing my data using Avro. I save my serialized data to S3. I am trying to read the data in S3 into Redshift, but I am unable to read it. Tried with Avro format S3 records -…
2
votes
2 answers

How to connect to Amazon redshift cluster from within my Amazon EC2 instance

I have a Redshift cluster in my AWS account. I am able to connect to it in Python, and when I run the script locally, it runs perfectly fine: import psycopg2 con = psycopg2.connect(dbname='some_dbname',…