Questions tagged [amazon-redshift]

Amazon Redshift is a petabyte-scale data warehousing service whose data can be analyzed with existing business intelligence tools. Redshift is a column-oriented MPP database based on ParAccel, which was itself based on PostgreSQL.

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more. Redshift is a column-oriented database based on PostgreSQL 8.0.2.

Source: Amazon Redshift

Although Redshift is based to some extent on PostgreSQL, the two are substantially different. Do not add the postgresql tag to questions involving Amazon Redshift.

8534 questions
27 votes, 4 answers

Amazon Redshift Grants - New table can't be accessed even though user has grants to all tables in schema

I have a bit of a funny situation in Amazon Redshift where I have a user X who has grant select on all tables in schema public, but once a new table is created, this grant doesn't seem to apply to the new table. Is this normal behaviour? If yes, how…
elvikingo • 947 • 1 • 11 • 20
27 votes, 4 answers

How do I get table and columns information from Redshift?

pg_tables provides a list of tables. Is there a pg_columns or its equivalent to provide the list of columns? In DB2, I would query sysibm.systables/columns to get such information. What is the equivalent in redshift?
Prabhu M • 483 • 2 • 7 • 9
26 votes, 2 answers

Redshift division result does not include decimals

I'm trying to do something really quite basic to calculate a kind of percentage between two columns in Redshift. However, when I run the query with an example the result is simply zero because the decimals are not being covered. code: select 1701 /…
Andres Urrego Angel • 1,842 • 7 • 29 • 55
26 votes, 10 answers

Using sql function generate_series() in redshift

I'd like to use the generate series function in redshift, but have not been successful. The redshift documentation says it's not supported. The following code does work: select * from generate_series(1,10,1) outputs: 1 2 3 ... 10 I'd like to do…
Elm • 1,355 • 6 • 22 • 33
25 votes, 1 answer

skip bad record in redshift data load

I am trying to load data into AWS Redshift using the following command: copy venue from 's3://mybucket/venue' credentials 'aws_access_key_id=;aws_secret_access_key=' delimiter '\t'; but the data load is failing when I…
roy • 6,344 • 24 • 92 • 174
25 votes, 5 answers

Loading data (incrementally) into Amazon Redshift, S3 vs DynamoDB vs Insert

I have a web app that needs to send reports on its usage, and I want to use Amazon Redshift as a data warehouse for that purpose. How should I collect the data? Every time the user interacts with my app, I want to report that, so when should I write…
25 votes, 5 answers

How to unload a table on RedShift to a single CSV file?

I want to migrate a table from Amazon Redshift to MySQL, but using "unload" will generate multiple data files, which are hard to import into MySQL directly. Is there any approach to unload the table to a single CSV file so that I can import it to…
ciphor • 8,018 • 11 • 53 • 70
25 votes, 3 answers

Connecting to a Redshift cluster from pgAdmin

UPDATE: also asked on the PgAdmin-support mailing list here. So I have an AWS Redshift cluster up and running, and I'm able to connect to it from the command line with $ psql -h host -d database -p port -U username I want to connect to the cluster…
Justin • 1,226 • 4 • 18 • 21
24 votes, 5 answers

AWS Glue: How to handle nested JSON with varying schemas

Objective: We're hoping to use the AWS Glue Data Catalog to create a single table for JSON data residing in an S3 bucket, which we would then query and parse via Redshift Spectrum. Background: The JSON data is from DynamoDB Streams and is deeply…
24 votes, 11 answers

Alternative to BigQuery for medium-sized data

This is a follow-up to the question Why doesn't BigQuery perform as well on small data sets. Let's suppose I have a data-set that is ~1M rows. In the current database that we're using (mysql) aggregation queries would run quite slow, perhaps taking…
David542 • 104,438 • 178 • 489 • 842
24 votes, 1 answer

permission denied to set parameter "client_min_messages" to "notice"

I have a Redshift cluster launched and running on AWS, and inbound queries are authorized by configuring the VPC security group. Then I try to connect to Redshift with pgAdmin and receive the following error: An error has occurred: ERROR: permission…
Hello lad • 17,344 • 46 • 127 • 200
24 votes, 2 answers

redshift equivalent of TEXT data type

What's the best data type to use for a column in a redshift table that will hold a very long string (can be up to 50KB)? TEXT is replaced by varchar(256) by default. For now I used varchar(65535), but I'm not sure if that's the right way to do…
WeaselFox • 7,220 • 8 • 44 • 75
24 votes, 3 answers

Redshift - How to remove NOT NULL constraint?

Since Redshift does not support ALTER COLUMN, I would like to know if it's possible to remove the NOT NULL constraints from columns in Redshift.
shihpeng • 5,283 • 6 • 37 • 63
23 votes, 5 answers

Invalid digits on Redshift

I'm trying to load some data from stage to relational environment and something is happening I can't figure out. I'm trying to run the following query: SELECT CAST(SPLIT_PART(some_field,'_',2) AS BIGINT) cmt_par FROM public.some_table; The…
Maurício Borges • 408 • 1 • 3 • 8
23 votes, 4 answers

How to measure table space on disk in RedShift / ParAccel

I have a table in Redshift. How can I see how much disk space it uses?
diemacht • 2,022 • 7 • 30 • 44