Questions tagged [amazon-redshift-spectrum]

Using Amazon Redshift Spectrum, you can query and retrieve structured and semistructured data from files in Amazon S3 without having to load the data into Amazon Redshift tables. Redshift Spectrum queries employ massive parallelism to execute very fast against large datasets. Multiple clusters can concurrently query the same dataset in Amazon S3 without the need to make copies of the data for each cluster.

291 questions
2
votes
2 answers

Redshift Data API query statement size limited to 100 KB

Issue: Our Redshift Java SDK queries return ValidationException: software.amazon.awssdk.services.redshiftdata.model.ValidationException: Cannot process query string larger than 100kB Details: Our system may generate many SQL SELECTs connected…
2
votes
0 answers

Amazon Redshift Spectrum external csv table with dynamic input csv column order

I created an external table create external table a_schema.my_table( col_a varchar(10), col_b varchar(10), ) row format delimited …
2
votes
1 answer

AWS Spectrum Scan Error Unexpected end of compressed file

I want to use AWS Spectrum - Querying on Redshift based on a file in S3. Since you can either choose a folder in S3 or a JSON file, I opted to use a JSON file as the location. The bug: When I reference the file test in a folder - Redshift works…
2
votes
1 answer

Connect Redshift Spectrum/ AWS EMR with Hudi directly or via AWS Glue Data Catalog

I'm trying to understand how to properly connect Redshift Spectrum with Hudi data. It looks like I can directly create a Redshift external table for data managed in Apache Hudi, as described in the following documentation…
2
votes
0 answers

Query AWS-Glue metadata (column comment and table description) with redshift

My table's metadata is in Glue, with a description and comments on the columns, as shown in the picture below. I would like to retrieve this data through Redshift. Is it possible...? Thanks!
2
votes
0 answers

How to insert data into table which is having new line character in SQL?

Hi, I am creating an external table that loads data from a file in an S3 bucket. Some columns contain CRLF characters, so the data spills into another row and does not load correctly. Could you please help me resolve this? Example: Draft…
2
votes
1 answer

Storing Timestamp with Timezone in Redshift External Table

I need to store timezone info with my timestamp column in a Redshift external table. I am using the commands below: Create external table: CREATE EXTERNAL TABLE schema.test ( user_id BIGINT, created_by BIGINT, created_date…
2
votes
0 answers

Load special characters in AWS Spectrum Table

I am trying to create an external Spectrum table on top of plain text files but some values are considered as null because they contain special characters. Create statement: create external table s.table_1 ( id bigint, city…
2
votes
0 answers

How to do an incremental load/upsert in spark-redshift

I have an ETL pipeline that reads data from Redshift into (Py)Spark DataFrames, performs calculations, and writes the result back to a target in Redshift. So the flow is: Redshift source schema --> Spark 3.0 --> Redshift…
2
votes
1 answer

Unable to create redshift connection

I am trying to create a Redshift connection using the Redshift JDBC driver, which I downloaded from the AWS Redshift cluster console. I get the exception below: java.sql.SQLException: The connection attempt failed. at…
2
votes
0 answers

Querying Athena Views using Spectrum

I have created a view in Athena and see it in my Glue Data Catalog. I would like to access the view via Redshift Spectrum/Glue Catalog Sharing. Per the AWS documentation: If you have created Athena views in the Data Catalog, then Data Catalog…
2
votes
0 answers

Allow AWS Redshift Cluster DB user group to assume role

Instead of granting access to an individual dbuser, how can I allow a Redshift dbgroup to access other AWS resources? Explanation: Currently, we have a role that allows Redshift Spectrum to query data in our S3 buckets. Also, we have a dbuser say…
2
votes
2 answers

Facing access denied on querying struct columns

I am able to query my table using Redshift spectrum. However, when I try to access a column, defined as a struct, I am getting the following error: ERROR: Spectrum Scan Error: S3ServiceException:Access Denied,Status 403,Error AccessDenied Any idea…
2
votes
2 answers

Redshift spectrum - Updating external spectrum table column type

I've created an external table with 4 columns. One of the columns has a custom data type. create EXTERNAL table public.test_table_1( uuid varchar(36), event_id varchar(36), last_updated_timestamp bigint, user_app struct
2
votes
1 answer

Redshift JSONPaths file for dynamic json file

Given the below json object { "player": { "francesco totti": { "position": "forward" }, "andrea pirlo": { "position": "midfielder" } } } I would like to import the above file into Redshift as the below rows name,…