
I need to move my BigQuery tables to Redshift.

Currently I have a Python job that fetches data from BigQuery and incrementally loads it into Redshift.

This Python job reads the BigQuery data, creates a CSV file on the server, drops it on S3, and the Redshift table loads the data from that file on S3. But now the data size will be very big, so the server won't be able to handle it.

Do you guys happen to know anything better than this?

The 7 new BigQuery tables I need to move are around 1 TB each, with repeated columns. (I am doing an UNNEST join to flatten them.)
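For context, the flattening step described above could be materialized in BigQuery itself before any export, so that what leaves BigQuery is already flat CSV-friendly data. This is a minimal sketch; the dataset, table, and column names (`dataset.events`, `items`, `sku`, `qty`) are hypothetical, and the `bq` command is printed rather than executed since no credentials are assumed here:

```shell
#!/bin/sh
# Hypothetical: dataset.events has a REPEATED column `items`.
# Target table for the flattened copy (name is an assumption).
FLAT_TABLE="dataset.events_flat"

# CROSS JOIN UNNEST expands each repeated element into its own row,
# which is what makes a later flat CSV export possible.
QUERY="SELECT e.id, e.ts, item.sku, item.qty
FROM dataset.events AS e
CROSS JOIN UNNEST(e.items) AS item"

# Print the command instead of running it (no cloud credentials here).
echo "bq query --use_legacy_sql=false --destination_table=${FLAT_TABLE} '${QUERY}'"
```

Writing the flattened result to a destination table once, rather than re-running the UNNEST per extract, also keeps the export step simple.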

Subhamoy
  • Extra context: https://www.reddit.com/r/bigquery/comments/cbkkrc/moving_bigquery_data_to_redshift/ – Felipe Hoffa Jul 16 '19 at 04:39
  • If your CSV is too big or is slowing down the process... can you subdivide or query your BQ source tables into smaller files or incremental files? – rtenha Jul 16 '19 at 19:42

1 Answer


You could actually move the data from BigQuery to a Cloud Storage bucket by following the instructions here. After that, you can easily move the data from the Cloud Storage bucket to the Amazon S3 bucket by running:

gsutil rsync -d -r gs://your-gs-bucket s3://your-s3-bucket

The documentation for this can be found here.
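Putting the whole path together, the three steps would be: export from BigQuery to GCS (a `*` wildcard in the destination URI lets BigQuery shard tables over 1 GB), `gsutil rsync` into S3, then a Redshift `COPY` over the shard prefix. This is a sketch, not a finished script: all bucket names, the table name, and the IAM role ARN are placeholders, and the commands are printed rather than executed since no cloud credentials are assumed:

```shell
#!/bin/sh
# All resource names below are hypothetical -- substitute your own.
TABLE="myproject:dataset.events_flat"
GCS_URI="gs://your-gs-bucket/events_flat/part-*.csv.gz"
GS_BUCKET="gs://your-gs-bucket"
S3_BUCKET="s3://your-s3-bucket"

# 1. Export the (flattened) table to GCS; GZIP keeps shards small.
EXPORT_CMD="bq extract --destination_format=CSV --compression=GZIP ${TABLE} '${GCS_URI}'"

# 2. Mirror the GCS bucket into S3 (gsutil needs AWS keys configured,
#    e.g. in ~/.boto, for the s3:// side).
SYNC_CMD="gsutil rsync -d -r ${GS_BUCKET} ${S3_BUCKET}"

# 3. Load into Redshift; COPY with a key prefix reads every matching
#    shard in parallel. The role ARN is a placeholder.
COPY_SQL="COPY events_flat FROM '${S3_BUCKET}/events_flat/part-' CSV GZIP IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopy';"

# Printed rather than executed, since credentials are not assumed here.
echo "$EXPORT_CMD"
echo "$SYNC_CMD"
echo "$COPY_SQL"
```

Because the export is sharded and the data never passes through your own server, this sidesteps the single-machine CSV bottleneck described in the question.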

siamsot