
I have a use case where I need to copy PDF files stored in a Google Cloud Storage bucket to an SFTP server I have set up. The specific file names and locations within the bucket are listed in a Google BigQuery view.

My experience is limited, and I've been struggling to figure this out with the tools I'm familiar with. The tools I feel comfortable with are:

  • BigQuery
  • Microsoft Flow
  • SFTP
  • WinSCP on a virtual machine, using CMD prompts and .bat files

Any help with this would be greatly appreciated, especially if the tools listed above could be used.

2 Answers


You could achieve this with Cloud Workflows.

Essentially, you need to create a Workflow that:

  • runs the BigQuery SELECT statement (example here)
  • loops over the entries
  • starts a Cloud Build job that issues the relevant commands:
    • download the file from GCS (similar to this workflow)
    • get the SFTP credentials from Secret Manager (similar to this)
    • upload the file to the SFTP server

The two articles linked above are a good source for all of these steps. Reach out to me privately if you need help putting this together.
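To make the per-file part more concrete, here is a minimal Python sketch of the three sub-steps the Cloud Build job would perform (download from GCS, read the SFTP credentials from Secret Manager, upload over SFTP). It is only an illustration, not the Cloud Build configuration itself; the bucket, secret, host, user and remote directory names are placeholders, and paramiko is assumed for the SFTP client.

```python
# Hypothetical per-file transfer step: download one object from GCS, read the
# SFTP password from Secret Manager, and upload the file over SFTP.
# All names below (bucket, secret, host, user, remote dir) are placeholders.
import paramiko
from google.cloud import secretmanager, storage


def transfer_one_file(object_path: str) -> None:
    # 1. Download the PDF from Cloud Storage to a local temp file.
    storage_client = storage.Client()
    bucket = storage_client.bucket("my-pdf-bucket")
    local_path = "/tmp/" + object_path.split("/")[-1]
    bucket.blob(object_path).download_to_filename(local_path)

    # 2. Read the SFTP password from Secret Manager.
    sm_client = secretmanager.SecretManagerServiceClient()
    secret_name = "projects/my-project/secrets/sftp-password/versions/latest"
    password = sm_client.access_secret_version(name=secret_name).payload.data.decode("utf-8")

    # 3. Upload the file to the SFTP server with paramiko.
    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="sftp-user", password=password)
    sftp = paramiko.SFTPClient.from_transport(transport)
    try:
        sftp.put(local_path, "/upload/" + object_path.split("/")[-1])
    finally:
        sftp.close()
        transport.close()
```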

Pentium10

Looks like you need to perform the following steps:

  1. Collect the file names and paths by reading the BigQuery table/view
  2. For each file, download it from GCS and upload it to the SFTP server

Writing a script for this should be fairly simple, and it is a good way to end up with a robust, testable procedure.

You can use Python (or Java, Go, Bash...) for this, as Google BigQuery and Cloud Storage are easy to interact with via their APIs and client libraries.

The SFTP upload step will then depend on the language you choose.
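For example, a minimal Python sketch of such a script could look like the following. It assumes the google-cloud-bigquery and google-cloud-storage client libraries plus paramiko for the SFTP part; the project, view, bucket, host and credentials are placeholders, and in practice you would load the SFTP password from a secret store rather than hard-coding it.

```python
# Hypothetical end-to-end script: list the PDF paths from a BigQuery view,
# then download each one from GCS and push it to the SFTP server.
# Project, view, bucket, host and credentials below are placeholders.
import paramiko
from google.cloud import bigquery, storage


def main() -> None:
    bq_client = bigquery.Client()
    storage_client = storage.Client()
    bucket = storage_client.bucket("my-pdf-bucket")

    # Open one SFTP session up front and reuse it for every file.
    transport = paramiko.Transport(("sftp.example.com", 22))
    transport.connect(username="sftp-user", password="change-me")  # better: read from a secret store
    sftp = paramiko.SFTPClient.from_transport(transport)

    try:
        # Step 1: collect file names/paths from the BigQuery view.
        query = "SELECT file_path FROM `my-project.my_dataset.pdf_files_view`"
        for row in bq_client.query(query).result():
            object_path = row["file_path"]
            local_path = "/tmp/" + object_path.split("/")[-1]

            # Step 2: download from GCS, then upload to the SFTP server.
            bucket.blob(object_path).download_to_filename(local_path)
            sftp.put(local_path, "/upload/" + object_path.split("/")[-1])
    finally:
        sftp.close()
        transport.close()


if __name__ == "__main__":
    main()
```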

References:

  1. Read data from BigQuery: https://cloud.google.com/bigquery/docs/bigquery-storage-python-pandas
  2. Download data from GCS: https://cloud.google.com/storage/docs/reference/libraries#client-libraries-install-python
Gumaz