
I have several questions regarding the Google Spanner Export/Import tool. Apparently the tool creates a Dataflow job.

  1. Can an import/export Dataflow job be re-run from the tool after it has run successfully? If so, will it use the current timestamp?

  2. How can I schedule a daily backup (export) of Spanner databases?

  3. How can I get notified of new enhancements to the GCP platform? I was browsing the web for something else and noticed that the export/import tool for GCP Spanner had been released four days earlier.

I am still browsing through the documentation for Dataflow jobs, templates, and so on. Any suggestions on the above would be greatly appreciated.

Thx

  • I had some issues with exporting during Dataflow execution: the Stackdriver logs showed it failing during worker startup because my project sits under a private folder that, for security reasons, is configured to suppress external IP creation. I noticed a recent GCP update that allows suppressing public IP creation on workers, but I haven't tested it yet: https://cloud.google.com/dataflow/docs/guides/specifying-networks#public_ip_parameter – manasouza Nov 15 '18 at 16:58
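For reference, the worker-IP setting mentioned in the comment above looks roughly like this when launching a self-written Beam pipeline with the Python SDK. This is an untested sketch; all resource names are placeholders:

```python
# Sketch: run a Dataflow pipeline whose workers get no public IPs.
# The --no_use_public_ips flag requires a subnetwork with Private Google
# Access enabled so workers can still reach Google APIs.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                                    # placeholder
    "--region=us-central1",                                    # placeholder
    "--temp_location=gs://my-bucket/tmp",                      # placeholder
    "--no_use_public_ips",  # private worker IPs only
    "--subnetwork=regions/us-central1/subnetworks/my-subnet",  # placeholder
])

with beam.Pipeline(options=options) as pipeline:
    # Trivial pipeline body, just to make the sketch runnable.
    pipeline | beam.Create(["ok"]) | beam.Map(print)
```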

1 Answer


My response is based on limited experience with the Spanner Export tool.

  1. I have not seen a way to do this. There is no option in the GCP console, though that does not mean it cannot be done; one possible workaround is to re-launch the underlying Dataflow template yourself (see the first sketch after this list).

  2. There is no built-in scheduling capability. Perhaps this can be done via Google's managed Airflow service, Cloud Composer (https://console.cloud.google.com/composer)? I have yet to try this, but it is my next step, as I have similar needs; a sketch of what such a DAG might look like follows this list.

  3. I've made this request to Google several times but have yet to get a response. My best recommendation is to read the change logs when updating the gcloud CLI.
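For item 1, one possible workaround is to re-launch the underlying Dataflow job yourself from the Google-provided Spanner-to-Avro export template; since the template reads the database when the job runs, a rerun should reflect the data as of the new launch time. Below is a minimal sketch assuming google-api-python-client and Application Default Credentials; every project, instance, database, and bucket name is a placeholder:

```python
# Minimal sketch: re-launch the Spanner export as a new Dataflow job from
# the Google-provided template. Requires google-api-python-client and
# Application Default Credentials with Dataflow permissions.
from googleapiclient.discovery import build

dataflow = build("dataflow", "v1b3")

response = dataflow.projects().templates().launch(
    projectId="my-project",  # placeholder
    # Google-provided "Cloud Spanner to Cloud Storage Avro" export template
    gcsPath="gs://dataflow-templates/latest/Cloud_Spanner_to_GCS_Avro",
    body={
        "jobName": "spanner-export-rerun",
        "parameters": {
            "instanceId": "my-instance",                   # placeholder
            "databaseId": "my-database",                   # placeholder
            "outputDir": "gs://my-backup-bucket/spanner",  # placeholder
        },
    },
).execute()

print("Launched Dataflow job:", response["job"]["id"])
```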
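For item 2, here is a minimal sketch of a daily export DAG for Cloud Composer, assuming the Google provider's DataflowTemplatedJobStartOperator and the same export template as above. All resource names are placeholders, and I have not run this in production:

```python
# Minimal sketch: a Cloud Composer (Airflow) DAG that exports a Spanner
# database to GCS once a day via the Google-provided Dataflow template.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowTemplatedJobStartOperator,
)

with DAG(
    dag_id="spanner_daily_export",
    start_date=datetime(2018, 11, 1),
    schedule_interval="@daily",  # one export per day
    catchup=False,
) as dag:
    export_spanner = DataflowTemplatedJobStartOperator(
        task_id="export_spanner_to_gcs",
        project_id="my-project",   # placeholder
        location="us-central1",    # placeholder
        template="gs://dataflow-templates/latest/Cloud_Spanner_to_GCS_Avro",
        parameters={
            "instanceId": "my-instance",  # placeholder
            "databaseId": "my-database",  # placeholder
            # {{ ds }} is the Airflow execution date, so each day's export
            # lands in its own folder.
            "outputDir": "gs://my-backup-bucket/spanner/{{ ds }}",
        },
        dataflow_default_options={"tempLocation": "gs://my-backup-bucket/tmp"},
    )
```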

Finally, there is an outstanding issue with the Export tool that causes it to fail if you export a table with 0 rows. I have filed a case with Google (Case #16454353) and they confirmed this issue. Specifically:

After running into a similar error message during my reproduction of the issue, I drilled down into the error message and discovered that there is something odd with the file path for the Cloud Storage folder [1]. There seems to be an issue with the Java File class viewing ‘gs://’ as having a redundant ‘/’ and that causes the ‘No such file or directory’ error message.

Fortunately for us, there is an ongoing internal investigation on this issue, and it seems like there is a fix being worked on. I have indicated your interest in a fix as well, however, I do not have any ETAs or guarantees of when a working fix will be rolled out.

dsquier