1

We would like to import our Cloud SQL database into BigQuery to query along with other datasets we have there.

What is the best path to doing this?

Kurt Gooden
  • 43
  • 1
  • 6
  • You need to write an event shipping arhitecture. There is a startup that does have a solution https://www.stitchdata.com/integrations/google-cloud-sql/google-bigquery/ – Pentium10 Mar 14 '17 at 17:36
  • @Pentium10 Thanks, this is definitely a good option we will consider. Was hoping for something similar within the Google Suite. – Kurt Gooden Mar 16 '17 at 20:44
  • Let me know if you need some sort of help, I am not involved in the startup, but we integrate similar things. – Pentium10 Mar 16 '17 at 20:53

2 Answers2

1

You can export your data as CSV files and then load them into BigQuery.

Elliott Brossard
  • 32,095
  • 2
  • 67
  • 99
  • I have tried this approach using `bq load` and always get parsing errors on import. e.g. "Error detected while parsing row starting at position: 231897115. Error: Data between close double quote (") and field separator.". I am looking to automate this, so either the API or CLI approach needs to work. – Kurt Gooden Mar 28 '17 at 12:57
  • I wish I had a better suggestion. I know that the people on our team who work on the various data connectors are looking to bridge some of these gaps, but it doesn't address your problem in the short term. You may end up needing to export, sanitize, then import. – Elliott Brossard Mar 29 '17 at 21:36
  • Thanks for the response @elliott. The sanitization requires significant time and resources for large exports, which is obviously not ideal. I believe exposing further escaping and encoding controls on the "gcloud beta sql instances export" functions may solve these issues. Where can I suggest this? – Kurt Gooden Mar 31 '17 at 14:10
  • The [public issue tracker](https://issuetracker.google.com/issues?q=componentid:187149%2B) is a good place to file suggestions for BigQuery if you don't see an existing issue. There should be a component in the issue tracker for Cloud SQL as well. – Elliott Brossard Mar 31 '17 at 14:44
0

Shamless plug for this answer to a similar question: set up a "connection" to allow Cloud SQL Federated Queries and do all the work directly within BigQuery:

INSERT
  demo.customers (column1)
SELECT
   * 
FROM
   EXTERNAL_QUERY("project.us.connection",
                  "SELECT column1 FROM mysql_table;");
Chris
  • 6,805
  • 3
  • 35
  • 50