1

I managed to have MySQL tables replicated into BigQuery fairly easily by following this article on Cloud Data Fusion Replication. However, there's an issue with the DateTime columns. All the DateTime columns have been replicated into BigQuery using a 1970's date. Does anyone know how to fix this?

Here is the original MySQL data:
MySQL table And here's the replicated data in BigQuery BigQuery table

Simon Corcos
  • 962
  • 14
  • 31
  • 1
    May I ask what datatypes are used in your MySQL and BigQuery tables? DateTime support (direct conversion) for CDAP was added recently in 6.4, and proper conversions are not yet implemented in replication yet. The effort for supporting BigQuery DateTime conversions in replication is being tracked in [this issue](https://cdap.atlassian.net/browse/PLUGIN-645). – Dennis Li Apr 08 '21 at 20:57
  • Awesome thank you @DennisLi! The mysql column is a DateTime(3) and the BigQuery column is automatically generated as a TimeStamp. – Simon Corcos Apr 09 '21 at 16:26

1 Answers1

1

I figured another way. You can simulate MySQL replication into BigQuery by making your own batch pipeline, then schedule that pipeline to run at the frequency you want. The MySQL setup is easy to do. Just follow the instructions to install the MySQL driver here. Then you setup your MySQL data source and your BigQuery Sink. The DateTime columns in MySQL should be marked as TimeStamps and their corresponding columns in BigQuery must be of type DateTime.

MySQL Data Source Configs enter image description here

BigQuery Sink Configs enter image description here

Finally, you can make a BigQuery Execution Action before the MySQL Source to fetch the id or time of the latest record you have replicated.

enter image description here

Simon Corcos
  • 962
  • 14
  • 31