1

I wanted to replicate Mysql tables held in GCP Compute Engine to the GC BigQuery. I referred this document : https://cloud.google.com/data-fusion/docs/tutorials/replicating-data/mysql-to-bigquery. so I Decided to use GCP Data Fusion for the Job.

Everything works fine and the Data is replicated in Bigquery. So I was testing different datatype support for this Replication.

Where I come up with issue in this Replication Pipeline, So whenever I try to put the 'DATE' datatype Column for the Data fusion replication, the whole table (Which contain 'DATE' Column) doesn't show up in BigQuery

It Creates the table with schema same as source and 'Date' datatype also present in Bigquery, and I have use the same Date format as supported by BigQuery.

I also gone through Data fusion logs, It shows pipeline is Loading the data perfectly fine into BigQuery, Also catches the new rows added into Mysql Table from source Mysql DB with inserts and updates as well. But somehow rows are not getting into Bigquery.

Did anyone used Data fusion Replication with 'Date' column Datatype ? Is this issue with BigQuery or Data Fusion ? Do I need to provide any manual setting in the BigQuery ? Can anyone please provide inputs on this ?

Mk_cloud
  • 21
  • 4
  • Datetime support for replication pipelines is tracked by https://cdap.atlassian.net/browse/PLUGIN-645. – user3126412 Apr 29 '21 at 21:42
  • (I'm using 6.4 - cdap) Datetime is working, as it is considered as 'TIMESTAMP' in the Data Fusion replication pipeline. – Mk_cloud Apr 30 '21 at 09:39

2 Answers2

1

I'll mark this issue as Resolve. Since the Problem is with Data Fusion, and the latest version 6.4.1 now supports Datatime datatype while replicating in BigQuery. I'm Receiving correct date & datetime data now.

Thank you for all the Help :)

Mk_cloud
  • 21
  • 4
0

I used following schema which had Date field in it.

create table tutorials_tbl(tutorial_id INT NOT NULL AUTO_INCREMENT, tutorial_title VARCHAR(100) NOT NULL, tutorial_author VARCHAR(40) NOT NULL, submission_date DATE, PRIMARY KEY ( tutorial_id ));

When I run the replication pipeline, I see BQ table is created with following schema:

enter image description here

I also see the events in the table:

enter image description here

Can you please share the input table schema? You can also check the Job History and Query History tabs under BQ table to see if any errors. enter image description here

Sagar Kapare
  • 196
  • 4
  • Schema is created in bigquery in my case as well. but, have you tried replicating the Data/ rows in table ? or inserting rows in the table with Date datatype column? – Mk_cloud May 04 '21 at 10:20
  • Hi.. yes I tried adding couple of rows - check the above screenshot where data is in BQ for tutorial_id 1 and 2 with submission_date as Date type. Can you share your table schema, pipeline logs? Also did you check if any error on bq side in Job and Query history tabs? – Sagar Kapare May 04 '21 at 21:42