
I am using Google Cloud Platform's (GCP) Dataprep (DP) to move data into BigQuery (BQ) via AVRO files. I am taking the data straight from a CSV file to an AVRO file using one DP recipe with NO transformations. In DP, the type of my column CreatedDate is date/time, as seen in this picture:

[Screenshot: the CreatedDate column in DP, shown with type date/time]

NOTE: The year (in the format YYYY-) is painted out.

When I publish the data to an AVRO file using these settings:

The resulting AVRO schema looks like this in plain text:

{"name":"CreatedDate","type":["null","string"],"default":null}

And when imported into BQ, it is also a column of type STRING.
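To make the check concrete, here is a minimal sketch of how one could test whether a field in the AVRO schema carries a logical type. The helper name `logical_type_of` is hypothetical, and the second field definition is only my assumption of what a timestamp-typed field could look like according to the Avro specification (`timestamp-micros` annotating a `long`); it is not something DP produced.

```python
import json


def logical_type_of(field_json: str):
    """Return the Avro logicalType of a field definition, or None if absent."""
    field = json.loads(field_json)
    types = field["type"]
    # A nullable field is a union (a JSON list); normalize to a list either way.
    if not isinstance(types, list):
        types = [types]
    for t in types:
        # Logical types only appear on complex (dict) type definitions.
        if isinstance(t, dict) and "logicalType" in t:
            return t["logicalType"]
    return None


# The field exactly as DP wrote it: a plain nullable string.
dp_field = '{"name":"CreatedDate","type":["null","string"],"default":null}'

# Hypothetical field (assumption): a nullable long annotated as timestamp-micros.
hypothetical_field = (
    '{"name":"CreatedDate",'
    '"type":["null",{"type":"long","logicalType":"timestamp-micros"}],'
    '"default":null}'
)

print(logical_type_of(dp_field))            # None
print(logical_type_of(hypothetical_field))  # timestamp-micros
```

The DP output has no `logicalType` annotation at all, so a reader of the file has nothing but a plain string to go on.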

However, if I publish the data straight to BQ using the Replace-BigQuery publish option in DP, the CreatedDate column will be of type DATETIME and NULLABLE, which is exactly what I want.

I looked around and could not find any known issues with publishing from DP to an AVRO file that would turn datetime fields into string fields.

Did I miss anything?

Does AVRO not support datetime at all, or just not datetime in this format the way BQ does?

Yes, I need to have DP publish to an AVRO file. I only did the direct publish to BQ as a test; I cannot do this for the long term.

Any other suggestions/help would be wonderful!

PeterH