I am using Google Cloud Platform's (GCP) Dataprep (DP) to move data into BigQuery (BQ) via AVRO files. I am taking the data straight from a CSV file to an AVRO file using one DP recipe with NO transformations. In DP, the type of my column CreatedDate is date/time, as seen in this picture:
NOTE: The year (in the format YYYY-) is painted out.
When I publish the data into an AVRO file using these settings:
The resulting AVRO schema looks like this in plain text:
{"name":"CreatedDate","type":["null","string"],"default":null}
When the file is imported into BQ, CreatedDate is also a column of type string.
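To make the schema difference concrete, here is a minimal Python sketch that parses the field definition shown above and compares it with what a field carrying an Avro logical type might look like. The second field (`hypothetical_field`, using `timestamp-micros`) and the helper `describe` are assumptions for illustration only; DP did not produce that schema, and whether DP can emit it is exactly what is unclear.

```python
import json

# Field definition exactly as it appears in the AVRO file published by DP.
published_field = json.loads(
    '{"name":"CreatedDate","type":["null","string"],"default":null}'
)

# Hypothetical alternative, for comparison only (NOT produced by DP):
# a nullable long annotated with the Avro logical type "timestamp-micros".
hypothetical_field = {
    "name": "CreatedDate",
    "type": ["null", {"type": "long", "logicalType": "timestamp-micros"}],
    "default": None,
}


def describe(field):
    """Return (nullable, base_type, logical_type) for an Avro field definition."""
    declared = field["type"]
    branches = declared if isinstance(declared, list) else [declared]
    nullable = "null" in branches
    # Take the first non-null branch of the union as the "real" type.
    branch = [b for b in branches if b != "null"][0]
    if isinstance(branch, dict):
        return nullable, branch["type"], branch.get("logicalType")
    return nullable, branch, None


print(describe(published_field))     # (True, 'string', None)
print(describe(hypothetical_field))  # (True, 'long', 'timestamp-micros')
```

The published field is a plain nullable string with no `logicalType` annotation, which would be consistent with the column arriving in BQ as a string.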
However, if I publish the data straight to BQ using the Replace-BigQuery publish option in DP, the CreatedDate column will be of type DATETIME and be NULLABLE, which is exactly what I want.
I looked around and could not find any known issues with publishing from DP to an AVRO file that would turn datetime fields into string fields.
Did I miss anything?
Does AVRO not support datetime at all, or just not datetime in this format, the way BQ does?
Yes, I need to have DP publish to an AVRO file. I only did the direct publish to BQ as a test; I cannot do this for the long term.
Any other suggestions/help would be wonderful!