1

I'm trying to use Azure Data Factory to upsert a CSV into an Azure SQL table. All seemed well until I checked the results. One of the columns is a nullable date. The CSV contains a value like so 1/2/2020 12:00:00 AM. The data flow silently inserts a NULL instead of throwing an error because it didn't like the input. So how can I get my data flow to convert the string to a datetime properly, and then to error out on issues like this in the future? I really don't want silent failures and bad data.

John
  • 2,653
  • 4
  • 36
  • 57

1 Answers1

2

The null value is due to incompatible date formats in ADF. You need to do date format conversion.

Is your source date format like this MM/dd/yyyy HH:mm:ss?
If so, you can use Derived column and add the expression toString(toTimestamp(<Your_Column_Name>,'MM/dd/yyyy HH:mm:ss'),'yyyy-MM-dd HH:mm:SS') to format this column to String. It solved the NULL value. Of course you can choose what the date format you want.

I made a test as follows:

  1. My data source is from a csv file and the EmpDate is a date type like yours and last row contains a null value. enter image description here

  2. Then I add the expression toString(toTimestamp(EmpDate,'MM/dd/yyyy HH:mm:ss'),'yyyy-MM-dd HH:mm:SS') in the Derived column activity. Here you can choose the date format what you want. enter image description here

3.According to Mark Kromer's suggestion, I Add Conditional Split directly after the Derived Column and check for isNull(EmpDate). Here I use not(isNull(EmpDate)) expression. enter image description here

  1. In the end, if the EmpDate contains null value, it will go to sink2 else go to sink1. enter image description here The row contains null value: enter image description here
Joseph Xu
  • 5,607
  • 2
  • 5
  • 15
  • Adding on to Joseph's answer ... if you don't want your data flow to sink rows that fail this casting, then use a Conditional Split directly after the Derived Column and check for isNull(EmpDate). If null, then write to an error log. Else, continue on normal path. – Mark Kromer MSFT Aug 28 '20 at 03:36
  • @MarkKromer thanks! But is there a way to set it as a default that if a column fails to convert that the whole dataflow fails? – John Aug 31 '20 at 15:58
  • Default is continue on casting failure, set to NULL. To set your own custom behavior, you'll use this technique. – Mark Kromer MSFT Aug 31 '20 at 16:42
  • Thanks @Mark Kromer, according to his advice, I've updated my answer. – Joseph Xu Sep 01 '20 at 02:56
  • Hope my answer is helpful for you @John . – Joseph Xu Sep 01 '20 at 02:56
  • @JosephXu Thank you for your well-researched answer. Actually I found a workaround. I didn't realize that Azure SQL supports the BulkCopy command through ADO.NET. I thought it didn't in the past. I have existing code that BulkCopys' to a temp table and then merges the rows 4000 at a time so as not to lock the table. Thank you, though! – John Sep 29 '20 at 13:40