1

enter image description here

I have a situation where I am getting dates in two separate formats, MM/dd/yyyy & yyyy-dd-MM, AND there might be even more different formats as well in csv which will be obviously in string.

Below are the data which currently come as String from CSV-

1/14/2022 0:00
2021-12-31 00:00:00

I am using a Dataflow task in ADF to load the data into Azure SQL where the default format it uses should be yyyy-MM-dd HH:mm:ss.

how can I do this?

rbrayb
  • 46,440
  • 34
  • 114
  • 174
  • Images of data and such are pretty much worthless. [Why?](https://meta.stackoverflow.com/questions/285551/why-should-i-not-upload-images-of-code-data-errors-when-asking-a-question#:~:text=Images%20are%20harder%20to%20read,actual%20code%20and%20formatting%20it.) – Sean Lange Aug 10 '22 at 16:49
  • 1
    With a totally inconsistent format and placement within the string you are fighting a losing battle here. It is going to be nearly impossible to accurately identify and parse dates embedded inside a string, and adding the complexity of random formats pushes this way over the top. – Sean Lange Aug 10 '22 at 16:51
  • Updated the question with the date which comes as string from csv. So would u suggest telling client to have a constant date format in csv ? I did try changing the format in output of sink but it seems like one type of data is getting changed only – thesacredkiller Aug 10 '22 at 18:09
  • _So would u suggest telling client to have a constant date format in csv?_ Well, yes, that's a given. It's fairly obvious that `1/14/2022 0:00` is meant to be 14-Jan-2022, but what is `1/3/2022`? Is that 3-Jan-2022 or 1-Mar-2022? – AlwaysLearning Aug 10 '22 at 21:31
  • Can you please list all the possibilities of a date format that you can get in your data? – Sally Dabbah Aug 15 '22 at 05:30
  • @SallyDabbah i have already mentioned the two formats that are coming through the csv. Kindly take those combinations for now – thesacredkiller Aug 20 '22 at 03:31

1 Answers1

0

ok, i managed to build a quick demo.

Main idea of my solution:

you need to differentiate between valid rows and rows that needs to be modified. in order to do so, i used case condition. the idea is to add a derived column with a name 'Date' and modify only needed rows.

Input Data:

i created a csv file and saved my data as a dataset in ADF.

enter image description here

ADF:

enter image description here

In source, i select my dataset as an input.

in a derived column activity:

enter image description here

added a new derived column with a name 'Date' , value :

case(contains(split(Date,''),#item=='/'), toString(toTimestamp(Date,'MM/dd/yyyy H:mm'),'yyyy-MM-dd HH:mm:SS'), Date)

in toTimestamp method, i added first the dateFormat of my input Date and in toString the desired format that i want to cast the date to it.

Output: enter image description here

P.s You can cast all possible date formats that will appear in your data in that way.

you can read more about it here: https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expressions-usage#toTimestamp

Sally Dabbah
  • 411
  • 2
  • 8