
I am exporting D365 F&O data to ADLS in CSV format. Now I am trying to read the CSV stored in ADLS and copy it into an Azure Synapse dedicated SQL pool table using Azure Data Factory. The pipeline works for a few tables without any issue, but it fails for one table (SalesLine) because of a mismatch in the number of columns. Below is a sample of the CSV format. There are no column names (headers) in the CSV because it is exported from the F&O system; the column names are stored in the salesline.CDM.json file.

5653064010,,,"2022-06-03T20:07:38.7122447Z",5653064010,"B775-92"

5653064011,,,"2022-06-03T20:07:38.7122447Z",5653064011,"Small Parcel"

5653064012,,,"2022-06-03T20:07:38.7122447Z",5653064012,"somedata"

5653064013,,,"2022-06-03T20:07:38.7122447Z",5653064013,"someotherdata",,,,test1, test2

5653064014,,,"2022-06-03T20:07:38.7122447Z",5653064014,"parcel"

5653064016,,,"2022-06-03T20:07:38.7122447Z",5653064016,"B775-92",,,,,,test3
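The mismatch is easy to confirm by counting fields per row with Python's `csv` module (a standalone sketch using three of the sample rows above):

```python
import csv
import io

# Three of the sample rows from the headerless SALESLINE export
sample = (
    '5653064010,,,"2022-06-03T20:07:38.7122447Z",5653064010,"B775-92"\n'
    '5653064013,,,"2022-06-03T20:07:38.7122447Z",5653064013,"someotherdata",,,,test1, test2\n'
    '5653064016,,,"2022-06-03T20:07:38.7122447Z",5653064016,"B775-92",,,,,,test3\n'
)

# The field counts differ between rows: 6, 11, and 12
for i, row in enumerate(csv.reader(io.StringIO(sample)), start=1):
    print(i, len(row))
```

The copy activity imports the schema from the first row (6 columns), so any later row with more fields triggers `DelimitedTextMoreColumnsThanDefined`.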

I have created an ADF pipeline using a Copy data activity to copy the data from ADLS (CSV) to the Synapse SQL table; however, I am getting the below error.

Operation on target Copy_hs1 failed: ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source 'SALESLINE_00001.csv' with row number 4: found more columns than expected column count 6.,Source=Microsoft.DataTransfer.Common,'

The column mapping looks like below. Because the first row of the CSV has only 6 columns, only 6 columns appear when importing the schema.

[screenshot: column mapping]

Ajay
  • Hi, can you share the mappings of the copy activity and the structure/columns of your Synapse table? – NiharikaMoola-MT Jun 21 '22 at 12:41
  • @NiharikaMoola-MT - Thanks for the reply. Column mapping updated. – Ajay Jun 21 '22 at 13:04
  • I see rows 4 and 6 in your sample data have more columns than the other rows. Do you want to exclude those additional columns? – NiharikaMoola-MT Jun 21 '22 at 13:16
  • @NiharikaMoola-MT - Yes, you are correct. But because the first row has only 6 columns, only 6 columns are imported when I import the source schema. This is small sample data, so here I could add more columns to the first row and resolve it. But with a CSV of around 300 columns that is difficult to identify, and there is a chance that columns will again be added mid-file in the future, and the pipeline will fail again. The probability of columns being added is high because the data is exported from the Finance and Operations system. – Ajay Jun 21 '22 at 13:25
  • So I am looking for an option so that if there is any mismatch in the number of columns, my pipeline still works with the existing mapping without failing. – Ajay Jun 21 '22 at 13:27
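One workaround outside of the copy activity itself is a pre-processing step that pads (or truncates) every row to the full column count declared in salesline.CDM.json, so the mapping never sees a ragged row. A minimal sketch; `expected_columns` and the sample data are placeholders:

```python
def normalize_rows(rows, expected_columns):
    """Pad short rows with empty strings and truncate long rows so
    every row has exactly expected_columns fields.
    Note: truncation silently drops data in the extra columns."""
    for row in rows:
        padded = row + [""] * (expected_columns - len(row))
        yield padded[:expected_columns]

# Placeholder data: one 2-field row and one 13-field row, target width 12
ragged = [["a", "b"], list("abcdefghijklm")]
fixed = list(normalize_rows(ragged, 12))
print([len(r) for r in fixed])  # [12, 12]
```

In practice `expected_columns` would come from the CDM manifest, so newly added columns are picked up without editing the mapping.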

3 Answers

Score: 1

I have repro’d with your sample data and got the same error while copying the file using the copy data activity.

Alternatively, I tried copying the file using a data flow and was able to load the data without any errors.

Source file:

[screenshot: source file]

Data flow:

  1. Source dataset: only the first 6 columns are read, as the first row of the file contains only 6 columns.

[screenshot: source dataset]

  2. Source transformation: connect the source dataset to the source transformation.

[screenshot: source transformation]

Source preview:

[screenshot: source data preview]

  3. Sink transformation: connect the sink to the Synapse dataset.

[screenshot: sink transformation]

Settings:

[screenshot: sink settings]

Mappings:

[screenshot: mappings]

Sink output:

[screenshot: sink output]

  4. After running the data flow, the data is loaded into the sink Synapse table.

[screenshot: Synapse table data]
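The data-flow behaviour above (read with a fixed 6-column schema and ignore the extra trailing fields) can be approximated in plain Python; the column names here are placeholders, not the real ones from salesline.CDM.json:

```python
import csv
import io

# Placeholder column names standing in for the salesline.CDM.json schema
COLUMNS = ["RecId", "Field2", "Field3", "ModifiedDateTime", "SalesId", "Description"]

data = (
    '5653064010,,,"2022-06-03T20:07:38.7122447Z",5653064010,"B775-92"\n'
    '5653064013,,,"2022-06-03T20:07:38.7122447Z",5653064013,"someotherdata",,,,test1, test2\n'
)

records = [
    # zip stops at the shorter sequence, so extra trailing fields are ignored
    dict(zip(COLUMNS, row))
    for row in csv.reader(io.StringIO(data))
]
print(records[1]["Description"])  # someotherdata
```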

NiharikaMoola-MT
  • Correct, I have started using the Data Flow... But I am not sure, if in the future I need to map column #8, which is not present in the first row, how I will map it. – Ajay Jun 23 '22 at 10:17
  • If you do not have column headers, then you have to add the columns manually by editing the mapping in the sink, as auto-detection will not match the column names with the sink column names. – NiharikaMoola-MT Jun 23 '22 at 10:20
Score: 0

Changing my CSV to XLSX helped me solve this problem in the ADF Copy activity.

Score: -1

  1. From the Copy data activity settings, set "Fault tolerance" to "Skip incompatible rows".
  2. From the dataset connection settings, set the escape character to double quote (").
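The effect of "Skip incompatible rows" can be sketched in Python as a filter on the field count: rows that do not match the expected schema width are dropped instead of failing the copy (note the skipped data is lost):

```python
def split_compatible(rows, expected_columns):
    """Separate rows matching the expected column count from
    incompatible ones, which fault tolerance would skip."""
    good, skipped = [], []
    for row in rows:
        (good if len(row) == expected_columns else skipped).append(row)
    return good, skipped

rows = [["1", "x"], ["2", "y", "extra"], ["3", "z"]]
good, skipped = split_compatible(rows, 2)
print(len(good), len(skipped))  # 2 1
```

For the SalesLine case above, this would silently discard the rows with extra trailing columns, which may or may not be acceptable.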