
I am exporting D365 F&O data to ADLS in CSV format. Now I am trying to read the CSV stored in ADLS and copy it into an Azure Synapse dedicated SQL pool table using Azure Data Factory. The pipeline works for a few tables without any issue, but it fails for one table (SalesLine) because of a mismatch in the number of columns. Below is a sample of the CSV format. There are no column names (headers) in the CSV because it is exported from the F&O system; the column names are stored in the salesline.CDM.json file.

5653064010,,,"2022-06-03T20:07:38.7122447Z",5653064010,"B775-92"

5653064011,,,"2022-06-03T20:07:38.7122447Z",5653064011,"Small Parcel"

5653064012,,,"2022-06-03T20:07:38.7122447Z",5653064012,"somedata"

5653064013,,,"2022-06-03T20:07:38.7122447Z",5653064013,"someotherdata",,,,test1, test2

5653064014,,,"2022-06-03T20:07:38.7122447Z",5653064014,"parcel"

5653064016,,,"2022-06-03T20:07:38.7122447Z",5653064016,"B775-92",,,,,,test3
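The mismatch is easy to confirm by counting fields per row with Python's `csv` module (a standalone sketch using three of the sample rows above):

```python
import csv
import io

# Three of the sample rows from the headerless SALESLINE export
sample = (
    '5653064010,,,"2022-06-03T20:07:38.7122447Z",5653064010,"B775-92"\n'
    '5653064013,,,"2022-06-03T20:07:38.7122447Z",5653064013,"someotherdata",,,,test1, test2\n'
    '5653064016,,,"2022-06-03T20:07:38.7122447Z",5653064016,"B775-92",,,,,,test3\n'
)

# The field counts differ between rows: 6, 11, and 12
for i, row in enumerate(csv.reader(io.StringIO(sample)), start=1):
    print(i, len(row))
```

The copy activity imports the schema from the first row (6 columns), so any later row with more fields triggers `DelimitedTextMoreColumnsThanDefined`.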

I have created an ADF pipeline using a Copy data activity to copy the data from ADLS (CSV) to the Synapse SQL table; however, I am getting the below error.

Operation on target Copy_hs1 failed: ErrorCode=DelimitedTextMoreColumnsThanDefined,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error found when processing 'Csv/Tsv Format Text' source 'SALESLINE_00001.csv' with row number 4: found more columns than expected column count 6.,Source=Microsoft.DataTransfer.Common,'

The column mapping looks like below. Because the first row of the CSV has only 6 columns, only 6 columns appear when importing the schema.

[screenshot: column mapping]

Ajay
  • Hi, can you share the mappings of the copy activity and the structure/columns of your Synapse table? – NiharikaMoola-MT Jun 21 '22 at 12:41
  • @NiharikaMoola-MT - Thanks for the reply. Column mapping updated. – Ajay Jun 21 '22 at 13:04
  • I see rows 4 and 6 in your sample data have more columns than the other rows. Do you want to exclude those additional columns? – NiharikaMoola-MT Jun 21 '22 at 13:16
  • @NiharikaMoola-MT - Yes, you are correct. But because the first row has only 6 columns, only 6 columns are imported when I import the source schema. This is small sample data, so here I could add more columns to the first row and resolve it. But with a CSV of around 300 columns that is difficult to identify, and there is a chance that columns will again be added mid-file in the future, and the pipeline will fail again. The probability of columns being added is high because the data is exported from the Finance and Operations system. – Ajay Jun 21 '22 at 13:25
  • So I am looking for an option so that if there is any mismatch in the number of columns, my pipeline still works with the existing mapping without failing. – Ajay Jun 21 '22 at 13:27
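One workaround outside of the copy activity itself is a pre-processing step that pads (or truncates) every row to the full column count declared in salesline.CDM.json, so the mapping never sees a ragged row. A minimal sketch; `expected_columns` and the sample data are placeholders:

```python
def normalize_rows(rows, expected_columns):
    """Pad short rows with empty strings and truncate long rows so
    every row has exactly expected_columns fields.
    Note: truncation silently drops data in the extra columns."""
    for row in rows:
        padded = row + [""] * (expected_columns - len(row))
        yield padded[:expected_columns]

# Placeholder data: one 2-field row and one 13-field row, target width 12
ragged = [["a", "b"], list("abcdefghijklm")]
fixed = list(normalize_rows(ragged, 12))
print([len(r) for r in fixed])  # [12, 12]
```

In practice `expected_columns` would come from the CDM manifest, so newly added columns are picked up without editing the mapping.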

3 Answers

Score: 1

I have repro’d with your sample data and got the same error while copying the file using the copy data activity.

Alternatively, I tried copying the file using a data flow and was able to load the data without any errors.

Source file:

[screenshot: source file]

Data flow:

  1. Source dataset: only the first 6 columns are read, as the first row of the file contains only 6 columns.

[screenshot: source dataset]

  2. Source transformation: connect the source dataset to the source transformation.

[screenshot: source transformation]

Source preview:

[screenshot: source data preview]

  3. Sink transformation: connect the sink to the Synapse dataset.

[screenshot: sink transformation]

Settings:

[screenshot: sink settings]

Mappings:

[screenshot: mappings]

Sink output:

[screenshot: sink output]

  4. After running the data flow, the data is loaded into the sink Synapse table.

[screenshot: Synapse table data]
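The data-flow behaviour above (read with a fixed 6-column schema and ignore the extra trailing fields) can be approximated in plain Python; the column names here are placeholders, not the real ones from salesline.CDM.json:

```python
import csv
import io

# Placeholder column names standing in for the salesline.CDM.json schema
COLUMNS = ["RecId", "Field2", "Field3", "ModifiedDateTime", "SalesId", "Description"]

data = (
    '5653064010,,,"2022-06-03T20:07:38.7122447Z",5653064010,"B775-92"\n'
    '5653064013,,,"2022-06-03T20:07:38.7122447Z",5653064013,"someotherdata",,,,test1, test2\n'
)

records = [
    # zip stops at the shorter sequence, so extra trailing fields are ignored
    dict(zip(COLUMNS, row))
    for row in csv.reader(io.StringIO(data))
]
print(records[1]["Description"])  # someotherdata
```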

NiharikaMoola-MT
  • Correct, I have started using the Data Flow... But I am not sure, if in the future I need to map column #8, which is not present in the first row, how I will map it. – Ajay Jun 23 '22 at 10:17
  • If you do not have column headers, then you have to add the columns manually by editing the mapping in the sink, as auto-detection will not match the column names with the sink column names. – NiharikaMoola-MT Jun 23 '22 at 10:20
Score: 0

Changing my CSV to XLSX helped me solve this problem in the ADF Copy activity.

Score: -1

  1. From the Copy data activity settings, set "Fault tolerance" to "Skip incompatible rows".
  2. From the dataset connection settings, set the escape character to double quote (").
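The effect of "Skip incompatible rows" can be sketched in Python as a filter on the field count: rows that do not match the expected schema width are dropped instead of failing the copy (note the skipped data is lost):

```python
def split_compatible(rows, expected_columns):
    """Separate rows matching the expected column count from
    incompatible ones, which fault tolerance would skip."""
    good, skipped = [], []
    for row in rows:
        (good if len(row) == expected_columns else skipped).append(row)
    return good, skipped

rows = [["1", "x"], ["2", "y", "extra"], ["3", "z"]]
good, skipped = split_compatible(rows, 2)
print(len(good), len(skipped))  # 2 1
```

For the SalesLine case above, this would silently discard the rows with extra trailing columns, which may or may not be acceptable.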