I mostly use uniVocity as my CSV parser, and it's an excellent parser. I have hit a problem with the rows below. The file always has exactly 7 columns. The problem is the Client column: the client name can contain commas. The column right after it, Type, is generally S or P.

Below is the test data:

Date,Code,Company,Client,Type,Quantity,Price
03/03/2014,500103,BHEL,PoI THROUGH DoI, Affairs,S,114100000,165.55
21/04/2017,533309,DALMI,KKR MAURITIUS CEMENT, LTD.,S,106020,2050.00
21/04/2017,533309,DALMI,KKR MAURITIUS CEMENT, LTD.,P,141740,2050.00

The data above has a problem with the client name, because the name itself contains a comma and is not enclosed in quotes. The client names should be:

PoI THROUGH DoI, Affairs
KKR MAURITIUS CEMENT, LTD.
KKR MAURITIUS CEMENT, LTD.

Could you please let me know how to handle this?

Thanks

bobby.dreamer

1 Answer

You can't really do a lot here if the data doesn't come enclosed in quotes. All you can realistically do is check the row length: if it is greater than 7, you know that the extra columns are part of the client name. This only works as long as the client name is the only column that can contain unquoted commas.

Here is my solution:

for (String[] row : rows) {
    if (row.length > 7) {
        int extraColumns = row.length - 7; //we have extra columns
        String[] fixed = new String[7]; // let's create a row in the correct format

        //copies all data, merging the extra columns into the name
        for (int i = 0, j = 0; i < row.length; i++, j++) {
            fixed[j] = row[i]; //keep copying values, until we reach the name

            if (i == 3) { //hit first column with a name in it
                for (int k = i + 1; k <= i + extraColumns; k++) { //append comma and the value that follows the name
                    fixed[j] += ", " + row[k];
                }

                i += extraColumns; //increase variable i and keep assigning values after it to position j
            }
        }
        row = fixed; //replaces the original broken row
    }

    //prints the resulting row, values in square brackets for clarity.
    for (String element : row) {
        System.out.print('[' + element + ']' + ",");
    }
    System.out.println();
}

This produces the output:

[Date],[Code],[Company],[Client],[Type],[Quantity],[Price],
[03/03/2014],[500103],[BHEL],[PoI THROUGH DoI, Affairs],[S],[114100000],[165.55],
[21/04/2017],[533309],[DALMI],[KKR MAURITIUS CEMENT, LTD.],[S],[106020],[2050.00],
[21/04/2017],[533309],[DALMI],[KKR MAURITIUS CEMENT, LTD.],[P],[141740],[2050.00],
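
If you need this in more than one place, the same idea can be wrapped in a small helper. The sketch below is only an illustration: `ClientNameFixer` and `fixRow` are hypothetical names, not part of uniVocity, and `String.split(",")` stands in for the parser so the example runs on its own. It assumes no value in the row is null and that the name column is the only one with stray commas.

```java
import java.util.Arrays;

// Hypothetical helper, not part of uniVocity.
class ClientNameFixer {

    // Merges the surplus columns that follow nameIndex back into the name column.
    // Assumes no element of row is null.
    static String[] fixRow(String[] row, int expectedColumns, int nameIndex) {
        int extra = row.length - expectedColumns;
        if (extra <= 0) {
            return row; // row already has the expected shape, nothing to merge
        }

        String[] fixed = new String[expectedColumns];

        // columns before the name are copied as they are
        System.arraycopy(row, 0, fixed, 0, nameIndex);

        // the name plus the pieces that were split off, joined with ", "
        StringBuilder name = new StringBuilder(row[nameIndex].trim());
        for (int k = nameIndex + 1; k <= nameIndex + extra; k++) {
            name.append(", ").append(row[k].trim());
        }
        fixed[nameIndex] = name.toString();

        // columns after the name shift left by the number of extra columns
        System.arraycopy(row, nameIndex + extra + 1, fixed, nameIndex + 1,
                expectedColumns - nameIndex - 1);
        return fixed;
    }

    public static void main(String[] args) {
        String line = "21/04/2017,533309,DALMI,KKR MAURITIUS CEMENT, LTD.,S,106020,2050.00";
        String[] row = line.split(","); // stand-in for a row returned by the parser
        System.out.println(Arrays.toString(fixRow(row, 7, 3)));
    }
}
```

Passing the expected column count and the name index as parameters keeps the 7 and the 3 out of the loop, so the helper can be reused if the layout changes.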

Hope it helps.

Jeronimo Backes