I am using univocity to parse a large (6 GB) CSV file in Java. The CSV entries are shown below, and I can already parse the file with this code. Any idea how to generate the output shown at the end?

    CsvParserSettings settings = new CsvParserSettings();
    settings.getFormat().setLineSeparator("\n");

    CsvParser parser = new CsvParser(settings);

    File f = new File("test.csv");
    parser.beginParsing(f, "UTF-8");

    String[] row;

    while ((row = parser.parseNext()) != null) {
        String val = Arrays.toString(row);

        val = val.replaceAll("\\[", "");
        val = val.replaceAll("\\]", "");
        val = val.replaceAll("\\s", "");

        System.out.println(val);
    } // end while

test.csv content:

    A,10,2,3
    null,11,A1,null
    null,30,A23,null
    null,44,A34,null
    null,16,A67,null
    A,20,5,6
    null,41,A100,null
    null,60,A56,null
    null,74,A34,null
    null,86,A56,null

I am trying to get output like this:

    A,[10;11;30;44;16],[2,A1,A23,A34,A67],3
    A,[20;41;60;74;86],[5,A100,A56,A34,A56],6

1 Answer

Each line of the expected output depends on several input rows, so the cell values have to be collected in intermediate variables until the next group starts (a row whose first cell is not null). The code can be written as follows:

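    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.ArrayList;

    // Read line by line so the whole file never has to fit in memory.
    try (BufferedReader csv = new BufferedReader(new FileReader("test.csv"))) {
        String line;
        ArrayList<String> ar1 = new ArrayList<String>(); // second-column values of the current group
        ArrayList<String> ar2 = new ArrayList<String>(); // third-column values of the current group
        String s1 = null, s2 = null; // first and last cell of the group's first row

        while ((line = csv.readLine()) != null) {
            String[] lineSplit = line.split(",");
            if (lineSplit.length >= 4) { // skip blank or short lines
                if (!lineSplit[0].equals("null")) {
                    // A non-"null" first cell starts a new group: print the finished one.
                    if (!ar1.isEmpty()) {
                        System.out.println(s1 + "," + ar1.toString().replaceAll(", ", ";")
                                + "," + ar2.toString().replaceAll(", ", ",") + "," + s2);
                    }
                    s1 = lineSplit[0];
                    s2 = lineSplit[3];
                    ar1 = new ArrayList<String>();
                    ar2 = new ArrayList<String>();
                }
                ar1.add(lineSplit[1]);
                ar2.add(lineSplit[2]);
            }
        }

        // Print the last group, which has no following row to trigger it.
        if (!ar1.isEmpty()) {
            System.out.println(s1 + "," + ar1.toString().replaceAll(", ", ";")
                    + "," + ar2.toString().replaceAll(", ", ",") + "," + s2);
        }
    }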
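
A possible variation on the same idea, as a rough sketch: the grouping is moved into a method that takes an iterator of lines and hands each finished group to a callback, which makes it easy to try out on a few in-memory lines before pointing it at the 6 GB file. The class name `RowGrouper` and the methods `group` and `format` are hypothetical names chosen for illustration. It assumes plain comma-separated cells with no quoting and at least four cells per row.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper class; the names are illustrative only.
class RowGrouper {

    // Consumes lines one at a time and passes each finished group to `sink`,
    // so only the current group is held in memory.
    static void group(Iterator<String> lines, Consumer<String> sink) {
        String first = null, last = null;
        List<String> col1 = new ArrayList<>();
        List<String> col2 = new ArrayList<>();

        while (lines.hasNext()) {
            String[] cells = lines.next().split(",");
            if (cells.length < 4) {
                continue; // skip blank or short lines
            }
            if (!cells[0].equals("null")) {
                // A non-"null" first cell starts a new group: flush the old one.
                if (!col1.isEmpty()) {
                    sink.accept(format(first, col1, col2, last));
                }
                first = cells[0];
                last = cells[3];
                col1.clear();
                col2.clear();
            }
            col1.add(cells[1]);
            col2.add(cells[2]);
        }
        // Flush the final group.
        if (!col1.isEmpty()) {
            sink.accept(format(first, col1, col2, last));
        }
    }

    // Matches the format in the question: ';' in the first list, ',' in the second.
    static String format(String first, List<String> col1, List<String> col2, String last) {
        return first + ",[" + String.join(";", col1) + "],[" + String.join(",", col2) + "]," + last;
    }

    public static void main(String[] args) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(Paths.get("test.csv"), StandardCharsets.UTF_8)) {
            group(in.lines().iterator(), System.out::println);
        }
    }
}
```

Because `group` only needs an `Iterator<String>`, the rows could presumably come from any line source, and the callback could write to a file instead of `System.out`.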
Nithin
  • I tried it and it worked perfectly. Great work, Nithin. Thanks! – user2419320 Jan 30 '18 at 15:54
  • Hi Nithin, can you please help with how to achieve the following? Input: A,B,C,D,E 101,a1,b1,c1,d1 101,a2,b2,c2,d2 101,a3,b3,c3,d3 102,a21,b21,c21,d21 102,a22,b22,c22,d22 102,a23,b23,c23,d23 The output needs to look like: 101,[a1;a2;a3],[b1;b2;b3],[c1;c2;c3],[d1;d2;d3] 102,[a21;a22;a23],[b21;b22;b23],[c21;c22;c23],[d21;d22;d23] – user2419320 Jan 31 '18 at 08:07
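
For the follow-up in the last comment, one possible approach is sketched below. It keeps one list per column and starts a new group whenever the first cell changes. The class name `KeyGrouper` and its methods are hypothetical. The sketch assumes the rows are already ordered by key, that the first line is a header to skip, and that cells contain no quoted commas.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper class; names are illustrative and not from the thread.
class KeyGrouper {

    // Groups consecutive rows that share the same first cell and joins the
    // values of every other column with ';'.
    static void groupByKey(Iterator<String> lines, boolean skipHeader, Consumer<String> sink) {
        if (skipHeader && lines.hasNext()) {
            lines.next(); // drop the header row
        }
        String key = null;
        List<List<String>> columns = new ArrayList<>();

        while (lines.hasNext()) {
            String[] cells = lines.next().split(",");
            if (cells.length < 2) {
                continue; // skip blank lines
            }
            if (!cells[0].equals(key)) {
                // The key changed: emit the finished group and start a new one.
                if (key != null) {
                    sink.accept(format(key, columns));
                }
                key = cells[0];
                columns = new ArrayList<>();
                for (int i = 1; i < cells.length; i++) {
                    columns.add(new ArrayList<>());
                }
            }
            // Extra cells beyond the group's column count are ignored.
            for (int i = 1; i < cells.length && i <= columns.size(); i++) {
                columns.get(i - 1).add(cells[i]);
            }
        }
        if (key != null) {
            sink.accept(format(key, columns));
        }
    }

    static String format(String key, List<List<String>> columns) {
        StringBuilder sb = new StringBuilder(key);
        for (List<String> column : columns) {
            sb.append(",[").append(String.join(";", column)).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "A,B,C,D,E",
                "101,a1,b1,c1,d1", "101,a2,b2,c2,d2", "101,a3,b3,c3,d3",
                "102,a21,b21,c21,d21", "102,a22,b22,c22,d22", "102,a23,b23,c23,d23");
        groupByKey(input.iterator(), true, System.out::println);
    }
}
```

If rows with the same key are not adjacent in the file, this would emit several partial groups for that key, so the input would need to be sorted first.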