I am using Apache Commons CSV lib to write CSV files.

The sample provided to me had a strange pattern.

Sample output expected:

  • Name with designation,Phone,Action,Date
  • "John Doe,Officer", ,Under investigation,8-Jun-2017
  • Jack,+123-4567,False Allegation ,4-Jun-2017

As can be seen, the Name with designation column can contain values with commas, so those values need to be quoted. However, other values contain spaces: the Phone column can hold nothing but a single space, and values in the Action column can contain spaces in the middle or at the end.

When I write the CSV using the Apache Commons CSV library, I use the following CSVFormat with the CSVPrinter class:

    CSVFormat.EXCEL.withQuoteMode(QuoteMode.MINIMAL);

This configuration gave the output closest to the sample. However, values that consist of a single space or that end with a trailing space also get quoted, even when they contain no comma.

My Output:

  • Name with designation,Phone,Action,Date
  • "John Doe,Officer"," ",Under investigation,8-Jun-2017
  • Jack,+123-4567,"False Allegation ",4-Jun-2017

What I need is this: when a value is only a space or ends with a space, and it does not contain a comma, it should not be quoted.

Is there any configuration in Apache Commons that I am missing? Or is there any other CSV library with a format that gives this output?

n0ahz

2 Answers

univocity-parsers does what you want. Try this code:

    import com.univocity.parsers.csv.Csv;
    import com.univocity.parsers.csv.CsvWriter;
    import com.univocity.parsers.csv.CsvWriterSettings;
    import java.io.StringWriter;

    // Start from the Excel-compatible writer defaults
    CsvWriterSettings settings = Csv.writeExcel();
    settings.trimValues(false); // values are trimmed by default
    settings.setHeaders("Name with designation","Phone","Action","Date");
    settings.setHeaderWritingEnabled(true);

    StringWriter output = new StringWriter();
    CsvWriter writer = new CsvWriter(output, settings);

    writer.writeRow("John Doe,Officer"," ","Under investigation","8-Jun-2017");
    writer.writeRow("Jack","+123-4567","False Allegation ","4-Jun-2017");

    writer.close();

    System.out.println(output);

The output will be:

Name with designation,Phone,Action,Date
"John Doe,Officer", ,Under investigation,8-Jun-2017
Jack,+123-4567,False Allegation ,4-Jun-2017

Hope it helps.

Disclaimer: I'm the author of this library. It's open-source and free (Apache 2.0 license)

Jeronimo Backes
  • Cool lib! Will give it a try too :)....I ended up using apache-commons after posting this question which also served the purpose. – n0ahz Jun 16 '18 at 12:46
  • ** ended up using super-csv – n0ahz Jun 16 '18 at 12:54
  • Thanks for trying. If performance ever becomes a concern then have a look at it again. Here's a comparison among a lot of different parsers: https://github.com/uniVocity/csv-parsers-comparison – Jeronimo Backes Jun 16 '18 at 13:40

You can use super-csv, which follows RFC 4180:

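    import org.supercsv.io.CsvListWriter;
    import org.supercsv.prefs.CsvPreference;
    import java.io.PrintWriter;
    import java.util.Arrays;

    CsvListWriter c = new CsvListWriter(new PrintWriter(System.out), CsvPreference.STANDARD_PREFERENCE);
    // Arrays.asList is used here so the example needs no Guava dependency
    c.write(Arrays.asList(" Aa", " ", " John \"Doe\"", "Comma,", "test "));
    c.flush();
    c.close();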

Writes:

 Aa, ," John ""Doe""","Comma,",test 
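
For readers who only want the quoting rule itself, here is a minimal sketch in plain Java, with no library involved. It assumes the rule is "quote a field only if it contains a comma, a double quote, or a line break", which is how RFC 4180 minimal quoting is usually read. The class and method names (`MinimalQuoting`, `escape`, `formatRow`) are hypothetical and are not part of super-csv or any other library.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper illustrating "quote only when necessary" CSV output. */
final class MinimalQuoting {

    /** Quotes a field only if it contains a comma, a double quote, CR or LF. */
    static String escape(String field) {
        boolean needsQuotes = field.indexOf(',') >= 0
                || field.indexOf('"') >= 0
                || field.indexOf('\r') >= 0
                || field.indexOf('\n') >= 0;
        if (!needsQuotes) {
            return field; // leading and trailing spaces are left untouched
        }
        // Embedded quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Joins the escaped fields of one record with commas (no line terminator). */
    static String formatRow(List<String> fields) {
        return fields.stream()
                .map(MinimalQuoting::escape)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // The two data rows from the question
        System.out.println(formatRow(List.of(
                "John Doe,Officer", " ", "Under investigation", "8-Jun-2017")));
        System.out.println(formatRow(List.of(
                "Jack", "+123-4567", "False Allegation ", "4-Jun-2017")));
    }
}
```

With this rule, only `John Doe,Officer` is quoted in the question's rows; the single-space phone value and the trailing space in `False Allegation ` are written as-is. A real project would likely still prefer a maintained library for edge cases such as encodings and custom delimiters.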

Maven dependency:

<!-- https://mvnrepository.com/artifact/net.sf.supercsv/super-csv -->
<dependency>
    <groupId>net.sf.supercsv</groupId>
    <artifactId>super-csv</artifactId>
    <version>2.4.0</version>
</dependency>
Rob Audenaerde
  • That's the lib I ended up using after posting this question. Although apache-commons is easier to use with its wrapper calls, super-csv did the job too. I found out there is code inside apache-commons that puts quotes around a value whenever the character code of its last character is <= 32...odd! Not sure if this is part of any CSV standard. – n0ahz Jun 16 '18 at 12:42
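
The check described in the last comment can be illustrated with a small plain-Java sketch. This is a simplified reading of that comment only, not the Apache Commons CSV source: the class `TrailingSpaceRule` and its method `needsQuotes` are hypothetical, and the real library may apply further checks that are not modeled here.

```java
/** Hypothetical illustration of the "last character code <= 32" rule from the comment. */
final class TrailingSpaceRule {

    /**
     * Returns true when a value would be quoted under the simplified rule:
     * it contains a comma, a double quote or a line break, OR its last
     * character has a code of 32 (space) or lower.
     */
    static boolean needsQuotes(String value) {
        if (value.isEmpty()) {
            return false; // empty values are not modeled in this sketch
        }
        boolean hasSpecial = value.indexOf(',') >= 0
                || value.indexOf('"') >= 0
                || value.indexOf('\r') >= 0
                || value.indexOf('\n') >= 0;
        char last = value.charAt(value.length() - 1);
        return hasSpecial || last <= ' '; // ' ' is character code 32
    }

    public static void main(String[] args) {
        String[] samples = {"John Doe,Officer", " ", "False Allegation ", "Jack"};
        for (String s : samples) {
            System.out.println("[" + s + "] quoted: " + needsQuotes(s));
        }
    }
}
```

Under this rule the single-space phone value and `False Allegation ` are both quoted even though neither contains a comma, which is consistent with the "My Output" rows in the question.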