5

I'm trying to learn the NIO.2 package in Java 7 and I stumbled upon the Files.readAllLines(Path p, Charset cs) method. I find it very useful, but I'm of the opinion that there should be a version without the cs parameter, something like:

public static List<String> readAllLines(String path) throws IOException {
    return readAllLines(Paths.get(path), Charset.defaultCharset());
}

I'm convinced that most of the time the method will be called with the default Charset anyway, so why not provide the shortcut? Is there anything I'm missing about charsets that would justify not having this method? I'm quite surprised, because Scala has this option:

Source.fromFile("fileName").getLines

so I don't see why Java shouldn't. Any views?

Chirlo
  • 3
    Perhaps they wanted to discourage using the default charset, or they wanted to minimise the number of methods added. – Peter Lawrey Oct 03 '12 at 09:05
  • 3
    Too bad the downvoter didn't comment why – Oliver Oct 03 '12 at 09:06
  • 2
    Assuming default character sets is what got the universe into character encoding hell to begin with. – Isaac Oct 03 '12 at 09:17
  • 1
    @OliverStutz, maybe he was one of the nio2 developers :) – Chirlo Oct 03 '12 at 09:18
  • News flash: [`readAllLines(Path path)`](https://docs.oracle.com/javase/8/docs/api/java/nio/file/Files.html#readAllLines-java.nio.file.Path-) was added in Java SE 8, and the assumed charset is always UTF-8. – Nayuki May 16 '16 at 15:56

2 Answers

14

[...] most of the time the method will be called with the default Charset anyway,

Not really. Most of the time it will be called with the charset that you expect the file to be encoded in. Typically these days it is UTF-8:

Files.readAllLines(Paths.get("fileName"), StandardCharsets.UTF_8)

Your application can be executed on several platforms and operating systems, each with a different default character encoding. You don't want your application to break just because of that.

I think it's a good choice that fixes wrong design decisions from the past. Many old Java methods use the default system encoding, causing applications to behave inconsistently, e.g. between Windows and Linux. Forcing you to choose the character encoding simply makes your application more portable and safer.
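To make the portability problem concrete, here is a minimal sketch of what happens when a file written as UTF-8 is read back with a different charset, which is what can happen when the platform default is something else. The class and method names (CharsetMismatchDemo, writeUtf8ThenRead) are made up for illustration, and ISO-8859-1 is only a stand-in for "some other default":

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical demo class, not part of any library.
final class CharsetMismatchDemo {

    // Writes the text to a temporary file as UTF-8 bytes,
    // then reads the file back using the given charset.
    static List<String> writeUtf8ThenRead(String text, Charset readCharset) throws IOException {
        Path file = Files.createTempFile("charset-demo", ".txt");
        try {
            Files.write(file, text.getBytes(StandardCharsets.UTF_8));
            return Files.readAllLines(file, readCharset);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "caf\u00e9"; // "café": the last character takes two bytes in UTF-8

        // Same charset for writing and reading: the text survives the round trip.
        System.out.println(text.equals(writeUtf8ThenRead(text, StandardCharsets.UTF_8).get(0)));

        // Different charset for reading: each of the two bytes is decoded
        // as a separate character, so the result is five characters long.
        String garbled = writeUtf8ThenRead(text, StandardCharsets.ISO_8859_1).get(0);
        System.out.println(text.equals(garbled) + ", length " + garbled.length());
    }
}
```

With Charset.defaultCharset() in place of the explicit argument, which of the two results you get would depend on the machine the code runs on.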


BTW, since you mention Scala's io.Source class: note that it returns an iterator instead of a List<String> as the Files class does. The advantage: the file is loaded lazily, not all at once into a huge ArrayList<String>. The disadvantage: you must close the source manually (which you can't do in your code snippet).
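For comparison on the Java side: the same trade-off (lazy loading, but a resource you have to close) shows up in Files.lines, which only exists from Java 8 onwards, so it is not available in the Java 7 setting of the question. A rough sketch, where LazyLinesDemo and countNonBlankLines are hypothetical names chosen for this example:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical demo class; requires Java 8 or later for Files.lines.
final class LazyLinesDemo {

    // Counts non-blank lines without first collecting the file into a List.
    // The stream holds an open file, so try-with-resources closes it.
    static long countNonBlankLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file, StandardCharsets.UTF_8)) {
            return lines.filter(line -> !line.trim().isEmpty()).count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lazy-lines", ".txt");
        try {
            Files.write(file, "first\n\nsecond\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countNonBlankLines(file)); // prints 2
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The try-with-resources block is the part that the one-liner in the question has no room for.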

Tomasz Nurkiewicz
  • +1 Really they ought to deprecate `String.getBytes()` without a charset and the like – artbristol Oct 03 '12 at 09:15
  • 1
    Well, I'd say that the charset I expect the file to be encoded in is my default charset (which is UTF_8 in my case anyway :) ). And if UTF_8 is in any case the most reasonable option, then they could have used it as the default – Chirlo Oct 03 '12 at 09:16
  • @Chirlo: +1, that's a fair assumption. But still I believe it's a good idea to make charset explicit. – Tomasz Nurkiewicz Oct 03 '12 at 09:22
  • 1
    Let's then say I would wish for a `Files.readAllLinesWithDefaultCharset(file)` method :) It'd make a difference to me at least. Btw, +1 for the tip on `io.Source`. – Chirlo Oct 03 '12 at 09:28
0

You would have to ask the designers, but very probably they share my view that reading entire files into memory is not something to be encouraged. It does not scale, and it introduces unnecessary time and space costs. Process the file a line at a time.
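As a sketch of what "a line at a time" can look like with the Java 7 API, the following uses Files.newBufferedReader with an explicit charset. The class name LineAtATimeDemo and the longestLineLength task are only illustrative assumptions; any per-line processing would fit in the loop:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of any library.
final class LineAtATimeDemo {

    // Returns the length of the longest line. Only the current line
    // is held in memory, regardless of how large the file is.
    static int longestLineLength(Path file, Charset charset) throws IOException {
        int longest = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                longest = Math.max(longest, line.length());
            }
        }
        return longest;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("line-at-a-time", ".txt");
        try {
            Files.write(file, "ab\nabcd\nc\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(longestLineLength(file, StandardCharsets.UTF_8)); // prints 4
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```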

user207421