
I am trying to parse a CSV file in which some lines may have missing fields, and I found this strange behavior:

scala> val s = "1,2,,,"
s: String = 1,2,,,

scala> s.split(",")
res4: Array[String] = Array(1, 2)

I expected the result to be Array(1, 2, "", "", ""). Am I missing something? If not, what is the justification for this behavior?

Psidom

1 Answer


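That behavior is inherited from Java: the one-argument split(regex) discards trailing empty strings. Also inherited, but not fully documented, is Java's two-argument overload split(regex, limit); a negative limit keeps the trailing empty strings.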

scala> val s = "1,2,,,"
s: String = 1,2,,,

scala> s.split(",", -1)
res0: Array[String] = Array(1, 2, "", "", "")
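
As a rough sketch of how the limit argument changes the result, the snippet below compares the one-argument call with limits of 0, -1, and 3 on the same input. The object and value names are made up for illustration; only String.split itself comes from the Java standard library.

```scala
// Hypothetical names; only String.split is a real API here.
object SplitLimits {
  val s = "1,2,,,"

  // One-argument form: trailing empty strings are dropped.
  val dropped: Array[String] = s.split(",")

  // A limit of 0 behaves like the one-argument form.
  val zeroLimit: Array[String] = s.split(",", 0)

  // A negative limit keeps every trailing empty string.
  val keepAll: Array[String] = s.split(",", -1)

  // A positive limit caps the array length; the last element
  // holds whatever input remains after the last applied split.
  val capped: Array[String] = s.split(",", 3)

  def main(args: Array[String]): Unit = {
    println(dropped.length)   // 2
    println(zeroLimit.length) // 2
    println(keepAll.length)   // 5
    println(capped.toList)    // List(1, 2, ,,)
  }
}
```

For CSV-like rows where the number of columns matters, the negative-limit form is the one that preserves empty trailing fields.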
jwvh
  • Thanks for the reply. That makes a lot of sense. According to the documentation, *If n is non-positive then the pattern will be applied as many times as possible and the array can have any length*. When I try the `split` method with different negative limit values, it gives me the same output. Does this mean that all negative limits behave the same way, regardless of the actual value? – Psidom May 07 '17 at 02:08
  • Yes. That's what I understand the docs to mean and, in my (very limited) experience, I've not found any different behavior for different negative numbers. – jwvh May 07 '17 at 02:14
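
A quick way to check the claim in the comments above is to split the same string with several negative limits and compare the results. This is only a spot check on a handful of values, not a proof, and the object and value names are invented for the example.

```scala
// Hypothetical names; a spot check, not an exhaustive test.
object NegativeLimits {
  val s = "1,2,,,"

  // Split with a few different negative limits.
  val results: Seq[List[String]] =
    Seq(-1, -2, -100, Int.MinValue).map(n => s.split(",", n).toList)

  // True if every negative limit produced the same fields.
  val allSame: Boolean = results.distinct.size == 1

  def main(args: Array[String]): Unit =
    println(allSame)
}
```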