First of all, your example is a bit odd: each row is either an int and a string, or an int and an optional string? That is the same as saying that each row is an int and an optional string, so you don't need the two alternatives.
But for a useful example, let's say that each row is either an int and a boolean or an int and an optional float (and let's assume that you don't want to use Either, \/ or Xor to represent the disjunction):
sealed trait A
case class Alternative1(i: Int, b: Boolean) extends A
case class Alternative2(i: Int, of: Option[Float]) extends A
Using kantan.csv and its shapeless module, you can actually parse that pretty trivially:
import kantan.csv.ops._
import kantan.csv.generic._
"""1,true
2,3.14
3,""".asCsvReader[A](',', false).foreach(println _)
asCsvReader is brought into scope by the kantan.csv.ops._ import. It takes one type parameter, the type to decode each row as, and two value parameters: the column separator and a flag indicating whether the first row should be skipped.
This code outputs:
Success(Alternative1(1,true))
Success(Alternative2(2,Some(3.14)))
Success(Alternative2(3,None))
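To make the decoding rule concrete, here is a hand-written sketch of roughly what a decoder for this type has to do, using only the standard library. It is not how kantan.csv is implemented: the ManualDecoding object, the decodeRow helper, the naive comma split and the use of scala.util.Try are all assumptions made for illustration.

```scala
import scala.util.Try

object ManualDecoding {
  sealed trait A
  case class Alternative1(i: Int, b: Boolean) extends A
  case class Alternative2(i: Int, of: Option[Float]) extends A

  // Hypothetical helper, not part of kantan.csv: decodes one raw CSV line.
  // The split is naive and does not handle quoted cells.
  def decodeRow(line: String): Try[A] = Try {
    // The -1 limit keeps trailing empty cells, so "3," yields Array("3", "").
    val cells = line.split(",", -1)
    require(cells.length == 2, s"expected 2 cells, got ${cells.length}")

    val i      = cells(0).trim.toInt
    val second = cells(1).trim

    // Try (Int, Boolean) first, then fall back to (Int, Option[Float]).
    Try[A](Alternative1(i, second.toBoolean)).getOrElse {
      val of = if (second.isEmpty) None else Some(second.toFloat)
      Alternative2(i, of)
    }
  }
}
```

Any parse error ends up inside the returned Try rather than being thrown, and an empty second cell is read as None.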
Note that:
- the return value of asCsvReader is an Iterator-like structure, which means you never need to load the whole CSV in memory.
- each row is wrapped in either a Success or a Failure, and decoding never throws (unless you need it to, in which case you can use asUnsafeCsvReader).
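The two points above can be illustrated with a small stand-in that uses only the standard library. This sketch uses scala.util.Try in place of kantan.csv's own result type, and the LazyResults object with its decodeInts and summarize helpers is hypothetical:

```scala
import scala.util.{Failure, Success, Try}

object LazyResults {
  // Stand-in for a CSV reader: rows are decoded one at a time as the
  // iterator is consumed, so the whole input is never held in memory.
  def decodeInts(lines: Iterator[String]): Iterator[Try[Int]] =
    lines.map(line => Try(line.trim.toInt))

  // Bad rows are counted rather than thrown; good rows are kept in order.
  def summarize(results: Iterator[Try[Int]]): (List[Int], Int) = {
    val (ok, failed) = results.foldLeft((List.empty[Int], 0)) {
      case ((good, bad), Success(v)) => (v :: good, bad)
      case ((good, bad), Failure(_)) => (good, bad + 1)
    }
    (ok.reverse, failed)
  }
}
```

With results shaped like this, a single malformed row does not abort the whole run; the caller decides whether to skip it, log it, or stop.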