I know there are various libraries for reading CSV in Scala. I have tried the shapeless approach, but I am having trouble reading CSV generically for a class hierarchy. For example, I need something like this:

abstract class A
case class ChildOneOfA(i:Int,s:String) extends A
case class ChildTwoOfA(i:Int,os:Option[String]) extends A



// Requires a generic implementation for any T that is a subtype of A

def genericCSVReader[T <: A]: GenericCsvRecordReader[T] = {
  // Generic implementation returning a CSV record iterator/reader
  ???
}
Harsh Gupta
  • Can you say how you would represent this ADT as CSV? – Miles Sabin May 24 '16 at 16:40
  • The intention is that the CSV would adhere to the parameter types of the case classes. For `ChildOneOfA(i:Int,s:String)` I would have `"1,HI \n 2,HELLO"`, and for `ChildTwoOfA(i:Int,os:Option[String])` I could have `1, \n 2,"HEY"` – Harsh Gupta May 24 '16 at 16:43
  • What if the individual case classes had different numbers and types of elements (eg. `Boolean` vs. `Double`)? Or if you had different case classes with the same number and types of elements? – Miles Sabin May 24 '16 at 16:54
  • Yes that is an absolute possibility. The number and type will vary – Harsh Gupta May 24 '16 at 16:56
  • Right, but could you give an example of the sort of representation you envisage in these cases? – Miles Sabin May 24 '16 at 17:11
  • Well, let me put it this way: the case class could have up to 22 parameters of any primitive type, and any of them could be optional or mandatory. For example: `case class Country(id:Long,code:String,name:String,continent:String,wikipedia_link:Option[String],keywords:Option[String])` – Harsh Gupta May 24 '16 at 21:14

1 Answer

First of all, your example is a bit odd: each row is either an int and a string, or an int and an optional string? That is the same as saying that each row is an int and an optional string, so you don't need two alternatives for that.

But for a useful example, let's say that each row is either an int and a boolean, or an int and an optional float (and let's assume that you don't want to use Either, \/ or Xor to represent the disjunction):

sealed trait A
case class Alternative1(i: Int, b: Boolean) extends A
case class Alternative2(i: Int, of: Option[Float]) extends A

Using kantan.csv and its shapeless module, you can actually parse that pretty trivially:

import kantan.csv.ops._
import kantan.csv.generic._

"""1,true
2,3.14
3,""".asCsvReader[A](',', false).foreach(println _)

asCsvReader is brought into scope by the kantan.csv.ops._ import. It takes a type parameter (the type to decode each row as) and two value parameters: the column separator and a flag indicating whether the first row should be skipped.

This code outputs:

Success(Alternative1(1,true))
Success(Alternative2(2,Some(3.14)))
Success(Alternative2(3,None))

Note that:

  • the return value of asCsvReader is an Iterator-like structure, which means you never need to load the whole CSV into memory.
  • each row is wrapped in either a Success or a Failure, and decoding never throws (unless you need it to, in which case you can use asUnsafeCsvReader).
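
To make those two points concrete, here is a rough standard-library-only sketch of the same behaviour. It is not how kantan.csv is implemented: `RowSketch`, `decodeRow` and `decodeAll` are hypothetical names, the naive `split` ignores CSV quoting, and the "try `Alternative1`, then fall back to `Alternative2`" order is an assumption made for illustration.

```scala
import scala.util.{Try, Success, Failure}

sealed trait A
case class Alternative1(i: Int, b: Boolean) extends A
case class Alternative2(i: Int, of: Option[Float]) extends A

object RowSketch {
  // Hypothetical helper: decode one line, trying Alternative1 first and
  // falling back to Alternative2. Errors are captured in a Try, not thrown.
  def decodeRow(line: String): Try[A] = {
    // limit = -1 keeps trailing empty cells, so "3," yields two cells
    val cells = line.split(",", -1).map(_.trim)
    if (cells.length != 2)
      Failure(new IllegalArgumentException(s"expected 2 cells: $line"))
    else {
      val alt1: Try[A] = Try(Alternative1(cells(0).toInt, cells(1).toBoolean))
      alt1.orElse(Try {
        // An empty second cell is read as None
        val of = if (cells(1).isEmpty) None else Some(cells(1).toFloat)
        Alternative2(cells(0).toInt, of)
      })
    }
  }

  // Lazy: a row is only decoded when the iterator is advanced.
  def decodeAll(lines: Iterator[String]): Iterator[Try[A]] =
    lines.map(decodeRow)
}
```

Because `decodeAll` returns an `Iterator`, rows can be consumed one at a time, and a malformed row shows up as a `Failure` value that the caller can skip, log or rethrow.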
Nicolas Rinaudo
  • I clarified the variety of case classes expected in a comment under the question. Also, I tried your library with `//passing seperator and header dynamically in my program new File(fileName).asCsvReader[T](seperator, header = header)` and I get `could not find implicit value for evidence parameter of type kantan.csv.RowDecoder[T]` – Harsh Gupta May 24 '16 at 21:20
  • I think I see the problem. It doesn't work with the abstract class; it works fine with traits. – Harsh Gupta May 24 '16 at 22:46
  • It's not the difference between a trait and an abstract class, but the fact that it must be sealed. – Nicolas Rinaudo May 25 '16 at 04:54
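
The last few comments touch on two separate things: a method that is generic in `T` has to ask its caller for the decoder evidence, and the hierarchy has to be sealed. The sketch below illustrates both with a hypothetical, hand-written `Decoder` type class. It is only a stand-in for the idea behind `kantan.csv.RowDecoder`, not that trait itself, and `GenericReader`, `readAll` and `describe` are made-up names.

```scala
import scala.util.Try

// Hypothetical stand-in for a row-decoder type class (not kantan.csv's).
trait Decoder[T] {
  def decode(cells: List[String]): Option[T]
}

// `sealed` tells the compiler every subtype is declared in this file;
// whether A is a trait or an abstract class makes no difference.
sealed abstract class A
case class ChildOneOfA(i: Int, s: String) extends A
case class ChildTwoOfA(i: Int, os: Option[String]) extends A

object Decoder {
  implicit val childOne: Decoder[ChildOneOfA] = new Decoder[ChildOneOfA] {
    def decode(cells: List[String]): Option[ChildOneOfA] = cells match {
      case List(i, s) if s.nonEmpty => Try(ChildOneOfA(i.toInt, s)).toOption
      case _                        => None
    }
  }
  implicit val childTwo: Decoder[ChildTwoOfA] = new Decoder[ChildTwoOfA] {
    def decode(cells: List[String]): Option[ChildTwoOfA] = cells match {
      case List(i, s) => Try(ChildTwoOfA(i.toInt, Some(s).filter(_.nonEmpty))).toOption
      case _          => None
    }
  }
}

object GenericReader {
  // The context bound `T: Decoder` makes the caller supply the evidence.
  // Without it, no implicit Decoder[T] can be found for an abstract T.
  def readAll[T: Decoder](lines: Iterator[String]): Iterator[Option[T]] =
    lines.map(l => implicitly[Decoder[T]].decode(l.split(",", -1).toList.map(_.trim)))

  // Because A is sealed, the compiler can check this match is exhaustive.
  def describe(a: A): String = a match {
    case ChildOneOfA(i, s)   => s"one($i, $s)"
    case ChildTwoOfA(i, os)  => s"two($i, ${os.getOrElse("-")})"
  }
}
```

A call such as `GenericReader.readAll[ChildTwoOfA](Iterator("1,", "2,HEY"))` compiles because a `Decoder[ChildTwoOfA]` is in implicit scope; dropping the context bound from `readAll` produces a "could not find implicit value" error of the same kind as the one quoted above.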