I'm trying to read a Ctrl-A (\u0001) delimited file in Scalding. I'm getting an error saying it found the wrong number of fields (expecting 166, found 142), followed by the line it was trying to read. For some reason, it does not seem to recognize the delimiter in the first third of the file. Here is the code I am using:
Csv(args("input"), separator = "\u0001", fields = schema)
  .read
  .groupBy('var2) { group => group.sum[Long]('var3) }
  .write(Tsv(args("output")))
I'm new to Scalding, so maybe I am using the Csv source incorrectly or inappropriately. Any ideas on why that might be happening?
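In case it helps narrow this down, here is a minimal sketch of a check that could be run on the raw file outside of Scalding, to see whether the failing lines really contain 165 Ctrl-A characters. It is plain Scala; `FieldCountCheck` and `fieldCount` are hypothetical names for illustration, and the expected count of 166 is taken from the error message above.

```scala
// Hypothetical diagnostic helper; not part of Scalding or Cascading.
object FieldCountCheck {
  val Delimiter = '\u0001'   // Ctrl-A
  val ExpectedFields = 166   // assumption: taken from the error message

  // Fields = delimiters + 1, so empty trailing fields are counted too.
  def fieldCount(line: String): Int = line.count(_ == Delimiter) + 1

  def main(args: Array[String]): Unit = {
    // args(0) is assumed to be a local copy of the input file.
    val source = scala.io.Source.fromFile(args(0))
    try {
      source.getLines().zipWithIndex
        .filter { case (line, _) => fieldCount(line) != ExpectedFields }
        .take(10) // report only the first few mismatching lines
        .foreach { case (line, idx) =>
          println(s"line ${idx + 1}: ${fieldCount(line)} fields")
        }
    } finally source.close()
  }
}
```

If this reports no mismatches, the raw data has the right number of delimiters and the discrepancy would have to come from how the lines are being parsed rather than from the file itself.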