I have a TypedPipe[(String, String, Long)] where the first String can take only a limited number of values (about 10). I'd like to partition my output so that a folder is created for each value (i.e. 10 folders, each named after the first String). This is simple to achieve in Hive, but I cannot find an elegant way to do it in Scalding. The method def partition(p: T => Boolean): (TypedPipe[T], TypedPipe[T]) splits the pipe into two parts, but that is not what I'm looking for.
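To make the difference concrete, here is a minimal sketch using plain Scala collections as a stand-in for TypedPipe (so it does not use the Scalding API at all). The object name and the sample records are made up for illustration; only the shape of the data matches my pipe.

```scala
// Plain Scala collections stand in for TypedPipe here; the records are hypothetical.
object PartitionSketch {
  type Record = (String, String, Long)

  val records: List[Record] = List(
    ("click", "user1", 1L),
    ("view", "user2", 2L),
    ("click", "user3", 3L),
    ("purchase", "user1", 4L)
  )

  // What partition offers: a two-way split driven by a Boolean predicate.
  val (clicks, others): (List[Record], List[Record]) =
    records.partition(_._1 == "click")

  // What I am after: one bucket per distinct value of the first field,
  // each of which would end up in its own output folder.
  val byType: Map[String, List[Record]] = records.groupBy(_._1)

  def main(args: Array[String]): Unit = {
    byType.foreach { case (name, rows) =>
      println(s"$name -> ${rows.size} record(s)")
    }
  }
}
```

With partition I would have to chain one predicate per value by hand, whereas the grouped form yields all the buckets in one step.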
EDIT
- I am using Scalding v0.13.1
- I need to write a PackedAvroSource