How can I transpose rows to columns using an RDD or DataFrame, without using pivot?
SessionId,date,orig,dest,legind,nbr
1,9/20/16,abc0,xyz0,o,1
1,9/20/16,abc1,xyz1,o,2
1,9/20/16,abc2,xyz2,i,3
1,9/20/16,abc3,xyz3,i,4
I want to generate a new schema like this:
SessionId,date,orig1,orig2,orig3,orig4,dest1,dest2,dest3,dest4
1,9/20/16,abc0,abc1,null,null,xyz0,xyz1,null,null
The logic is:
if nbr is 1 and legind = o, then take the orig1 value (fetched from row 1) ...
if nbr is 3 and legind = i, then take the dest1 value (fetched from row 3)
So how can I transpose the rows to columns?
Any ideas would be greatly appreciated.
I tried the option below, but it just flattens everything into a single row:
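val keys = List("SessionId")
// Aggregate every non-key column with "first"
val selectFirstValueOfNoneGroupedColumns =
  df.columns
    .filterNot(keys.toSet)
    .map(_ -> "first").toMap
// show() returns Unit, so keep the DataFrame in `grouped` and display it separately
val grouped =
  df.groupBy(keys.head, keys.tail: _*)
    .agg(selectFirstValueOfNoneGroupedColumns)
grouped.show()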
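For reference, the rule described above might be expressed without pivot as a conditional aggregation. This is only a sketch under assumptions that are not confirmed by the question: the column index (orig1, dest1, ...) is taken to be the rank of a row within its legind group ordered by nbr, origN is read from the "o" rows and destN from the "i" rows, and the maximum number of legs (maxLegs = 4) is known up front. The names TransposeSketch, transpose, pos and maxLegs are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, max, row_number, when}

object TransposeSketch {
  // Hypothetical helper: builds orig1..origN and dest1..destN per (SessionId, date).
  def transpose(df: DataFrame, maxLegs: Int): DataFrame = {
    // Assumption: position = rank of the row inside its legind group, ordered by nbr.
    val w = Window.partitionBy("SessionId", "date", "legind").orderBy("nbr")
    val ranked = df.withColumn("pos", row_number().over(w))

    // when(...) without otherwise yields null for non-matching rows; max skips
    // nulls, and at most one row per group matches each condition.
    val origCols = (1 to maxLegs).map { i =>
      max(when(col("legind") === "o" && col("pos") === i, col("orig"))).as(s"orig$i")
    }
    val destCols = (1 to maxLegs).map { i =>
      max(when(col("legind") === "i" && col("pos") === i, col("dest"))).as(s"dest$i")
    }
    val aggCols = origCols ++ destCols

    ranked.groupBy("SessionId", "date").agg(aggCols.head, aggCols.tail: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("transpose-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows from the question.
    val df = Seq(
      (1, "9/20/16", "abc0", "xyz0", "o", 1),
      (1, "9/20/16", "abc1", "xyz1", "o", 2),
      (1, "9/20/16", "abc2", "xyz2", "i", 3),
      (1, "9/20/16", "abc3", "xyz3", "i", 4)
    ).toDF("SessionId", "date", "orig", "dest", "legind", "nbr")

    transpose(df, maxLegs = 4).show()
    spark.stop()
  }
}
```

Under this reading the sample rows give orig1, orig2 = abc0, abc1 and dest1, dest2 = xyz2, xyz3 (dest1 comes from the row with nbr 3, as the rule states), with the remaining columns null. If destN should instead come from the "o" rows, as the sample output row suggests, only the condition inside destCols would need to change.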