I have a DataFrame with, say, three columns x, y and z. I want all three columns in the result, but I do not want to cube on column z.

Is there a way I can do it?

P.S. I have only given an example with 3 columns, but I have quite a long list of columns, so GROUPING SETS is not an option.

Example -

val df = Seq(("1","x","a"),("1","v","b"),("3","x","c")).toDF("col1","col2","col3")

val list = Seq("col1","col2").map(e=>col(e))

// I want col3 in the output without cubing on it (I do not want the combinations for it).
// The statement below does not return col3 at all, because col3 is not part of the cube; getting col3 back is what I want to achieve.

display(df.select($"col1",$"col2",$"col3").cube(list:_*).agg(sum("col1")))

Abhishek
Can you add an example of what you want to achieve, and explain what you mean by cube? Did you try anything? – Shaido May 17 '18 at 08:38

1 Answer

cube is an extension of groupBy: it returns aggregated results for every combination of the grouping columns, including the subtotals and the grand total. Here is an example that cubes on col1 and col2 only and still returns col3 through an aggregate:

df.cube($"col1",$"col2").agg(first($"col3").as("col3")).show
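Note that first returns only one value of col3 per group, and which one it returns can depend on row order. If every col3 value should survive, collect_list is one alternative. The following is a minimal sketch under that assumption, not necessarily what the asker needs; the object name, the local SparkSession settings, and the output column names (rows, col3_values, gid) are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, count, grouping_id, lit}

// Hypothetical wrapper object so the sketch is self-contained.
object CubeCarryColumn {
  // Local session for illustration only; on a cluster or notebook, reuse the existing one.
  val spark: SparkSession = SparkSession.builder()
    .master("local[*]")
    .appName("cube-carry-column")
    .getOrCreate()
  import spark.implicits._

  // Same sample data as in the question.
  val df: DataFrame =
    Seq(("1", "x", "a"), ("1", "v", "b"), ("3", "x", "c")).toDF("col1", "col2", "col3")

  // Only these columns take part in the cube; the list can be as long as needed.
  val cubeCols = Seq("col1", "col2").map(col)

  val result: DataFrame = df
    .cube(cubeCols: _*)
    .agg(
      count(lit(1)).as("rows"),                   // number of input rows in each group
      collect_list(col("col3")).as("col3_values"), // col3 is carried along, not cubed
      grouping_id().as("gid")                     // 0 = full detail, 3 = grand total
    )

  def main(args: Array[String]): Unit = {
    result.orderBy("gid", "col1", "col2").show(false)
    spark.stop()
  }
}
```

With the three sample rows this produces eight groups: three for (col1, col2), two for col1 alone, two for col2 alone, and one grand total. The gid column makes it possible to tell a subtotal row apart from a row whose grouping value is genuinely null.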

Please share your expected result as suggested by Shaido.

Sc0rpion