
I have a DataFrame that contains these values:

Dept_id  |  name  |  salary
1        |  A     |  10
2        |  B     |  100
1        |  D     |  100
2        |  C     |  105
1        |  N     |  103
2        |  F     |  102
1        |  K     |  90
2        |  E     |  110

I want the three highest salaries in each department, sorted by salary in descending order:

Dept_id  |  name  |  salary
1        |  N     |  103
1        |  D     |  100
1        |  K     |  90
2        |  E     |  110
2        |  C     |  105
2        |  F     |  102

Thanks in advance :)

zero323
Learner

1 Answer


The solution is similar to "Retrieve top n in each group of a DataFrame in pyspark", which is written for PySpark.

The same approach in Scala looks like this:

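import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.rank
// Needs import spark.implicits._ for the $"..." syntax; ranks rows per Dept_id by salary (highest first), keeps ranks 1 to 3
df.withColumn("rank", rank().over(Window.partitionBy("Dept_id").orderBy($"salary".desc)))
    .filter($"rank" <= 3)
    .drop("rank")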
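
For anyone who wants to try this end to end, here is a minimal, self-contained sketch built on the sample data from the question. It rests on a few assumptions that are not part of the answer above: a local SparkSession, a hypothetical helper named `topNPerDept`, and `row_number()` swapped in for `rank()`. The swap matters only when salaries tie: `rank()` gives tied rows the same rank, so filtering on `rank <= 3` can return more than three rows per department, whereas `row_number()` returns at most three. The sketch also adds an explicit `orderBy`, since a window function by itself does not guarantee the order of the output rows.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

object TopNPerDept {
  // Hypothetical helper (not from the answer): keeps at most n highest-paid rows per Dept_id.
  def topNPerDept(df: DataFrame, n: Int): DataFrame = {
    // One window per department, highest salary first.
    val byDept = Window.partitionBy("Dept_id").orderBy(col("salary").desc)

    df.withColumn("rn", row_number().over(byDept)) // 1, 2, 3, ... within each department
      .filter(col("rn") <= n)                      // keep the first n rows of each window
      .orderBy(col("Dept_id"), col("salary").desc) // make the output order explicit
      .drop("rn")                                  // remove the temporary column
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying the snippet out.
    val spark = SparkSession.builder()
      .appName("TopNPerDept")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample data copied from the question.
    val df = Seq(
      (1, "A", 10), (2, "B", 100), (1, "D", 100), (2, "C", 105),
      (1, "N", 103), (2, "F", 102), (1, "K", 90), (2, "E", 110)
    ).toDF("Dept_id", "name", "salary")

    topNPerDept(df, 3).show()
    spark.stop()
  }
}
```

With the sample data, which has no tied salaries inside a department, this should give the six rows requested in the question. If tied rows should all be kept, `rank()` or `dense_rank()` can be used in place of `row_number()`.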

I hope this answer is helpful.

Ramesh Maharjan