I have a dataframe like this one:
name  field1  field2  field3
a     4       10      8
b     5       0       11
c     10      7       4
d     0       1       5
I need to find the top 3 names for each field, ranked by that field's value in descending order.
Expected output:
top3-field1  top3-field2  top3-field3
c            a            b
b            c            a
a            d            d
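To make the expected output unambiguous, here is a rough plain-Python sketch of the logic I am after. It is not Spark code; the names `rows`, `top_names` and `result_rows` are made up for illustration, and breaking ties by name is my assumption, mirroring the `orderBy` in my attempt.

```python
# Plain-Python illustration of the desired result (not Spark code).
# The data mirrors the sample dataframe above.
rows = [
    {"name": "a", "field1": 4, "field2": 10, "field3": 8},
    {"name": "b", "field1": 5, "field2": 0, "field3": 11},
    {"name": "c", "field1": 10, "field2": 7, "field3": 4},
    {"name": "d", "field1": 0, "field2": 1, "field3": 5},
]
fields = ["field1", "field2", "field3"]


def top_names(rows, field, n=3):
    """Return the names of the n rows with the largest value in `field`."""
    # Sort by value descending; break ties by name ascending (assumption).
    ranked = sorted(rows, key=lambda r: (-r[field], r["name"]))
    return [r["name"] for r in ranked[:n]]


# One list of names per output column.
top3 = {f"top3-{field}": top_names(rows, field) for field in fields}

# Transpose so that row k holds the k-th ranked name for every field.
result_rows = list(zip(*top3.values()))

print("  ".join(top3.keys()))
for row in result_rows:
    print("  ".join(row))
```

Each output column is ranked independently, so a single output row does not correspond to any single input row.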
So I tried sorting by the field(n) column in descending order, limiting the result to the top 3 rows, and generating the new column with the withColumn method, like this:
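from pyspark.sql import functions as f

# Top 3 rows by field1 (ties broken by name), with name exposed as top3-field1
df1 = df.orderBy(f.col("field1").desc(), "name") \
    .limit(3) \
    .withColumn("top3-field1", f.col("name")) \
    .select("top3-field1", "field1")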
With this approach I have to create a separate dataframe for each field(n) and then join them to get the result shown above. I feel there must be a better solution to this problem. Any suggestions would be appreciated.