
I'm trying to find the three smallest values for each row of a dataframe, and put them in a separate dataframe. I don't need to know which column they came from, but I do need to cycle through m rows where m might change for each dataframe I use.

I wanted to use heapq.nsmallest, but I'm not sure how to loop through each row and append the results as a new row of a dataframe each time. I seem to just get a single set of results as output.

for x in range(len(df1)):
    heap = pd.DataFrame(heapq.nsmallest(3, df1[x]))

I expected this to loop through values of x, but it only produces one column with len(df1) rows. I think it's overwriting the previous results, as it always gives the three minimum values from the last row.

Tom

1 Answer

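Build the whole list of rows first, then construct the dataframe once, so nothing gets overwritten inside the loop:

df2 = pd.DataFrame([heapq.nsmallest(3, df1.iloc[x])
                   for x in range(len(df1))])

Or use sorted(df1.iloc[x])[:3] in place of the heapq.nsmallest call. Note that .iloc[x] selects row x by position, whereas df1[x] selects the column labelled x.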
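For reference, here is a minimal self-contained sketch of the row-wise approach, with a vectorized NumPy variant for comparison. The sample data and the helper names `three_smallest_heapq` and `three_smallest_numpy` are made up for illustration; the sketch assumes every column is numeric and each row has at least three values.

```python
import heapq

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for df1.
df1 = pd.DataFrame(
    [[5, 1, 9, 3],
     [8, 2, 7, 4],
     [6, 6, 0, 2]],
    columns=["a", "b", "c", "d"],
)


def three_smallest_heapq(df, n=3):
    """Collect the n smallest values of each row, then build one dataframe."""
    rows = [heapq.nsmallest(n, row) for row in df.to_numpy()]
    return pd.DataFrame(rows, index=df.index)


def three_smallest_numpy(df, n=3):
    """Same result without a Python-level loop: sort each row, keep n columns."""
    values = np.sort(df.to_numpy(), axis=1)[:, :n]
    return pd.DataFrame(values, index=df.index)


df2 = three_smallest_heapq(df1)
df3 = three_smallest_numpy(df1)
print(df2)
```

Both helpers work for any number of rows, so the same call can be reused on dataframes of different lengths. The NumPy variant is likely to be faster on large frames, though that has not been measured here.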

J_H