I have a dataframe with 5 values per row. I need to calculate the mean of the 4 lowest values, so I would like to drop the highest value in each row and then calculate the mean. It does not matter which column the highest value is in. Example:

import pandas as pd

df = pd.DataFrame({"A": [78, 45, 50, 85, 63],
                   "B": [52, 67, 81, 65, 83],
                   "C": [67, 55, 81, 62, 58]})

I would like to get a column with 59 (the mean of 78, 45, 50 and 63) for A, 66.25 for B and 60.5 for C.

Thanks for your suggestions so far! After trying them, I realized I need to add something.

I also have negative values. I want to exclude the value with the highest deviation from zero, i.e. the highest absolute value. But I need the mean of the true (signed) values, both positive and negative. Example:

df = pd.DataFrame({"A": [-78, 45, -50, 85, -63],
                   "B": [-52, 67, -81, 65, -83],
                   "C": [67, -55, 81, -62, 58]})

I would like to get a column with -36.5 (the mean of -78, 45, -50 and -63) for A, -0.25 for B (not using -83) and 2 for C (not using 81).

Any suggestions?

NienkeH

2 Answers

You can take the sum and subtract the max value then divide by the number of values per column excluding the max value:

(df.sum() - df.max()) / (len(df) - 1)

Result:

A    59.00
B    66.25
C    60.50
dtype: float64
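
For the follow-up case with negative values, the same sum-minus-one-value idea might be adapted by subtracting the entry with the largest absolute value instead of the maximum. The following is only a sketch under assumptions: the helper name `mean_without_extreme` is hypothetical, the index labels are assumed to be unique, and on ties only the first extreme entry is dropped.

```python
import pandas as pd

def mean_without_extreme(df: pd.DataFrame) -> pd.Series:
    """Per-column mean after dropping the entry with the largest absolute value."""
    # Row label of the largest |value| in each column (first one on ties)
    idx = df.abs().idxmax()
    # Signed value found at that label, one per column
    extreme = pd.Series({col: df.loc[idx[col], col] for col in df.columns})
    # Remove that one value from the column sum, then average the remaining rows
    return (df.sum() - extreme) / (len(df) - 1)

df = pd.DataFrame({"A": [-78, 45, -50, 85, -63],
                   "B": [-52, 67, -81, 65, -83],
                   "C": [67, -55, 81, -62, 58]})

result = mean_without_extreme(df)
print(result)
```

With the second example from the question this should give -36.5 for A, -0.25 for B and 2.0 for C.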
Shubham Sharma

"I need to calculate the mean of the 4 lowest values." If you mean that in each row you choose the 4 lowest elements out of 5 and then calculate the mean, the easiest way to do this is to sort each row and take the first 4 elements, or equivalently delete the last column after sorting.
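
The sort-and-truncate idea could look roughly like the sketch below. It assumes the layout of the question's first example, where the five values of each group sit in one column, so it sorts along `axis=0`. If the five values are spread across a row instead, sort along `axis=1` and slice the last column off.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [78, 45, 50, 85, 63],
                   "B": [52, 67, 81, 65, 83],
                   "C": [67, 55, 81, 62, 58]})

# Sort every column independently in ascending order
sorted_vals = np.sort(df.to_numpy(), axis=0)

# Drop the last (largest) entry of each column and average the rest
result = pd.Series(sorted_vals[:-1].mean(axis=0), index=df.columns)
print(result)
```

For this data the result should match the values requested in the question: 59 for A, 66.25 for B and 60.5 for C. Note that this variant only handles the "drop the maximum" case, not the absolute-value case from the follow-up.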