How to get the N nearest entries to the median in a Pandas series?

Question

For a Pandas Series:

import pandas as pd

ser = pd.Series([i**2 for i in range(9)])
print(ser)
0     0
1     1
2     4
3     9
4    16
5    25
6    36
7    49
8    64
dtype: int64

The median can be obtained with ser.median(), which returns 16.0. How can the N entries closest to the median be retrieved? Something like:

print(ser.get_median_entries(3)) # N == 3; not real functionality
3     9
4    16
5    25
dtype: int64

What if `n == 2`? Would you take 9 and 16, or 16 and 25? – Albin Paul, Feb 07 '18 at 15:20

score 3 · Accepted Answer · answered Feb 07 '18 at 15:21

You can compute the absolute difference between each value and the median, sort it with sort_values(), and use the index of the first N results to select from the original series:

ser[abs(ser - ser.median()).sort_values()[0:3].index]
#4    16
#3     9
#5    25
#dtype: int64

If you want it as a function, where n is an input variable:

def get_n_closest_to_median(ser, n):
    return ser[abs(ser - ser.median()).sort_values()[0:n].index]

print(get_n_closest_to_median(ser, 3))
#4    16
#3     9
#5    25
#dtype: int64

You will probably have to add some error checking on the bounds.
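
A minimal sketch of what that error checking might look like. The policy here is an assumption, not part of the original answer: a negative n raises ValueError, an n larger than the series is clamped to its length, and the index is assumed to have unique labels. The result is also passed through sort_index() so the entries come back in their original order, as in the question's expected output.

```python
import pandas as pd


def get_n_closest_to_median(ser, n):
    """Return the n entries of ser whose values are closest to its median."""
    # Assumed policy: reject negative n, clamp oversized n to the series length.
    if n < 0:
        raise ValueError("n must be non-negative")
    n = min(n, len(ser))
    # Absolute distance of each value from the median, smallest first.
    # A stable sort keeps the earlier entry first when two distances tie.
    distance = (ser - ser.median()).abs().sort_values(kind="mergesort")
    # Select the n closest labels (assumes unique index labels),
    # then restore the original index order.
    return ser.loc[distance.index[:n]].sort_index()


ser = pd.Series([i**2 for i in range(9)])
print(get_n_closest_to_median(ser, 3))
```

For the example series and n = 3 this selects the values 9, 16 and 25 at index 3, 4 and 5. Drop the final sort_index() call if you prefer the entries ordered by closeness to the median instead.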

score 0 · Answer 2 · answered Feb 07 '18 at 15:29

Here is the general logic in plain Python; you can adapt it to your own data.

data={j:i**2 for j,i in enumerate(range(0,9))}
median=16

def nearby_values(data,median,depth):
    # Pair each value with its absolute distance from the median, sort by distance, and keep the closest `depth` values
    return list(map(lambda x:x[1],sorted([(abs(median-j),j) for i,j in data.items()])[:depth]))
print(nearby_values(data,median,3))

output:

[16, 9, 25]
