I need to filter a pandas DataFrame with the where function, using a condition on either a reference column or a reference row (index label).
Filtering by a column condition works, but a similar approach using a row (index label) fails.
Is this expected behavior? If so, how do I apply the filter by row?
import pandas as pd
import numpy as np
from pandas import DataFrame

# Build a 4x4 frame of random integers: columns 'w'..'z', rows 'a'..'d'
mydict = {}
cols = 4
rows = 4
for i in range(cols):
    mydict[chr(ord('w') + i)] = np.random.randint(0, 100, rows)
df = DataFrame(mydict, index=[chr(ord('a') + x) for x in range(rows)])
print(df)
print("Filter all data if the column:w has even data ... WORKING")
print(df.loc[:,'w']%2==0)
print(df.where(lambda x: x.loc[:,'w']%2==0))
print("Filter all data if the index:a has even data ... NOT WORKING")
print(df.loc['a',:]%2==0)
print(df.where(lambda x: x.loc['a',:]%2==0, axis=1))
print(df.where(lambda x: x.loc['a',:]%2==0, axis=0))
pd.__version__
Result:
    w   x   y   z
a  42  98  74  51
b  69  82  70  40
c  93   7  78  45
d  22  61  70   4
Filter all data if the column:w has even data ... WORKING
a     True
b    False
c    False
d     True
Name: w, dtype: bool
      w     x     y     z
a  42.0  98.0  74.0  51.0
b   NaN   NaN   NaN   NaN
c   NaN   NaN   NaN   NaN
d  22.0  61.0  70.0   4.0
Filter all data if the index:a has even data ... NOT WORKING
w     True
x     True
y     True
z    False
Name: a, dtype: bool
    w   x   y   z
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
d NaN NaN NaN NaN
    w   x   y   z
a NaN NaN NaN NaN
b NaN NaN NaN NaN
c NaN NaN NaN NaN
d NaN NaN NaN NaN
'0.21.1'
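For comparison, here is a minimal sketch of ways to get the row-based filter. It rests on an assumption about where the failure comes from: where appears to align a Series condition on the row index, so a condition labelled w..z matches none of the row labels a..d and every cell is masked. The frame reuses the values printed above instead of random data, and the names row_cond, kept, masked and masked_t are illustrative only.

```python
import numpy as np
import pandas as pd

# Same values as the printed frame above, hard-coded so the run is repeatable.
df = pd.DataFrame(
    {
        "w": [42, 69, 93, 22],
        "x": [98, 82, 7, 61],
        "y": [74, 70, 78, 70],
        "z": [51, 40, 45, 4],
    },
    index=list("abcd"),
)

# Condition taken from row 'a'; the result is a Series labelled w, x, y, z.
row_cond = df.loc["a"] % 2 == 0

# Reproduces the problem: the labels w..z are matched against the row
# labels a..d, nothing lines up, and the result is expected to be all NaN.
failed = df.where(row_cond)

# Option 1: drop the columns whose value in row 'a' is odd.
kept = df.loc[:, row_cond]

# Option 2: keep the shape by repeating the condition for every row,
# so where receives a plain boolean array of the same shape as df.
mask = np.tile(row_cond.values, (len(df), 1))
masked = df.where(mask)

# Option 3: transpose so the condition lines up with the row index,
# apply where, then transpose back.
masked_t = df.T.where(row_cond).T

print(kept)
print(masked)
print(masked_t)
```

Option 1 is the simplest when the unwanted columns can be dropped. Options 2 and 3 keep the original shape and put NaN in the columns that fail the condition, which mirrors what the column-based where call does for rows.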