0

How do I fill NaN values in a pandas DataFrame? My data looks like this:

id       state     zone
xxx      AP        south
xxx      AP        NaN
xxx      AP        NaN
xxx      AP        NaN
xxx      delhi     north
xxx      delhi     NaN
xxx      delhi     NaN
xxx      delhi     NaN
xxx      delhi     NaN

How can I use pandas to fill the missing values in the zone column based on the state column, given that we already know each state belongs to a single zone (for example, AP is always south)?

Nathan
  • If the data is sorted state-wise, you can use [pandas.DataFrame.ffill](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.ffill.html#pandas-dataframe-ffill) – Shijith Jan 21 '20 at 05:31

2 Answers

1

I think you need:

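# Sort by state, then zone (NaN sorts last), so each state's known zone precedes its gaps
df = df.sort_values(by=["state", "zone"]).ffill()
print(df)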
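A forward fill depends on row order and on every state having at least one known zone. As an alternative sketch (not part of the original answer), a state-to-zone lookup can be built from the rows where zone is known and then mapped onto the missing rows. The sample data and the name `state_to_zone` are made up for illustration, and the sketch assumes the blanks are real NaN values rather than empty strings.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data shaped like the question's table
df = pd.DataFrame({
    "id": ["xxx"] * 6,
    "state": ["AP", "AP", "AP", "delhi", "delhi", "delhi"],
    "zone": ["south", np.nan, np.nan, np.nan, "north", np.nan],
})

# Lookup table: one known zone per state, indexed by state
state_to_zone = (
    df.dropna(subset=["zone"])
      .drop_duplicates(subset="state")
      .set_index("state")["zone"]
)

# Fill only the missing cells; rows that already have a zone are untouched
df["zone"] = df["zone"].fillna(df["state"].map(state_to_zone))
print(df)
```

Because the lookup is keyed on state, it does not matter where the known value sits within each state's rows. A state with no known zone at all simply stays NaN.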
Sociopath
0
  • First sort the values so that the NaN entries come last within each group.
  • Then group by the key columns (I used id and state) and forward-fill zone.
import numpy as np
import pandas as pd

df = pd.DataFrame(data={"id": ["x", "x", "x", "x"],
                        "state": ["AP", "Delhi", "AP", "Delhi"],
                        "zone": ["south", "north", np.nan, np.nan]})

# Sort so NaN comes last in each (id, state) group (NaN sorts last by default)
res = df.sort_values(['id', 'state', 'zone'])
# Forward-fill zone within each group, then restore the original row order
res['zone'] = res.groupby(['id', 'state'])['zone'].ffill()
res = res.sort_index()
print(res)
  id  state   zone
0  x     AP  south
1  x  Delhi  north
2  x     AP  south
3  x  Delhi  north
  • Second option, if you want to group by state only:
# Relies on each state's known zone appearing before its missing rows
df['zone'] = df.groupby('state')['zone'].ffill()
print(df)
  id  state   zone
0  x     AP  south
1  x  Delhi  north
2  x     AP  south
3  x  Delhi  north
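If sorting first is inconvenient, a possible variation on the grouped fill is `transform("first")`, which broadcasts the first non-null zone of each state to all of that state's rows. This is a sketch on made-up data, with the known values deliberately placed after a missing one. The name `known` is only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data: AP's known zone appears after a missing row
df = pd.DataFrame({
    "id": ["x", "x", "x", "x"],
    "state": ["AP", "Delhi", "AP", "Delhi"],
    "zone": [np.nan, "north", "south", np.nan],
})

# "first" picks the first non-null zone per state and repeats it for every row
known = df.groupby("state")["zone"].transform("first")

# Keep existing values and fill only the gaps
df["zone"] = df["zone"].fillna(known)
print(df)
```

Unlike a forward fill, this does not require the known value to come first within its group.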
tawab_shakeel