I have a dataframe like this.
I want to search in the columns zip
and div
and get type
and apply that result to all unique zips. Kind of pivot function but return the actual value instead of an aggregate function.(Assume there is only one combination of zip
and div
)
df1 = pd.DataFrame(list(product(list(range(100,200)), ['A','B','C','D','E'])), columns=['zip', 'div'])
df1 = df1.drop(df1.index[np.random.randint(0,499,size=100)]).reset_index()
df1['type'] = np.random.choice(['P','Q','R'],size=df1.shape[0])
I have tried apply/lambda function but it is very slow. My data contains 500K rows in df1 with 41K unique zip
s and 15 unique div
s
Is there an efficient way to get a result like the following.
df2 = pd.DataFrame({'zip':[100,101],'A':['Q','P'],'B':['Q','Q'],'C':['Q','P'],'D':['Q','R'],'E':['Q','P']})
Assume zip is non-numeric.