Binning values and using the binning labels to refer to the index of another dataframe

Question

I am struggling with this task:
What I did so far: I have 8760 values, which I binned into 10 intervals and then grouped.

Problem: Now I have to map each of the 'levels' (bins) of this dataframe (df1) to an index label of another dataframe (df2) in order to perform a row-wise calculation, i.e. the 10 intervals should point to the 10 index labels of the other dataframe.

import pandas as pd
from pandas import DataFrame

bins=[-1,0,1,1.065,1.230,1.500,1.950,2.800,4.500,6.200,13.10]
arr=pd.cut(df1,bins)
grouped=df1.groupby(arr)
pd.value_counts(arr)


Out[58]:
(-1, 0]           4015  
(0, 1]            1948  
(1.95, 2.8]       646  
(2.8, 4.5]        542  
(1.5, 1.95]       539  
(1.23, 1.5]       427  
(1.065, 1.23]     337  
(4.5, 6.2]        127  
(1, 1.065]        125  
(6.2, 13.1]        54  
dtype: int64

Now I have to use these bins to refer to the index of df2:

data={'f11':['0','0','-0.008','0.13','0.33','0.568','0.873','1.132','1.06','0.678'],'f12':['0','0','0.588','0.683','0.487','0.187','-0.392','-1.237','-1.6','-0.327'],'f13':['0','0','-0.062','-0.151','-0.221','-0.295','-0.362','-0.412','-0.359','-0.25'],'f21':['0','0','-0.06','-0.019','0.055','0.109','0.226','0.288','0.264','0.156'],'f22':['0','0','0.072','0.066','-0.064','-0.152','-0.462','-0.823','-1.127','-1.377'],'f23':['0','0','-0.022','-0.029','-0.026','-0.014','0.001','0.056','0.131','0.251']}  

df2=DataFrame(data,columns=['f11','f12','f13','f21','f22','f23'],index=['1','2','3','4','5','6','7','8','9','10'])

Solution needed: (-1, 0] should refer to index '1', (0, 1] to index '2', and so on. The goal is to compute (f11+f12+(f21*f22*f23)) row-wise for all 8760 values, using the df2 row that each value's bin refers to.

score 0 · Answer 1 · answered Feb 25 '14 at 16:01

Map categories into integer indexes

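# Use the ordered bin categories (not order of appearance) so that (-1, 0] maps to 0, (0, 1] to 1, and so on
mapping_dict = {cat: i for i, cat in enumerate(pd.Series(arr).cat.categories)}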

category_as_int = pd.Series(arr).map(mapping_dict)

Add category_as_int as a column to df1

df1 = pd.DataFrame(df1)  # Converts df1 to a DataFrame if it is a Series

df1['key'] = category_as_int

Merge df1 and df2 (Note change in index for df2)

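# One row per bin (10 rows, not len(data) == 6 columns); the question stores the values as strings, so convert to float
df2 = DataFrame(data, columns=['f11','f12','f13','f21','f22','f23'], index=np.arange(len(data['f11']))).astype(float)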

df = pd.merge(df1, df2, left_on='key', right_index=True, how='left')

Perform operation on all 8K+ rows

df.f11 + df.f12 + (df.f21 * df.f22 * df.f23)
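
For reference, the steps above can be put together as one self-contained sketch. It is not the exact code from this answer: it assumes a recent pandas, uses randomly generated stand-in data because the real 8760 values are not shown, and uses `pd.cut(..., labels=False)` to get each bin's integer position directly instead of building a mapping dict. The names `values`, `bin_pos`, `key` and `result` are illustrative only. It keeps the asker's original string index ('1' to '10') on df2.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's 8760 values (the real data is not shown).
rng = np.random.default_rng(0)
values = pd.Series(rng.uniform(-0.5, 13.0, size=8760), name="value")

bins = [-1, 0, 1, 1.065, 1.230, 1.500, 1.950, 2.800, 4.500, 6.200, 13.10]

# labels=False returns the integer position of each bin:
# 0 for (-1, 0], 1 for (0, 1], and so on. Values outside the bins would be NaN.
bin_pos = pd.cut(values, bins, labels=False)

# Coefficient table from the question, stored as numbers rather than strings.
data = {
    "f11": [0, 0, -0.008, 0.13, 0.33, 0.568, 0.873, 1.132, 1.06, 0.678],
    "f12": [0, 0, 0.588, 0.683, 0.487, 0.187, -0.392, -1.237, -1.6, -0.327],
    "f13": [0, 0, -0.062, -0.151, -0.221, -0.295, -0.362, -0.412, -0.359, -0.25],
    "f21": [0, 0, -0.06, -0.019, 0.055, 0.109, 0.226, 0.288, 0.264, 0.156],
    "f22": [0, 0, 0.072, 0.066, -0.064, -0.152, -0.462, -0.823, -1.127, -1.377],
    "f23": [0, 0, -0.022, -0.029, -0.026, -0.014, 0.001, 0.056, 0.131, 0.251],
}
df2 = pd.DataFrame(data, index=[str(i) for i in range(1, 11)]).astype(float)

# Bin position 0 refers to index '1', position 1 to index '2', and so on.
# astype(int) assumes every value fell inside a bin (no NaN positions).
df = values.to_frame()
df["key"] = (bin_pos.astype(int) + 1).astype(str)

# Left join: each of the 8760 rows picks up the coefficients of its bin.
df = df.join(df2, on="key")

# Row-wise calculation from the question.
df["result"] = df.f11 + df.f12 + (df.f21 * df.f22 * df.f23)
```

Using the bin position avoids relying on the order in which the bins first appear in the data, which is what `unique()` would return.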

thank you so much...made some changes to your code as per convenience and eventually ended up with the solution.. — vinoth mannan, Feb 25 '14 at 23:02
