So i wanted to find the weight average of grades with thick as the weight. Problem is, i need to do it everytime the Grade shows >0 consecutively, and if it's 0 consecutively, i would still like to find it's weight average although its 0 for sure.
Below is the tables including the results that i wanted to find. As you can see there is several empty values / NaN on Total Thickness and WA (Weight Average). Each values above empty values is to represent the each group's WA that i need to find
Basically, i need to groupby based on it's grade (either it's all 0 or it's all >0) then i need to find the weight average of each of those groups.
BHID Thick Total Thickness Grade WA
01/01/1970 2 2.5 0 0
01/01/1970 0.5 0
01/01/1970 0.5 2.5 3.35 0.71
01/01/1970 2 0.05
01/01/1970 0.8 1.5 0 0
01/01/1970 0.7 0
01/02/1970 2 4 3.3 4.075
01/02/1970 0.5 2.8
01/02/1970 0.5 0
01/02/1970 1 8.3
01/02/1970 1.7 2.3 0 0
01/02/1970 0.6 0
note: you may notice on 01/02/1970, there is a single 0 on it's Grade but it is not consecutive so it doesn't count as a new group. So there is only 5 groups with 5 WA values (0, 0.71, 0, 4.075, 0)
I've tried to groupby using BHID, and managed to find wavg (weighted average) for each BHID, pretty good
wavg
BHID
1970-01-01 0.273077
1970-02-01 2.587302
but when i tried to groupby Grades, it surely just groupby unique grade values which is not what i was looking for. I was trying to groupby by Grade's ranges.
wavg
BHID Grade
1970-01-01 0.00 0.00
3.35 3.35
0.05 0.05
1970-02-01 3.30 3.30
2.80 2.80
0.00 0.00
8.30 8.30
here is the code that i used
def my_agg(x):
names = np.average(x['Grade'],weights=x['Thick'],axis=0)
return pd.Series(names,index=['wavg'])
print(df1.groupby(['BHID','Grade'],sort=False,).apply(my_agg))
i just want to know how i can find the weight average from groupby based on Grade ranges.