I am working with a data frame that has 20 ids and for each Id, there are about 10-15 stores, and each store is assigned a status (Zero, Negative and Positive).
Data:
data =
ID STORE STATUS
100001 1 zero
100001 2 positive
100001 3 zero
100001 4 negative
100001 5 zero
100001 6 zero
100001 7 positive
100001 8 negative
....
Problem:
I would like to find out what percentage of stores are zeros/ positive/negative for each id.
My attempt:
zero = 0
for item in data['ID'].unique():
items = data[data['ID']==item]
lenght = len(items)
print(lenght)
for index, i in items.iterrows():
if i['STATUS'] == 'Zero':
zero += 1
pct = zero/lenght
items['PCT'] = zero/lenght
print(items)
My result from code above:
ID STORE STATUS. PCT
100001 1 zero 0.5
100001 2 positive 0.5
100001 3 zero 0.5
100001 4 negative 0.5
100001 5 zero 0.5
100001 6 zero 0.5
100001 7 positive 0.5
100001 8 negative 0.5
Expected result :
ID STORE STATUS. PCT
100001 1 zero 0.5
100001 2 positive 0.25
100001 3 zero 0.5
100001 4 negative 0.25
100001 5 zero 0.5
100001 6 zero 0.5
100001 7 positive 0.25
100001 8 negative 0.25