here's a pretty basic question, but my brain is giving up on me and I would really appreciate some help.
I have a dataset with 10000 rows. I have an area name column with 100 unique area names. I have a type column with types ranging from 1 to 10. And I have a spend column.
I would like to group it by area name, and add a new column with an average spend per name (or even in the old spend column).
However: I only want the average of the types from 1-7. So I want to exclude any types 8, 9 or 10 that are in that are in the area. Except, if an area contains only types 8, 9 or 10. In that case, I want the average of that spend.
What I've played with, but haven't managed to actually do it:
Approach 1: Create 2 datasets, one with types1-7, another where there's only types 8, 9 or 10 in an area:
main=['1.','2.', '3.','4.', '5.', '6.', '7.']
eight_to_ten=['8.', '9.', '10.']
df_main = df[df['Type'].isin(main)]
df_main['avg_sales'] = df_main.groupby(['Area Name'])['Sales'].mean()
Approach 2:
df_new['avg_sales'] = df[df['Type'].isin(main)].groupby('Area Name')['Sales'].mean()
I assume there is a really short way of doing this, most likely without having to split the dataset into 2 and then concat it back.
Is it easier to do it with a for loop?
Any help would be appreciated