I have a dataset that has 2 columns; column A is State_Name and has 5 different options of state, and column B is Total_Spend which has the average total spend of that state per day. There are 365 observations for each state.
What I want to do is count the number of outliers PER STATE using the 1.5 IQR rule and save the count of outliers per state to a new df or table.
So I would expect an output something like:
State | Outlier Count |
---|---|
ATL | 5 |
GA | 20 |
MI | 11 |
NY | 50 |
TX | 23 |
I have managed to get it to work by doing it one state at a time but I can't figure out what to do to achieve this in a single go.
Here is my code at the moment (to return the result for a single state):
daily_agg %>%
select(State_Name, Total_Spend) %>%
filter(State_Name == "NY")
outlier_NY <- length(boxplot.stats(outlier_df$Total_Spend)$out)
Any help would be appreciated.
Thanks!
EDIT WITH TEST DATASET
outlier_mtcars <-
df %>%
select(cyl, disp) %>%
filter(cyl == "6")
outliers <- length(boxplot.stats(outlier_mtcars$disp)$out)
The above shows me 1 outlier for 6 cyl cars but I want a table that shows how many outliers for 4, 6, 8 cyl cars