I'm running into problems when trying to compute row sums in dplyr.
After grouping and reshaping the data with
data <- data %>%
  group_by(location, category) %>%
  summarise(amount = sum(amount)) %>%
  spread(key = "category", value = "amount", fill = 0)
The output is:
# A tibble: 4,211 x 140
# Groups: location [4,211]
location art books cars
* <chr> <dbl> <dbl> <dbl>
1 New York, NY 0 10 0
2 Los Angeles, CA 12 0 2
...
Computing the row sums then fails:
data %>% mutate(sum=rowSums(.))
Error in mutate_impl(.data, dots) :
Evaluation error: 'x' must be numeric.
> class(data)
[1] "grouped_df" "tbl_df" "tbl" "data.frame"
I also tried coercing the sum to numeric before reshaping, but that didn't help either:
data <- data %>%
  group_by(location, category) %>%
  summarise(amount = as.numeric(sum(amount))) %>% # Changed
  spread(key = "category", value = "amount", fill = 0)
str(data.frame(data))
'data.frame': 4211 obs. of 140 variables:
$ location : chr "New York, NY" "Los Angeles, CA" ... ...
$ art : num 0 0 0 0 0 0 0 0 0 0 ...
$ books : num 0 0 0 0 0 0 0 0 0 0 ...
$ cars : num 0 0 0 0 0 0 0 0 0 0 ...
...
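The str() output above shows that location is chr while the category columns are num. As far as I can tell, that alone is enough to trigger the error: rowSums() converts a data frame to a matrix first, and a single character column makes the whole matrix character. A minimal sketch in base R, using made-up toy values (toy, res and sums are names I introduced for illustration, not my real data):

```r
# Toy stand-in for the spread() output; the values are made up.
toy <- data.frame(
  location = c("New York, NY", "Los Angeles, CA"),
  art      = c(0, 12),
  books    = c(10, 0),
  cars     = c(0, 2),
  stringsAsFactors = FALSE
)

# With the character column included, rowSums() raises an error;
# tryCatch() captures the message instead of stopping the script.
res <- tryCatch(rowSums(toy), error = function(e) conditionMessage(e))
print(res)

# Restricting the call to the numeric columns avoids the error.
sums <- rowSums(toy[, c("art", "books", "cars")])
print(sums)
```

So the as.numeric() change above probably could not help, since the amounts were already numeric and the character column was still part of the input.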
I would appreciate some help here.
After calculating the sum of each row, I need to filter locations that have a row sum < 1000. I would also like to know how to do this and whether dplyr
is the right approach in general.
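
To make the goal concrete, here is one possible approach sketched on toy data. It rests on two assumptions: that the row sum should cover only the numeric columns, and that "filter" means keeping the locations whose total is below 1000. The values and the column name total are made up for illustration.

```r
library(dplyr)

# Toy stand-in for the wide table; the values are made up.
toy <- tibble(
  location = c("New York, NY", "Los Angeles, CA", "Chicago, IL"),
  art      = c(0, 12, 900),
  books    = c(10, 0, 200),
  cars     = c(0, 2, 0)
)

result <- toy %>%
  ungroup() %>%                                          # drop any leftover grouping
  mutate(total = rowSums(select_if(., is.numeric))) %>%  # sum numeric columns only
  filter(total < 1000)                                   # assumption: keep totals below 1000

print(result)
```

select_if() is used here because it is available in older dplyr releases as well; whether this is the idiomatic way to do it is exactly what I am unsure about.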