I have this dataset saved as a matrix named `s2`:

       [,1]
0         4
0.5       1
1         6
10       61
15       28
2         8
20       25
23        1
25        4
3         3
30       44

I want to group the rows into bins based on their row names and sum each bin, to get something like this, for example:

         [,1]
0-10      22
10-20     89
20-30     30
30-40     48

Is there a faster way to do this (my real dataset is much bigger) than writing `s2[1,] = s2[1,] + s2[2,] + ...` and then deleting the leftover rows? I tried `aggregate`, and I read something about `wordStem()`, but I couldn't get anywhere. Thank you.

HHH
  • You'll want to convert `s2` to a `data.frame` and use your favorite data munging tool (`data.table` or `dplyr`) to perform a grouping operation. Or, curse the idea, use `tapply` – MichaelChirico Aug 01 '18 at 11:45
  • `df %>% mutate(var1 = floor(abs(df$var1) / 10)) %>% group_by(var1) %>% summarise(var2 = sum(var2))` – Andre Elrico Aug 01 '18 at 11:53

1 Answer

Assuming your matrix is named `m`, you can do this:

library(tidyverse)

# specify your breaks for the grouping
brks <- c(0, 10, 20, 30, 40)

data.frame(m) %>%                              # create a dataframe from your matrix
  rownames_to_column() %>%                     # add rownames as a column
  mutate(rowname = as.numeric(rowname)) %>%    # make that column numeric (in order to group)
  group_by(group = cut(rowname, breaks = brks, right = FALSE)) %>%  # left-closed bins: [0,10), [10,20), ...
  summarise(m = sum(m)) %>%                    # sum the values (column m) within each group
  data.frame() %>%                             # create a dataframe from tibble (in order to have rownames)
  column_to_rownames("group")                  # add rownames from your group column

#          m
# [0,10)  22
# [10,20) 89
# [20,30) 30
# [30,40) 44
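
For comparison, here is a minimal base R sketch along the lines of the `tapply` suggestion in the comments on the question. It rebuilds the example matrix from the question, so it runs on its own. The variable names `keys`, `groups`, `sums` and `result` and the bin labels are assumptions chosen for illustration, not something from the original post.

```r
# Rebuild the example matrix from the question (row names hold the numeric keys)
s2 <- matrix(c(4, 1, 6, 61, 28, 8, 25, 1, 4, 3, 44),
             dimnames = list(c("0", "0.5", "1", "10", "15", "2",
                               "20", "23", "25", "3", "30"), NULL))

# Same breaks as above; right = FALSE gives left-closed bins such as [0, 10)
brks <- c(0, 10, 20, 30, 40)
keys <- as.numeric(rownames(s2))

# Illustrative labels matching the layout asked for in the question
groups <- cut(keys, breaks = brks, right = FALSE,
              labels = c("0-10", "10-20", "20-30", "30-40"))

# Sum the single column within each bin
sums <- tapply(s2[, 1], groups, sum)

# Back to a one-column matrix with the bin labels as row names
result <- matrix(as.vector(sums), dimnames = list(names(sums), NULL))
result
# sums per bin: 22, 89, 30, 44
```

This avoids loading any packages. Note that a bin with no rows would come back as `NA` from `tapply`, so real data with gaps may need an extra step.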
AntoniosK
  • Thank you, it was very helpful! I just needed to change it to `summarise(m=sum(group))` – HHH Aug 01 '18 at 13:28