Formatting grouped data for tables in R

Question

I'm trying to display my data in table format and I can't figure out how to rearrange my data to display it in the proper format. I'm used to wrangling data for plots, but I'm finding myself a little lost when it comes to preparing tables. This seems like something really basic, but I haven't been able to find an explanation on what I'm doing wrong here.

I have 3 columns of data, Type, Year, and n. The data formatted as it is now produces a table that looks like this:

Type    Year    n
Type C  1   5596
Type D  1   1119
Type E  1   116
Type A  1   402
Type F  1   1614
Type B  1   105
Type C  2   26339
Type D  2   14130
Type E  2   98
Type A  2   3176
Type F  2   3071
Type B  2   88

What I want to do is to have Type as row names, Year as column names, and n populating the table contents like this:

         1      2        
Type A   402    3176   
Type B   105    88
Type C   26339  5596
Type D   1119   14130
Type E   116    98
Type F   1614   3071

The mistake might have been made upstream from this point. Using the full original data set I arrived at this output by doing the following:

exampletable <- df %>%
  group_by(Year) %>%
  count(Type) %>%
  select(Type, Year, n)

Here is the dput() output

structure(list(Type = c("Type C", "Type D", "Type E", "Type A", 
"Type F", "Type B", "Type C", "Type D", "Type E", "Type A", "Type F", 
"Type B", "Type C", "Type D", "Type E", "Type A", "Type F", "Type B", 
"Type C", "Type D", "Type E", "Type A", "Type F", "Type B", "Type C", 
"Type D", "Type E"), Year = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 
2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5), n = c(5596, 
1119, 116, 402, 1614, 105, 26339, 14130, 98, 3176, 3071, 88, 
40958, 17578, 104, 3904, 3170, 102, 33145, 23800, 93, 1264, 7084, 
1262, 34642, 24911, 504)), class = c("spec_tbl_df", "tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -27L), spec = structure(list(
    cols = list(Type = structure(list(), class = c("collector_character", 
    "collector")), Year = structure(list(), class = c("collector_double", 
    "collector")), n = structure(list(), class = c("collector_double", 
    "collector"))), default = structure(list(), class = c("collector_guess", 
    "collector")), skip = 1), class = "col_spec"))

score 1 · Accepted Answer · answered May 13 '20 at 06:29

1

You can get the data in wide format and change Type column to rowname.

tidyr::pivot_wider(df, names_from = Year, values_from = n) %>%
   tibble::column_to_rownames('Type')

#          1     2     3     4     5
#Type C 5596 26339 40958 33145 34642
#Type D 1119 14130 17578 23800 24911
#Type E  116    98   104    93   504
#Type A  402  3176  3904  1264    NA
#Type F 1614  3071  3170  7084    NA
#Type B  105    88   102  1262    NA

answered May 13 '20 at 06:29

Ronak Shah

377,200
20
156
213

That did the trick. I had a feeling it would be something simple. I'm trying to add a row on the end that would be the total for each year. I tried adding the data before using pivot_wider, but I ended up with an additional column along with the years. Not sure how to add it to the bottom of the types. Any ideas? – Corey May 13 '20 at 07:59
1

If you store the output of the above in `df1`, you can use `colSums`, `rbind(df1, colSums(df1, na.rm = TRUE))` – Ronak Shah May 13 '20 at 08:13

score 0 · Answer 2 · answered May 13 '20 at 07:15

You can use tidyr package to get to wider format and tibble package to convert a column to rownames

dataset <- read.csv(file_location)
dataset <- tidyr::pivot_wider(dataset, names_from = Year, values_from = n)

tibble::column_to_rownames(dataset, var = 'Type')
       1     2
Type C 5596 26339
Type D 1119 14130
Type E  116    98
Type A  402  3176
Type F 1614  3071
Type B  105    88

Formatting grouped data for tables in R

2 Answers2