Adding totals to a data frame

Question

I would like to add totals to my data frame but am having difficulties because the data is quite messy (as ever!) - some columns are text, some dates, some numeric. I can't post the actual data as it is sensitive but I will show a representative example instead which is same structure (below - needed columns are in yellow. I have been trying to do this with dplyr and pipes but run into problems due to mix of text and numbers....

data:

date <- c("17/08/2020", "17/08/2020", "17/08/2020", "17/08/2020","18/08/2020", "18/08/2020", "18/08/2020", "18/08/2020")

type <- c("type A", "type B", "type A", "type B","type A", "type B","type A", "type B")

location <- c("USA","USA","India","India","USA","USA","India","India")

value <- c("10","10","frak","frak","15","15","open","open")

df <- data.frame(date, type, location, value)

Basically, I need total summarised by date, by type and by location. enter image description here

Not sure what you want exactly since the image you provide and your description differ about the grouping. Should "frak" and "open" be filtered out , should they be NA or should they be in the summarized data? — Humpelstielzchen, Aug 17 '20 at 11:13
the text data such as frak and open cannot be summed, so "n/a" or blank is fine for those.... all the numeric data I want to sum to give the yellow rows in my image — R_debutante, Aug 17 '20 at 11:40

Humpelstielzchen · Answer 1 · 2020-08-17T12:11:25.187

Not sure if that's what you're after.

df %>%
  group_by(date, type = "total_type", location) %>%
  summarise("value" = sum(as.numeric(value), na.rm = F)) %>%
  mutate(value = as.character(value)) %>%
  bind_rows(df)

# A tibble: 12 x 4
# Groups:   date, type [6]
   date       type       location value
   <chr>      <chr>      <chr>    <chr>
 1 17/08/2020 total_type India    NA   
 2 17/08/2020 total_type USA      20   
 3 18/08/2020 total_type India    NA   
 4 18/08/2020 total_type USA      30   
 5 17/08/2020 type A     USA      10   
 6 17/08/2020 type B     USA      10   
 7 17/08/2020 type A     India    frak 
 8 17/08/2020 type B     India    frak 
 9 18/08/2020 type A     USA      15   
10 18/08/2020 type B     USA      15   
11 18/08/2020 type A     India    open 
12 18/08/2020 type B     India    open

Grouping by all columns but value reproduces your original table and in your image summarised rows have type = total_type. On the other hand all your summarised rows in the image have location USA which doesn't make sense either, so I just let that as it is.

This looks good, thanks. Although I would like the new rows added to the original data frame? — R_debutante, Aug 17 '20 at 12:05
write the `group_by` statement this way: `group_by(date, type = "total_type", location) %>%` and then use `bind_rows` to bind `df` to the dataframe with the totals — Edo, Aug 17 '20 at 12:08

Duck · Answer 2 · 2020-08-18T14:02:26.830

0

I would suggest next approach, which is also close to the one proposed by @Humpelstielzchen, that is close to what you show in the image:

library(dplyr)

df %>% bind_rows(df %>% group_by(date,location) %>%
                   mutate(value=as.numeric(value)) %>% 
                   summarise(value=sum(value,na.rm=F)) %>%
                   mutate(type='total type',value=as.character(value)))

Output:

         date       type location value
1  17/08/2020     type A      USA    10
2  17/08/2020     type B      USA    10
3  17/08/2020     type A    India  frak
4  17/08/2020     type B    India  frak
5  18/08/2020     type A      USA    15
6  18/08/2020     type B      USA    15
7  18/08/2020     type A    India  open
8  18/08/2020     type B    India  open
9  17/08/2020 total type    India  <NA>
10 17/08/2020 total type      USA    20
11 18/08/2020 total type    India  <NA>
12 18/08/2020 total type      USA    30

Update: Here an approach that could works because of OP'issues with version of package:

library(dplyr)
#Data
date <- c("17/08/2020", "17/08/2020", "17/08/2020", "17/08/2020","18/08/2020", "18/08/2020", "18/08/2020", "18/08/2020")

type <- c("type A", "type B", "type A", "type B","type A", "type B","type A", "type B")

location <- c("USA","USA","India","India","USA","USA","India","India")

value <- c("10","10","frak","frak","15","15","open","open")

df <- data.frame(date, type, location, value,stringsAsFactors = F)
#Mutate for summary
df1 <- df %>% group_by(date,location) %>%
  mutate(value=as.numeric(value)) %>% 
  summarise(value=sum(value,na.rm=F)) %>%
  mutate(type='total type') %>% ungroup()
df1$value <- as.character(df1$value)
#Bind
df2 <- rbind(df,df1)

Output:

         date       type location value
1  17/08/2020     type A      USA    10
2  17/08/2020     type B      USA    10
3  17/08/2020     type A    India  frak
4  17/08/2020     type B    India  frak
5  18/08/2020     type A      USA    15
6  18/08/2020     type B      USA    15
7  18/08/2020     type A    India  open
8  18/08/2020     type B    India  open
9  17/08/2020 total type    India  <NA>
10 17/08/2020 total type      USA    20
11 18/08/2020 total type    India  <NA>
12 18/08/2020 total type      USA    30

edited Aug 18 '20 at 14:02

answered Aug 17 '20 at 12:05

Duck

39,058
13
42
84

This looks perfect but I get an error - Error: Column `value` can't be converted from numeric to character – R_debutante Aug 17 '20 at 12:40
@R_debutante That is strange could you try `df %>% bind_rows(df %>% group_by(date,location) %>% mutate(value=as.numeric(value)) %>% summarise(value=sum(value,na.rm=F)) %>% ungroup() %>% mutate(type='total type',value=as.character(value)))` – Duck Aug 17 '20 at 12:46
Hmmm. Yes, I still get the same error message? Error: Column 'value' can't be converted from numeric to character – R_debutante Aug 17 '20 at 13:40
@R_debutante Could you please re start `R` and see if the issue persists? I believe is an issue of packages or maybe update `dplyr` to last version! – Duck Aug 17 '20 at 13:49
I am running dplyr 0.8, think this is the issue - I cannot upgrade though as is workplace server... – R_debutante Aug 18 '20 at 12:45
@R_debutante Could you please try: `df %>% bind_rows(df %>% group_by(date,location) %>% mutate(value=as.numeric(value)) %>% summarise(value=sum(value,na.rm=F)) %>% mutate(type='total type') %>% mutate_all(.funs = as.character))` – Duck Aug 18 '20 at 12:50
Hmm, that seems to generate an extra column called 'type'. Also the 'name' column now populated with 'name' and value' and the correct names are gone? – R_debutante Aug 18 '20 at 13:52
I am trying a clunkier way... 2 df's separately then rbind. However, my 2nd df is grouped class: 1] "grouped_df" "tbl_df" "tbl" "data.frame" so when I bind them it comes out a mess, THE OUTPUT LOOKS LIKE THIS: names Date name value df Character,207176 Character,207176 Character,207176 Character,207176 df2 Character,164 Character,164 Character,164 Numeric,164 – R_debutante Aug 18 '20 at 13:55
R is way harder than I thought! :-) – R_debutante Aug 18 '20 at 13:56
@R_debutante I am solving the issue now, just a moment please :) – Duck Aug 18 '20 at 13:57
@R_debutante I have added an update, could you please check and see if that works for you? – Duck Aug 18 '20 at 14:02
Thank you - still not working somehow.... I changed track and doing it little differently. Thank you though, appreciate the help – R_debutante Aug 19 '20 at 08:22
@R_debutante Oh sorry I think is because of package version. If you believe this answer helped you in any sense you could upvote. It is up to you and thanks for your time! – Duck Aug 19 '20 at 12:06

Adding totals to a data frame

2 Answers2