
I have data like the following example:

| a | b |
|---|---|
| 1 | 1 |
| 1 | 0 |
| 2 | 1 |
| 2 | NA|
| 3 | 0 |
| 4 | NA|
| 4 | NA|
| 4 | NA|
| 5 | 1 |
| 5 | NA|
| 5 | 0 |
| 5 | 1 |
| 6 | 0 |

I need to create a new data frame by summing b within each group of a. If every value of b in a group is NA, the output should be NA instead of zero, like this:

| a | b |
|---|---|
| 1 | 1 |
| 2 | 1 |
| 3 | 0 |
| 4 | NA|
| 5 | 2 |
| 6 | 0 |

How can I structure a sum in R to behave like this?

Thank you

ThomasIsCoding
Dogener

2 Answers


A base R option using aggregate

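aggregate(. ~ a,
  df,
  # return NA when the whole group is NA, otherwise sum the non-missing values
  function(x) if (all(is.na(x))) NA else sum(x, na.rm = TRUE),
  # keep rows with NA; the default na.omit would drop them before grouping
  na.action = na.pass
)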

gives

  a  b
1 1  1
2 2  1
3 3  0
4 4 NA
5 5  2
6 6  0
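
To see why both the all-NA check and `na.action = na.pass` matter, here is a minimal base R sketch. The helper name `sum_or_na` is made up for illustration and the data is retyped from the question as plain numerics; treat it as one possible way to spell the same idea, not as the only one.

```r
# Sample data retyped from the question
df <- data.frame(
  a = c(1, 1, 2, 2, 3, 4, 4, 4, 5, 5, 5, 5, 6),
  b = c(1, 0, 1, NA, 0, NA, NA, NA, 1, NA, 0, 1, 0)
)

# The sum of an empty set is 0, so dropping the NAs
# from an all-NA group leaves nothing to add up
sum(c(NA, NA, NA), na.rm = TRUE)

# Hypothetical helper: NA if every value is missing,
# otherwise the sum of the non-missing values
sum_or_na <- function(x) {
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
}

# na.pass keeps the rows where b is NA, so group 4 survives
res <- aggregate(b ~ a, data = df, FUN = sum_or_na, na.action = na.pass)
res

# With the formula method's default na.action (na.omit), rows with
# NA are removed before grouping and group 4 disappears entirely
res_default <- aggregate(b ~ a, data = df, FUN = sum)
res_default
```

Since `FUN` receives a single vector per group, a plain `if`/`else` is enough here; `ifelse()` also works but is meant for vectorised conditions.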

data

> dput(df)
structure(list(a = c(1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L, 5L, 5L, 
5L, 5L, 6L), b = c(1L, 0L, 1L, NA, 0L, NA, NA, NA, 1L, NA, 0L,
1L, 0L)), class = "data.frame", row.names = c(NA, -13L))
ThomasIsCoding

Using sum_ from hablar

library(dplyr)
library(hablar)
df %>% 
    group_by(a) %>%
    summarise(b = sum_(b))

-output

# A tibble: 6 x 2
      a     b
  <int> <int>
1     1     1
2     2     1
3     3     0
4     4    NA
5     5     2
6     6     0
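
If adding hablar as a dependency is not desirable, the same rule can be sketched with dplyr alone. This is not part of the original answer. It assumes that `sum_` returns NA for an all-NA group and otherwise ignores NAs, which is consistent with the output above.

```r
library(dplyr)

# Same data as in the question
df <- structure(list(
  a = c(1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L),
  b = c(1L, 0L, 1L, NA, 0L, NA, NA, NA, 1L, NA, 0L, 1L, 0L)
), class = "data.frame", row.names = c(NA, -13L))

res <- df %>%
  group_by(a) %>%
  # NA_integer_ keeps the column type consistent with the integer sums
  summarise(b = if (all(is.na(b))) NA_integer_ else sum(b, na.rm = TRUE))

res
```

The typed `NA_integer_` is used so that every group returns the same type; with a double column, `NA_real_` would be the matching choice.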

data

df <- structure(list(a = c(1L, 1L, 2L, 2L, 3L, 4L, 4L, 4L, 5L, 5L, 
5L, 5L, 6L), b = c(1L, 0L, 1L, NA, 0L, NA, NA, NA, 1L, NA, 0L,
1L, 0L)), class = "data.frame", row.names = c(NA, -13L))
akrun