How to combine rows with the same identifier in R?

Question

I have searched a lot but can't find an answer to what I'm looking for. The rows were originally melted together and then I spread them, and now I have a data frame that looks similar to the one below.

Here is the dput:

structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L), 
    `first name` = c("Jamie", NA, NA, NA, NA, "sandra", NA, NA, 
    NA, NA), `last name` = c(NA, "Johns", NA, NA, NA, NA, NA, 
    "chan", NA, NA), q1_ans = c(NA, NA, "yes", NA, NA, NA, "yes", 
    NA, NA, NA), q2_ans = c(NA, NA, NA, "no", NA, NA, NA, NA, 
    "yes", NA), q3_ans = c(NA, NA, NA, NA, "yes", NA, NA, NA, 
    NA, "no")), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"), spec = structure(list(cols = list(ID = structure(list(), class = c("collector_integer", 
"collector")), `first name` = structure(list(), class = c("collector_character", 
"collector")), `last name` = structure(list(), class = c("collector_character", 
"collector")), q1_ans = structure(list(), class = c("collector_character", 
"collector")), q2_ans = structure(list(), class = c("collector_character", 
"collector")), q3_ans = structure(list(), class = c("collector_character", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector"))), class = "col_spec"))

The real data frame has many more rows and a few more columns. I want to combine the rows so that everything for ID 1 is on one row, everything for ID 2 is on one row, and so on. I've tried this, but it hasn't gotten me anywhere:

qr <- qr %>% 
  group_by(., ID) %>%
  rowwise() %>%
  summarise_all(funs(first(na.omit(.))))

I get the error:

Error in summarise_impl(.data, dots) : 
  Column `first name` must be length 1 (a summary value), not 0

I also tried dcast but that didn't help either. Thanks!

Please use `dput` to show the example data and an expected output. It is not clear whether the elements are blank (`""`) or `NA`. If they are `NA`, then `qr %>% group_by(ID) %>% summarise_all(na.omit)` should work, provided there is only one non-NA value per group in each column — akrun, Jul 25 '18 at 15:51
Can you please check that the dput is correct? I get an error when I copy and paste it. Also, which libraries are loaded? — akrun, Jul 25 '18 at 16:04
Can you show a dput example where it fails so that the code can be tested? — akrun, Jul 25 '18 at 16:36

score 5 · Accepted Answer · answered Jul 25 '18 at 16:15

We don't need the rowwise. After grouping by 'ID', use na.omit inside summarise_all (assuming that there is only a single non-NA element within each 'ID' for each of the columns):

qr %>%
    group_by(ID) %>%
    summarise_all(na.omit)
# A tibble: 2 x 6
#     ID `first name` `last name` q1_ans q2_ans q3_ans
#  <int> <chr>        <chr>       <chr>  <chr>  <chr> 
#1     1 Jamie        Johns       yes    no     yes   
#2     2 sandra       chan        yes    yes    no
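
For readers on a newer dplyr (1.0.0 or later), where `summarise_all()` is superseded by `across()` and `funs()` is deprecated, the same idea can be sketched as follows. This is only an illustration, not the answerer's code: `qr_demo` and `first_non_na` are hypothetical names, and the data is a small made-up stand-in for the question's data frame. Taking the first non-NA value with base indexing also keeps an all-NA group as `NA` instead of producing a length-0 result.

```r
library(dplyr)

# Hypothetical stand-in for the question's data: one non-NA value per column
# per ID, except q3_ans, which is entirely missing for ID 2.
qr_demo <- tibble(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L),
  `first name` = c("Jamie", NA, NA, "sandra", NA, NA),
  q1_ans = c(NA, "yes", NA, NA, "yes", NA),
  q3_ans = c(NA, NA, "yes", NA, NA, NA)
)

# Keep the first non-NA value of a vector.
# Indexing an empty vector with [1] yields NA, so an all-NA group stays NA.
first_non_na <- function(x) x[!is.na(x)][1]

# One row per ID; everything() here covers all non-grouping columns.
combined <- qr_demo %>%
  group_by(ID) %>%
  summarise(across(everything(), first_non_na), .groups = "drop")

combined
```

If a group can hold more than one non-NA value in a column, this sketch silently keeps only the first one, so it fits best when the "single non-NA element per ID" assumption is expected to hold.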

If there are multiple non-NA elements in a column per 'ID', then either create a string by concatenating all the non-NA elements

qr %>%
    group_by(ID) %>%
    summarise_all(funs(toString(na.omit(.))))

or create a list and then do unnest

qr %>%
   group_by(ID) %>%
   summarise_all(funs(list(na.omit(.))))
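
To make the difference between the two alternatives concrete, here is a minimal sketch on made-up data in which ID 1 has two non-NA answers in one column. It assumes dplyr 1.0.0 or later and tidyr 1.0.0 or later; `qr_multi` and `drop_na_values` are hypothetical names introduced only for this illustration. A list column prints as its type (for example `<chr [2]>`) until it is extracted or unnested.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: ID 1 has two non-NA values in q1_ans.
qr_multi <- tibble(
  ID = c(1L, 1L, 1L, 2L, 2L),
  `first name` = c("Jamie", NA, NA, "sandra", NA),
  q1_ans = c(NA, "yes", "no", NA, "yes")
)

# Helper: drop missing values from a vector.
drop_na_values <- function(x) x[!is.na(x)]

# Option 1: collapse all non-NA values into one comma-separated string.
as_strings <- qr_multi %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ toString(drop_na_values(.x))),
            .groups = "drop")

# Option 2: keep all non-NA values of each column in a list column.
as_lists <- qr_multi %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ list(drop_na_values(.x))),
            .groups = "drop")

# Expand one list column back into rows; the other list columns are repeated.
q1_long <- unnest(as_lists, cols = q1_ans)

as_strings
q1_long
```

Unnesting one column at a time sidesteps the case where list columns in the same row hold different numbers of values.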

For the first one, I'm still getting the same error. When I do the second one, the first name and last name columns are blank. With the third one, only the variable type shows up in each column — mjoy, Jul 25 '18 at 16:32
@mjoy If it is the `toString`, it is returning all the names, while with `list`, it is a `list` column, you need to extract the `list`. The output is based on your example and as I mentioned, it works when there is exactly one non-NA element per group — akrun, Jul 25 '18 at 16:33
