0

I am trying to figure out the best way to consolidate my data frame, and I seem to be hitting a roadblock. How can I combine these two rows so that Michael's trial and purchase dates sit on the same row?

         **user**    | **trial_date**  | **purchase_date**
         Michael     |   01-02-2016    |      NA 
         Michael     |      NA         |   02-15-2016
msidd2016

2 Answers

2

You can use the gather and spread functions from tidyr to get rid of the NAs: first gather the two date columns into a single column, then filter out the NAs in that combined column, and finally spread the values back out into separate columns.

library(dplyr)
library(tidyr)

df %>%
  group_by(user) %>% 
  gather("type", "date", trial_date, purchase_date) %>% 
  filter(!is.na(date)) %>%
  spread(type, date)


#      user purchase_date trial_date
# *  <fctr>         <chr>      <chr>
# 1 Michael    02-15-2016 01-02-2016
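
For anyone who wants to try the gather/filter/spread steps end to end, here is a minimal, self-contained sketch. The data frame `df` is a hand-built reconstruction of the two rows in the question, with dates stored as character strings; that construction is an assumption, since the original data is not shown in full.

```r
library(dplyr)
library(tidyr)

# Hypothetical reconstruction of the question's data (dates as character strings)
df <- data.frame(
  user = c("Michael", "Michael"),
  trial_date = c("01-02-2016", NA),
  purchase_date = c(NA, "02-15-2016"),
  stringsAsFactors = FALSE
)

result <- df %>%
  gather("type", "date", trial_date, purchase_date) %>%  # long format: one row per user/type pair
  filter(!is.na(date)) %>%                               # drop the NA placeholders
  spread(type, date)                                     # back to one column per date type

print(result)
```

Note that spread() raises an error when a user has more than one non-NA value for the same date type, so this sketch assumes at most one trial date and one purchase date per user.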
yeedle
0

You can take the first non-NA value from each column after grouping the data frame by user. If all elements in a group are NA, .[!is.na(.)] returns a zero-length vector, and indexing it with [1] yields NA:

df %>% group_by(user) %>% summarise_all(funs(.[!is.na(.)][1]))

# A tibble: 1 × 3
#     user trial_date purchase_date
#   <fctr>     <fctr>        <fctr>
#1 Michael 01-02-2016    02-15-2016
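
As a rough sketch of the same first-non-NA idea on reproducible data, the example below uses summarise() with across(), which is available in dplyr 1.0.0 and later. The data frame, the extra user "Anna", and the helper name `first_non_na` are all made up for illustration. Anna has no purchase date, which shows the all-NA case.

```r
library(dplyr)

# Hypothetical data: Michael's rows from the question plus an invented user
# with no purchase date, to exercise the all-NA case
df <- data.frame(
  user = c("Michael", "Michael", "Anna"),
  trial_date = c("01-02-2016", NA, "03-01-2016"),
  purchase_date = c(NA, "02-15-2016", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: first non-NA element, or NA when there is none
first_non_na <- function(x) x[!is.na(x)][1]

result <- df %>%
  group_by(user) %>%
  summarise(across(everything(), first_non_na))  # applies to every non-grouping column

print(result)
```

Unlike the gather/spread route, this keeps only the first value when a user has several non-NA entries in a column, rather than raising an error.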
Psidom