4

Consider the following two tibbles:

library(tidyverse)
a <- tibble(time = c(-1, 0), value = c(100, 200))
b <- tibble(id = rep(letters[1:2], each = 3), time = rep(1:3, 2), value = 1:6)

So a and b have the same columns and b has an additional column called id. I want to do the following: group b by id and then add tibble a on top of each group. So the output should look like this:

# A tibble: 10 x 3
   id     time value
   <chr> <int> <int>
 1 a        -1     100
 2 a         0     200
 3 a         1     1
 4 a         2     2
 5 a         3     3
 6 b        -1     100
 7 b         0     200
 8 b         1     4
 9 b         2     5
10 b         3     6

Of course there are multiple workarounds to achieve this (like loops for example). But in my case I have a large number of IDs and a very large number of columns. I would be thankful if anyone could point me towards the direction of a solution within the tidyverse.

Thank you

Cettt
  • 11,460
  • 7
  • 35
  • 58

5 Answers5

4

We can expand the data frame a with id from b and then bind_rows them together.

library(tidyverse)

a2 <- expand(a, id = b$id, nesting(time, value))
b2 <- bind_rows(a2, b) %>% arrange(id, time)
b2
# # A tibble: 10 x 3
#    id     time value
#    <chr> <dbl> <dbl>
#  1 a        -1   100
#  2 a         0   200
#  3 a         1     1
#  4 a         2     2
#  5 a         3     3
#  6 b        -1   100
#  7 b         0   200
#  8 b         1     4
#  9 b         2     5
# 10 b         3     6
www
  • 38,575
  • 12
  • 48
  • 84
  • Thank you for your answer. I wasn't aware of the expand or nesting functions. In my case the tibble a has a lot of columns. Is it possible to select all of them (inside the nesting function) without having to type their names one by one? I tried the function everything(), but it didnt work – Cettt Jul 11 '18 at 13:42
  • 1
    @Cettt Change to `a2 <- expand(a, id = b$id, nesting_(names(a)))` as a quick fix. ` – www Jul 11 '18 at 14:41
  • awesome, thank you. Can you maybe explain the difference between *nesting* and *nesting_*? – Cettt Jul 11 '18 at 14:49
  • I cannot explain very well. Please type `?nesting` and `?nesting_`. – www Jul 11 '18 at 14:50
2

split from base R will divide a data frame into a list of subsets based on an index.

b %>%
  split(b[["id"]]) %>%
  lapply(bind_rows, a) %>%
  lapply(select, -"id") %>%
  bind_rows(.id = "id")
# # A tibble: 10 x 3
#    id     time value
#    <chr> <dbl> <dbl>
#  1 a         1     1
#  2 a         2     2
#  3 a         3     3
#  4 a        -1   100
#  5 a         0   200
#  6 b         1     4
#  7 b         2     5
#  8 b         3     6
#  9 b        -1   100
# 10 b         0   200
Nathan Werth
  • 5,093
  • 18
  • 25
1

An idea (via base R) is to split your data frame and create a new one with id + the other data frame and rbind, i.e.

df = do.call(rbind, lapply(split(b, b$id), function(i)rbind(data.frame(id = i$id[1], a), i)))

which gives

    id time value
a.1  a   -1   100
a.2  a    0   200
a.3  a    1     1
a.4  a    2     2
a.5  a    3     3
b.1  b   -1   100
b.2  b    0   200
b.3  b    1     4
b.4  b    2     5
b.5  b    3     6

NOTE: You can remove the rownames by simply calling rownames(df) <- NULL

Sotos
  • 51,121
  • 6
  • 32
  • 66
1

We can nest and add the relevant rows to each nested item :

library(tidyverse)

b %>%
  nest(-id) %>%
  mutate(data= map(data,~bind_rows(a,.x))) %>%
  unnest

# # A tibble: 10 x 3
#       id  time value
#    <chr> <dbl> <dbl>
#  1     a    -1   100
#  2     a     0   200
#  3     a     1     1
#  4     a     2     2
#  5     a     3     3
#  6     b    -1   100
#  7     b     0   200
#  8     b     1     4
#  9     b     2     5
# 10     b     3     6
moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
0

Maybe not the most efficient way, but easy to follow:

library(tidyverse)
a <- tibble(time = c(-1, 0), value = c(100, 200))
b <- tibble(id = rep(letters[1:2], each = 3), time = rep(1:3, 2), value = 
1:6)

a.a <- a %>% add_column(id = rep("a",length(a)))
a.b <- a %>% add_column(id = rep("b",length(a)))
joint <- bind_rows(b,a.a,a.b)
(joint <- arrange(joint,id))
Danny
  • 383
  • 2
  • 3
  • 16
  • thank you for your answer, but as I said in the problem formulation I have a lot of different id's (about 20k) so this solution becomes infeasible. – Cettt Jul 11 '18 at 13:28