1

I need to combine some named numeric vectors in R into a data frame. I tried cbind.na as suggested in another question, but it does not take the names into account. Example:

v1 <- c(1,5,6,7)
names(v1) <- c("milk", "flour", "eggs", "sugar")
v2 <- c(2,3)
names(v2) <- c("fish", "chips")
v3 <- c(5,7,4)
names(v3) <- c("chips", "milk", "sugar")

The data frame should look like this

       v1    v2     v3
milk   1     NA     7
flour  5     NA     NA
eggs   6     NA     NA
sugar  7     NA     4
fish   NA    2      NA
chips  NA    3      5

I can't figure out how to solve this in R.

Toby

2 Answers

3

This is a join, best done with data.table or other add-ins, but (especially for smallish arrays) can readily be performed in base R by creating an array of all the names and using it to index into the input arrays:

s <- unique(names(c(v1,v2,v3)))
x <- cbind(v1=v1[s], v2=v2[s], v3=v3[s])
rownames(x) <- s
print(x)

#       v1 v2 v3
# milk   1 NA  7
# flour  5 NA NA
# eggs   6 NA NA
# sugar  7 NA  4
# fish  NA  2 NA
# chips NA  3  5

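
For more than a handful of vectors, the same indexing idea can be wrapped in a small function. This is only a sketch and not part of the answer above: `combine_by_name` is a hypothetical helper name, and it assumes every input vector has unique, non-missing names. Since the question asks for a data frame rather than a matrix, the last line converts the result with `as.data.frame`.

```r
# Hypothetical helper (illustration only): align any number of named
# vectors by name and bind them as the columns of a matrix.
combine_by_name <- function(...) {
  vecs <- list(...)
  # Union of all names, in order of first appearance
  s <- unique(unlist(lapply(vecs, names)))
  # Indexing by a name a vector lacks yields NA, which fills the gaps
  x <- do.call(cbind, lapply(vecs, function(v) unname(v[s])))
  rownames(x) <- s
  x
}

v1 <- c(milk = 1, flour = 5, eggs = 6, sugar = 7)
v2 <- c(fish = 2, chips = 3)
v3 <- c(chips = 5, milk = 7, sugar = 4)

x <- combine_by_name(v1 = v1, v2 = v2, v3 = v3)
df <- as.data.frame(x)  # row names carry the item names
print(df)
```

Passing the vectors as named arguments (`v1 = v1`) is what gives the columns their names; unnamed arguments would leave the columns unlabeled.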
whuber
  • Nice approach. You can vectorize it so you don't have to type out all the variable names (in case there are > 3): `v <- mget(paste0('v', 1:3)); s <- unique(unlist(lapply(v, names))); x <- data.frame(s, lapply(v, '[', s))` – IceCreamToucan Sep 11 '19 at 16:03
  • @IceCreamToucan Thank you. I considered doing that but thought it might obscure the answer: it's nice to see `cbind` appear prominently in the solution. But for the generalization to many arrays something like your approach is the way to go. – whuber Sep 11 '19 at 16:05
2
# get vectors into one list
v <- mget(paste0('v', 1:3))
# convert vectors to data frames
l <- lapply(v, stack)
# merge them all sequentially
out <- Reduce(function(x, y) merge(x, y, by = 'ind', all = TRUE), l)
# name the columns according to the original vector names
setNames(out, c('ind', names(v)))

#     ind v1 v2 v3
# 1  milk  1 NA  7
# 2 flour  5 NA NA
# 3  eggs  6 NA NA
# 4 sugar  7 NA  4
# 5  fish NA  2 NA
# 6 chips NA  3  5

Edit: I think this is worse than whuber's solution because it creates a number of intermediate tables, both in the lapply step and in the Reduce step. I haven't run any benchmarks, though.
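
One way to check that hunch is to time both approaches on larger inputs. The following is a rough sketch, not a rigorous benchmark: the test data (three vectors of 2,000 values with names drawn from a pool of 5,000) and the function names `index_approach` and `merge_approach` are made up for illustration, and the timings will vary by machine.

```r
set.seed(1)

# Hypothetical test data: three named vectors with partly overlapping names
pool <- paste0("item", 1:5000)
v <- lapply(1:3, function(i) setNames(runif(2000), sample(pool, 2000)))
names(v) <- paste0("v", 1:3)

# Name-indexing approach (returns a matrix)
index_approach <- function(v) {
  s <- unique(unlist(lapply(v, names)))
  x <- do.call(cbind, lapply(v, function(u) unname(u[s])))
  rownames(x) <- s
  x
}

# stack + merge approach (returns a data frame with an 'ind' key column)
merge_approach <- function(v) {
  l <- lapply(v, stack)
  out <- Reduce(function(x, y) merge(x, y, by = "ind", all = TRUE), l)
  setNames(out, c("ind", names(v)))
}

print(system.time(a <- index_approach(v)))
print(system.time(m <- merge_approach(v)))

# Sanity check: both hold the same values once the rows are aligned
same <- all.equal(unname(a[as.character(m$ind), ]), unname(as.matrix(m[-1])))
print(same)
```

For more careful measurements, a dedicated benchmarking package with repeated runs would be a better choice than a single `system.time` call.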

IceCreamToucan