I have an original data.frame and I would like to run lapply on certain columns, then cbind the remaining columns with the results of the lapply operation.

See the code below. Ideally, b would contain the id column from the data.frame alongside the results of the lapply. I assume my error is that the list I pass to do.call contains a list inside a list: the first element is a vector, which cbind could handle, but the second element is itself a list. How should I handle this?

Thanks!

df <- data.frame(id = 1:10,
                 colB = 11:20,
                 colC = 21:30)

a <- lapply(df[,2:3],
            function(x) {x = 10 * x}
)

b <- do.call(cbind,
             list(df[,1],
                  a))

Created on 2019-02-16 by the reprex package (v0.2.0).

Prevost

3 Answers

The difference is subtle but important: for your code to work the way you want it you need

b <- do.call(cbind, list(df[1], a))
#                        ^^^^^

Result

b
#   id colB colC
#1   1  110  210
#2   2  120  220
#3   3  130  230
#4   4  140  240
#5   5  150  250
#6   6  160  260
#7   7  170  270
#8   8  180  280
#9   9  190  290
#10 10  200  300

The difference is that df[1] returns a data.frame, while df[,1] returns a vector. cbind has a method for data.frames, which is what gets called in the case above but not in your original code.
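
To see the distinction directly, here is a small sketch that reuses the `df` from the question. The names `bad` and `good` are introduced only for illustration, and `function(x) 10 * x` is assumed to be equivalent to the function in the question.

```r
df <- data.frame(id = 1:10,
                 colB = 11:20,
                 colC = 21:30)

# Same transformation as in the question, written without the assignment
a <- lapply(df[, 2:3], function(x) 10 * x)

class(df[1])     # "data.frame": single-bracket indexing keeps the structure
class(df[, 1])   # "integer": the column is dropped to a plain vector

# Original call: no argument is a data.frame, so cbind's default method runs
bad <- do.call(cbind, list(df[, 1], a))
is.data.frame(bad)    # FALSE

# Corrected call: df[1] is a data.frame, so the data.frame method is used
good <- do.call(cbind, list(df[1], a))
is.data.frame(good)   # TRUE
names(good)           # "id" "colB" "colC"
```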

markus
  • In general, will `do.call` treat a list element that is itself a list of data.frames as a series of data.frames? For example, in `do.call(cbind, list(df1, list_of_dfs))`, `list_of_dfs` is a list of data.frames, so the argument is in essence a list of lists. If `a` contained a plot or some other object, this may not work. In hindsight, I may have oversimplified my example... :) – Prevost Feb 17 '19 at 00:43
  • @Prevost Yes and yes. You might ask a new question that includes this new requirement. – markus Feb 17 '19 at 10:38
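
For the follow-up in the comments, one possible approach (a sketch, not something stated in the answer) is to flatten the arguments with `c()` so that every data.frame becomes its own argument to `cbind`. The objects `list_of_dfs`, `args`, and the column names `x` and `y` are hypothetical, and the sketch assumes every element really is a data.frame with the same number of rows.

```r
df <- data.frame(id = 1:3, colB = 4:6, colC = 7:9)

# Hypothetical list of data.frames, standing in for list_of_dfs in the comment
list_of_dfs <- list(data.frame(x = df$colB * 10),
                    data.frame(y = df$colC * 10))

# c() concatenates the two plain lists into one flat list of three data.frames
args <- c(list(df[1]), list_of_dfs)
length(args)   # 3

# Each element of args is passed to cbind as a separate argument
b <- do.call(cbind, args)
names(b)       # "id" "x" "y"
```

If some elements are not data.frames (a plot object, for instance), they would have to be filtered out or converted first, so this only covers the simple case.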

cbind returns a data.frame only when at least one of its arguments is a data.frame.

So, rewrite b as:

b <- cbind(df[1], as.data.frame(a))

> b
   id colB colC
1   1  110  210
2   2  120  220
3   3  130  230
4   4  140  240
5   5  150  250
6   6  160  260
7   7  170  270
8   8  180  280
9   9  190  290
10 10  200  300

Note that df[1], rather than df[, 1], is used so that the id column stays a data.frame.
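
As a quick check of the two pieces involved, the sketch below reuses the question's `df`. The names `a_df`, `b1`, and `b2` are introduced here for illustration, and `drop = FALSE` is shown as an alternative spelling that was not part of the original answer.

```r
df <- data.frame(id = 1:10,
                 colB = 11:20,
                 colC = 21:30)
a <- lapply(df[, 2:3], function(x) 10 * x)

# A named list of equal-length vectors converts cleanly to a data.frame
a_df <- as.data.frame(a)
str(a_df)

# Two ways of keeping the id column as a one-column data.frame
b1 <- cbind(df[1], a_df)
b2 <- cbind(df[, 1, drop = FALSE], a_df)   # drop = FALSE prevents simplification

all.equal(b1, b2)   # TRUE
```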

Sai SL

You could also use bind_cols() from the dplyr package, which can also work with lists.

library(dplyr)
bind_cols(df[1], a)

bind_cols() expects all the arguments to be either data frames, named lists or named arguments. This is why you need to write df[1] instead of df[, 1] in this case as well. However, if you explicitly name the argument, df[, 1] will also work:

bind_cols(id = df[, 1], a)
Stibu