I am using a loop to build up a data frame column by column with cbind, naming each column as I go. Every time I try to name the column with the loop variable (i), the value of i is not used; the column is literally named "i" instead. I want each column to keep its actual name from the original data frame.

x <- seq(0, 50, by = 1)
y <- seq(50, 100, by = 1)
z <- seq(25, 75, by = 1)

df <- data.frame(cbind(x, y, z))

df_final <- NULL
for (i in colnames(df)){

#PROBLEM: every column is named "i" instead of the actual column name
df_final <- cbind(df_final, i = df[,i])

}
df_final
aynber
Sharif Amlani
  • Column names after `data.frame(cbind(x, y, z))` are already `x y z`. What are you trying to do in the loop? – Shree Jul 15 '19 at 23:43
  • Side note: try to avoid creating data frames with `data.frame(cbind(...))`; at best it's redundant and at worst it will lead to unexpected/incorrect results. `data.frame(x = x,y = y,z = z)` is sufficient. – joran Jul 15 '19 at 23:51
  • To your question, `cbind` is the wrong tool for renaming column names. You can create a new column with `df_final[["new_col"]] <- foo` and do this with a name stored in a variable with `df_final[[bar]] <- foo` where `bar <- "new_var"`. – joran Jul 15 '19 at 23:52
  • @joran, do you know if iteratively `cbind`ing has the same consequence (memory-wise) as iteratively `rbind`ing? That is, does a `cbind` operation copy the frame completely each time, or just extend the `list` efficiently? – r2evans Jul 16 '19 at 00:25
  • @r2evans I guess `cbind` when called on a data frame is literally just a wrapper for `data.frame(...)`, which has been made fairly efficient. I don't see any evidence of copying when I tried a small example with `tracemem`. – joran Jul 16 '19 at 01:24
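
The two points in joran's comments can be illustrated with a small sketch. The vectors `id` and `label` and the column name `"doubled"` are made up for this example and are not part of the question; the idea is only to show that `cbind` goes through a matrix (so mixed types are coerced) and that `[[<-` accepts a name held in a variable.

```r
# Hypothetical data: one integer vector and one character vector
id <- 1:3
label <- c("a", "b", "c")

# cbind() builds a matrix first, and a matrix has a single type,
# so the integer column no longer comes back as integer
via_cbind <- data.frame(cbind(id, label))
class(via_cbind$id)

# data.frame() keeps each column's own type
direct <- data.frame(id = id, label = label)
class(direct$id)

# Adding a column whose name is stored in a variable
new_name <- "doubled"  # hypothetical column name
direct[[new_name]] <- direct$id * 2
colnames(direct)
```

With all-numeric columns, as in the question, the coercion does no harm, which is why the original `data.frame(cbind(x, y, z))` appears to work.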

1 Answer

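In `cbind(df_final, i = df[,i])`, the `i` on the left of `=` is taken as a literal argument name, not as the loop variable, so every column is named "i". A simple solution is to set the column name inside the loop: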

df_final <- NULL
for (i in colnames(df)){
  df_final <- cbind(df_final, df[,i])
  colnames(df_final)[ncol(df_final)]  <- i
}
colnames(df_final)
#[1] "x" "y" "z"
str(df_final)
# num [1:51, 1:3] 0 1 2 3 4 5 6 7 8 9 ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:3] "x" "y" "z"
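
As a rough illustration of why `i = ...` produces a column called "i", and of one way to pass the name through the value of `i` instead, here is a minimal sketch. The small `df` and the choice `i <- "y"` are assumptions made for this example; `do.call` with `setNames` is one possible alternative, not necessarily the preferred one.

```r
# Small stand-in for the data frame in the question
df <- data.frame(x = 0:2, y = 50:52)
i <- "y"

# The left-hand side of `=` in a call is a literal argument name,
# so the column is labelled "i" whatever value the variable i holds
literal <- cbind(NULL, i = df[, i])
colnames(literal)

# Building a named argument list lets the value of i become the name
named <- do.call(cbind, setNames(list(df[, i]), i))
colnames(named)
```

Setting `colnames(df_final)[ncol(df_final)] <- i` after each `cbind`, as above, achieves the same thing and is arguably easier to read.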

To use the `x[[i]] <- value` method, `x` needs to be a data frame that already has the right number of rows:

df_final <- data.frame()[seq_len(nrow(df)),0] #Create empty data frame with rows
for (i in colnames(df)){
  df_final[[i]] <- df[,i]
}
colnames(df_final)
#[1] "x" "y" "z"
str(df_final)
#'data.frame':   51 obs. of  3 variables:
# $ x: num  0 1 2 3 4 5 6 7 8 9 ...
# $ y: num  50 51 52 53 54 55 56 57 58 59 ...
# $ z: num  25 26 27 28 29 30 31 32 33 34 ...

Otherwise, when starting from NULL, it will create a list, which can then be combined into a matrix with do.call:

df_final <- NULL
for (i in colnames(df)){
  df_final[[i]] <- df[,i]
}
colnames(df_final)
#NULL
str(df_final)
#List of 3
# $ x: num [1:51] 0 1 2 3 4 5 6 7 8 9 ...
# $ y: num [1:51] 50 51 52 53 54 55 56 57 58 59 ...
# $ z: num [1:51] 25 26 27 28 29 30 31 32 33 34 ...

df_final  <- do.call("cbind", df_final)
colnames(df_final)
#[1] "x" "y" "z"
str(df_final)
# num [1:51, 1:3] 0 1 2 3 4 5 6 7 8 9 ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:3] "x" "y" "z"

When the loop is done with `sapply` instead of `for`, a solution would be:

df_final <- sapply(colnames(df), function(i) {df[,i]})
colnames(df_final)
#[1] "x" "y" "z"
str(df_final)
# num [1:51, 1:3] 0 1 2 3 4 5 6 7 8 9 ...
# - attr(*, "dimnames")=List of 2
#  ..$ : NULL
#  ..$ : chr [1:3] "x" "y" "z"
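
Note that `sapply` simplifies its result to a matrix here, which is fine for all-numeric columns. A small sketch of what can happen with mixed column types, and of a `lapply`-based variant that keeps a data frame, is shown below. The data frame `df_mixed` is hypothetical and not part of the question.

```r
# Hypothetical data frame with mixed column types
df_mixed <- data.frame(x = 1:3, label = c("a", "b", "c"),
                       stringsAsFactors = FALSE)

# sapply() simplifies to a matrix, so both columns become character
as_matrix <- sapply(colnames(df_mixed), function(i) df_mixed[, i])
typeof(as_matrix)

# lapply() over a named vector returns a named list, one element per column;
# as.data.frame() then keeps each column's own type
as_list <- lapply(setNames(nm = colnames(df_mixed)), function(i) df_mixed[[i]])
df_kept <- as.data.frame(as_list, stringsAsFactors = FALSE)
str(df_kept)
```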

Or simply subset the data frame, with no loop at all:

df_final <- df[colnames(df)]
colnames(df_final)
#[1] "x" "y" "z"
str(df_final)
#'data.frame':   51 obs. of  3 variables:
# $ x: num  0 1 2 3 4 5 6 7 8 9 ...
# $ y: num  50 51 52 53 54 55 56 57 58 59 ...
# $ z: num  25 26 27 28 29 30 31 32 33 34 ...
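
If the goal is to give the copied columns different names, the same loop-free approach can be combined with `setNames`. This is only a sketch: the `"_copy"` suffix and the small `df` are made up for the example.

```r
# Small stand-in for the data frame in the question
df <- data.frame(x = 0:2, y = 50:52, z = 25:27)

# Hypothetical new names derived from the original ones
new_names <- paste0(colnames(df), "_copy")

# Subset the columns and rename them in one step; df itself is unchanged
df_final <- setNames(df[colnames(df)], new_names)
colnames(df_final)
```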
GKi