Create a new column based on an index column

Question

I have a dataset containing n observation and a column containing observation indices, e.g.

col1 col2 col3 ID
12    0    4    1
6     5    3    1
5     21   42   2

and want to create a new column based on my index like

col1 col2 col3 ID col_new
12    0    4    1   12
6     5    3    1   6
5     21   42   2   21

without for loops. Actually I'm doing

col_new <- rep(NA, length(ID))
for (i in 1:length(ID))
{
   col_new[i] <- df[i, ID[i]]
}

Is there a better or (tidyverse) way?

markdly · Answer 1 · 2017-10-23T09:31:36.840

5

For a possible tidyverse approach, how about using dplyr::mutate combined with purrr::map2_int.

library(dplyr)
library(purrr)

mutate(df, new_col = map2_int(row_number(), ID, ~ df[.x, .y]))
#>   col1 col2 col3 ID new_col
#> 1   12    0    4  1      12
#> 2    6    5    3  1       6
#> 3    5   21   42  2      21

Data

df <- read.table(text = "col1 col2 col3 ID
12    0    4    1
6     5    3    1
5     21   42   2", header = TRUE)

edited Oct 23 '17 at 09:31

answered Oct 23 '17 at 09:06

markdly

4,394
2
19
27

Really nice. I have to admit that I tried hard with dplyr but wasn't aware of map2_int. – Yann Oct 23 '17 at 09:23
2

There is `row_number()` which you can use instead of `1:n()` – akrun Oct 23 '17 at 09:28
2

Thanks @akrun, good suggestion. I've edited to include `row_number()` - I think that it makes the code more readable. – markdly Oct 23 '17 at 09:33

score 4 · Accepted Answer · answered Oct 23 '17 at 08:59

4

We can use row/column indexing from base R which should be very fast

df1$col_new <- df1[1:3][cbind(seq_len(nrow(df1)), df1$ID)]
df1$col_new
#[1] 12  6 21

answered Oct 23 '17 at 08:59

akrun

874,273
37
540
662

Although I really like dplyr and purrr, I will accept this answer because the intent of the code is much clearer. – Yann Oct 23 '17 at 09:22

pogibas · Answer 3 · 2017-10-23T08:46:08.077

2

Solution using data.table:

library(data.table)
# Using OPs data
setDT(df)
df[, col_new := get(paste0("col", ID)), 1:nrow(df)]

# df
   col1 col2 col3 ID col_new
1:   12    0    4  1      12
2:    6    5    3  1       6
3:    5   21   42  2      21

Explanation:

For each row: 1:nrow(df)
Get corresponding column using ID: get(paste0("col", ID))
Write this value into a new column: col_new :=

edited Oct 23 '17 at 08:46

answered Oct 23 '17 at 08:29

pogibas

27,303
19
84
117

score 1 · Answer 4 · answered Nov 05 '17 at 01:07

Another tidyverse approach, this time that uses only tidyr and dplyr:

df %>%
    gather(column, col_new, -ID)  %>%  
    filter(paste0('col', ID) == column) %>%
    select(col_new) %>%
    cbind(df, .)

It's longer than @markdly's elegant one-liner but if you're like me and get confused by purrr most of the time, this might read easier.

Create a new column based on an index column

4 Answers4

Linked