-1

Using this df:

  DF = data.frame(m=rep(1:2,2), y=rep(1998:1999,each=2), A=c(2:5), B=c(4,NA,6,7))

> DF
  m    y A  B
1 1 1998 2  4
2 2 1998 3 NA
3 1 1999 4  6
4 2 1999 5  7

How could I replace a single cell using as coordinates this values:

m = 2 ; y = 1999 ; col = 'A' ; val = 72

Following those values I want to replace 5 with 72.


Edit. As testing all the answers I realized my question is very basic and don't represent my problem. I tried to do it without for loops but failed and eventually used it.

So, I want to replace values within the DF data frame but using this other data frame:

repl = data.frame(m=c(2,1), y=c(1999,1998), col=c('A','B'), val=c(72,100))
> repl
  m    y col val
1 2 1999   A  72
2 1 1998   B 100

This means that each row of the repl data frame is a value to replace in DF.

I've been trying to use Psidom answer mutate(A = replace(A, m == 2 & y == 1999, 72) for each row but wonder if can be done without loops or without using column names.

Thank you.

noriega
  • 447
  • 1
  • 9
  • 23

2 Answers2

2

The dplyr way is mutate + if_else:

DF %>% mutate(A = if_else(m == 2 & y == 1999, 72L, A))

#  m    y  A  B
#1 1 1998  2  4
#2 2 1998  3 NA
#3 1 1999  4  6
#4 2 1999 72  7

Or mutate + replace:

DF %>% mutate(A = replace(A, m == 2 & y == 1999, 72))

#  m    y  A  B
#1 1 1998  2  4
#2 2 1998  3 NA
#3 1 1999  4  6
#4 2 1999 72  7

which depending on the condition, returns a new vector with intended values replaced.


Update if you need to do many updates at the same time, you can:

1) reshape DF so the columns to be updated get gathered in a single column;

2) join on the two condition columns m and y plus the column headers column;

3) update the values;

4) reshape the data frame back;

So together with tidyr, you can do:

library(dplyr); library(tidyr)

DF %>% 
    gather(col, vals, -m, -y) %>% 
    left_join(repl, by = c("m", "y", "col")) %>% 
    mutate(vals = coalesce(val, vals)) %>% 
    select(-val) %>% 
    spread(col, vals)

#  m    y  A   B
#1 1 1998  2 100
#2 1 1999  4   6
#3 2 1998  3  NA
#4 2 1999 72   7
Psidom
  • 209,562
  • 33
  • 339
  • 356
  • Great. How would it be with indexes rather than column names? Should I do `DF %>% mutate(.[3] = replace(.[3], m == 2 & y == 1999, 72))` ? – noriega Aug 15 '17 at 23:11
  • You can use `mutate_at` and pass the column positions to `.var` variable. `DF %>% mutate_at(.vars = 3, funs(replace(., m == 2 & y == 1999, 72)))`. – Psidom Aug 15 '17 at 23:13
  • Nice. What if there are more than one column to work with? I mean: replace m=2, y=1999, col='A' with 72 **and** m=1, y=1998, col='B' with 100 – noriega Aug 15 '17 at 23:22
  • In that case, you will probably have to write two replace statements as `mutate(A = replace(A, m == 2 & y == 1999, 72), B = replace(B, m == 1 & y == 1998, 100))`. – Psidom Aug 15 '17 at 23:26
  • Ok. I edited the question as I noticed there are many replacements I have to do and tried to put the question in a more general way. – noriega Aug 16 '17 at 00:04
1

A one-line method in base-R for your follow up question would be

lapply(1:nrow(repl), function(i) 
       DF[DF$m==repl$m[i] & DF$y==repl$y[i], repl$col[i]] <<- repl$val[i])

DF
  m    y  A   B
1 1 1998  2 100
2 2 1998  3  NA
3 1 1999  4   6
4 2 1999 72   7

This goes through each row of repl and makes the changes to DF. The <<- forces it to make the changes to the original DF in the global environment.

Andrew Gustar
  • 17,295
  • 1
  • 22
  • 32