0

I am trying to recode variables in an R dataframe. For example, variable X from my dataset contains 1s and 0s. I want to create another variable, Y, which recodes the 1s and 0s from X into Yes and No respectively.

I tried this to create the recoded Y variable:

w <- as.character()

for (i in seq_along(x))  {
    if (x[i] == 1)  {
        recode <- "Yes"
    } else if (x[i] == 0)  {
        recode <- "No"       
    }
    w <- cbind(w, recode)
}

Then I did this to line-up X and Y together:

y <- c(x, w)

What I got back was this:

 y
 # [1] "1"   "1"   "0"   "1"   "0"   "0"   "1"   "1"   "0"   "1"   "0"   "0"   "Yes" "Yes" "No"  "Yes" "No"  "No" 

I was expecting a dataframe with X & Y columns.

Question:

  1. How do I get X and Y into a dataframe?
  2. Is there a better way for recoding variables in a dataframe?
etienne
kyg

3 Answers

3

Recoding is generally about applying new labels to the levels of a factor (categorical variable).

In R, you do that like this:

w <- factor(x, levels = c(1,0), labels = c('yes', 'no'))
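To connect this to the first part of the question, here is a minimal sketch that stores the recoded factor as a column of a data frame. The sample values in `x` and the column name `y` are assumptions made for illustration, and the labels are capitalised to match the question.

```r
# Hypothetical sample data standing in for the asker's X variable
x <- c(1, 1, 0, 1, 0, 0)

# Start a data frame from x, then add the recoded factor as column y
df <- data.frame(x = x)
df$y <- factor(df$x, levels = c(1, 0), labels = c("Yes", "No"))

# Each level listed in `levels` is paired with the label in the same position
levels(df$y)
df
```

Because `y` is a factor, functions such as `table(df$y)` report both categories even when one of them has no observations.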
arvi1000
  • Hmm, I wonder if you saw the comment from a half hour ago and just copy/pasted... – David Arenburg Dec 07 '15 at 12:44
  • 1
    @David Maybe so, but then the commenter rescinded their right to post it as a proper answer in that half hour themselves. – Konrad Rudolph Dec 07 '15 at 12:51
  • @KonradRudolph it's still not appropriate without any attribution. Moreover, [this was already discussed many times on Meta](http://meta.stackoverflow.com/a/251598/3001626). – David Arenburg Dec 07 '15 at 12:53
  • @David Sure but this answer adds more — namely, an explanation. If it had just copied the comment, I’d agree. – Konrad Rudolph Dec 07 '15 at 12:55
  • @KonradRudolph from my experience you won't agree on *anything* I say, but this is a simple copy/paste with some obvious remarks. It's like copy/pasting homework from your classmate and rewording it a bit so the teacher won't notice. Either way, the commenter should have been mentioned here, just as an act of appropriate behavior. – David Arenburg Dec 07 '15 at 12:59
  • @David I think you misremember. – Konrad Rudolph Dec 07 '15 at 13:01
  • Did not see the comment, just posted the answer (which is a very obvious one). – arvi1000 Dec 07 '15 at 13:16
1

Using the following data:

x  <- c(rep.int(0, 10), rep.int(1, 10))
df <- as.data.frame(x)
df
#    x
# 1  0
# 2  0
# 3  0
# ...

I'd create a new variable and recode in one step:

df$y[df$x == 1] <- "yes"
df$y[df$x == 0] <- "no"
df
#    x   y
# 1  0  no
# 2  0  no
# 3  0  no
# ...
# 11 1 yes
# 12 1 yes
# 13 1 yes
# ...
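A small variation on the indexing approach, as a sketch: creating the column first as missing values makes it explicit what happens to any value that is neither 0 nor 1. The sample data, including the stray value 2, is an assumption for illustration.

```r
# Hypothetical data containing one unexpected value (2)
df <- data.frame(x = c(0, 1, 2, 1))

# Create the new column up front, filled with missing character values
df$y <- NA_character_

# Overwrite only the rows that match each condition
df$y[df$x == 1] <- "yes"
df$y[df$x == 0] <- "no"

# Rows that matched neither condition remain NA
df
```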

Note that for loops are rarely the best tool for this in R, but your loop is basically correct. You need to replace w <- cbind(w, recode) with w <- rbind(w, recode) in the loop itself, so that each recoded value is added as a new row, and, in the final step, you can cbind x and w:

w <- character()
for (i in seq_along(x))  {
  if (x[i] == 1)  {
    recode <- "Yes"
  } else if (x[i] == 0)  {
    recode <- "No"
  }
  w <- rbind(w, recode)  # append each recoded value as a new row
}
y <- cbind(x, w)  # place x and w side by side as columns
y

rbind() appends rows, cbind() appends columns, and c() concatenates its arguments into one vector, which is why you were getting x and the recoded values joined end to end.
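The difference between these functions can be seen in a short sketch. The three-element vectors and the result names (`joined`, `side_by_side`, `df`) are assumptions for illustration. Note that cbind() on a numeric and a character vector yields a character matrix, so data.frame() is the call that gives the data frame the question asks for.

```r
# Hypothetical inputs: the original values and their recoded labels
x <- c(1, 0, 1)
w <- c("Yes", "No", "Yes")

# c() concatenates into one vector; the numbers are coerced to character
joined <- c(x, w)
length(joined)

# cbind() makes a two-column matrix, still all character
side_by_side <- cbind(x, w)
dim(side_by_side)

# data.frame() keeps each column's own type: x stays numeric
df <- data.frame(x = x, y = w, stringsAsFactors = FALSE)
str(df)
```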

Phil
1

This is one of the many cases where you really shouldn’t use a loop in R.

Instead, use vectorisation, i.e. ifelse or indexing.

result = data.frame(x = x, y = ifelse(x == 1, 'yes', 'no'))

(This assumes that there are only 1s and 0s in the input; if that isn’t the case, you need a nested ifelse or a list containing the translations).
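For inputs with more than two codes, one possible reading of "a list containing the translations" is a named vector used as a lookup table. The sketch below assumes a third code, 2, mapped to 'maybe'; both the extra code and its label are invented for illustration.

```r
# Hypothetical input with three distinct codes
x <- c(1, 0, 2, 1)

# Option 1: nested ifelse, one test per code
y_nested <- ifelse(x == 1, 'yes', ifelse(x == 0, 'no', 'maybe'))

# Option 2: a named vector as a lookup table, indexed by the code as text
translations <- c('0' = 'no', '1' = 'yes', '2' = 'maybe')
y_lookup <- unname(translations[as.character(x)])

result <- data.frame(x = x, y = y_lookup)
result
```

The lookup version tends to stay readable as the number of codes grows, and a code missing from `translations` produces NA rather than falling through to a default label.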

Konrad Rudolph
  • Thanks Konrad. Your suggestion works well. But I now have a slightly different example ... I create the following dataframe ... x <- c("yes", "yes", "no", "yes", "no") ... y <- c("yes", "no", "no", "yes", "yes") ... df <- cbind(x, y) ... and I do this ... dfNew <- data.frame(x = x, y = y, recode = ifelse((x == "yes") && (y == "yes"), 1, 0)) ... the double condition on x & y doesn't work. All values in the recoded variable come back as 1. Please advise. Thanks – kyg Jan 06 '16 at 08:19
  • @KYG Replace `&&` with `&`. – Konrad Rudolph Jan 06 '16 at 08:49
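A sketch of the corrected call, using the vectors from the comment above. `&` compares element by element, whereas `&&` is meant for a single TRUE/FALSE. Older versions of R silently used only the first element of each side, which would explain every row coming back as 1; recent versions raise an error instead.

```r
# Vectors taken from the comment above
x <- c("yes", "yes", "no", "yes", "no")
y <- c("yes", "no", "no", "yes", "yes")

# `&` combines the two comparisons element by element
both_yes <- (x == "yes") & (y == "yes")

# ifelse() then maps each TRUE/FALSE to 1 or 0
dfNew <- data.frame(x = x, y = y, recode = ifelse(both_yes, 1, 0))
dfNew
```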