
I'm trying to do a simple conditional with mutate.

The code should create a new variable called "gender" based on two variables from the same data frame.

sample <- data.frame(
  client = c("john", "peter", "hanna", "lisa"),
  id = c(100, 400, 650, 700),
  resident = c('YES', 'YES', 'YES', 'NO'))

male_index <- as.vector(000:499)
female_index <- as.vector(500:999)

sample <- sample %>%
  mutate(gender = ifelse(resident == "YES" & id %in% male_index, "Male",
  mutate(gender = ifelse(resident == "YES" & id %in% female_index, "Female", "Female"))))

I'm getting the following error, which I don't understand. I guess it has something to do with SE. But I'm still not that familiar with R.

Error in mutate_impl(.data, dots) :
argument ".data" is missing, with no default

I don't get any issues if I run the code with a single mutate statement.

Prometheus
  • Please don't confuse R and `dplyr`. `dplyr` is a data manipulation package (add on) that is available for the R statistical computing environment. The error that you are getting is an error with `dplyr`, not with R. – lmo Apr 05 '17 at 12:25
  • The first `mutate` has `sample` as its implicit first argument (See `help("%>%")`), because it follows the pipe `%>%`. The second `mutate` does not immediately follow the pipe, so it lacks its first argument. Try `mutate(., gender = ....`. It has nothing to do with SE. It will suppress the error, but I'm not sure it will make more sense though – Aurèle Apr 05 '17 at 12:32

1 Answer

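You don't need the second `mutate` call inside your `ifelse`. The inner `mutate` does not follow the pipe, so it never receives a data frame as `.data`, which is what the error message reports. Nest a second `ifelse` instead: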

library(dplyr)  # provides mutate() and the %>% pipe

sample <- data.frame(
  client = c("john", "peter", "hanna", "lisa"),
  id = c(100, 400,  650, 700),
  resident = c('YES', 'YES', 'YES', 'NO')
)

male_index <- as.vector(000:499)
female_index <- as.vector(500:999)

sample <- sample %>%
  mutate(gender = ifelse(
    resident == "YES" & id %in% male_index,
    "Male",
    ifelse(resident == "YES" &
             id %in% female_index, "Female", "Non-resident")
  ))

Now each individual in the dataset has an assigned value for gender.

sample
#  client  id resident       gender
#1   john 100      YES         Male
#2  peter 400      YES         Male
#3  hanna 650      YES       Female
#4   lisa 700       NO Non-resident
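
If the nested `ifelse` calls get hard to read, the same three-way split can probably be written with `dplyr::case_when`, which checks the conditions in order and uses the first one that matches. This is only a sketch of an alternative, assuming a dplyr version that ships `case_when`; the `result` name is made up for illustration, and the final `TRUE` branch is the catch-all that plays the role of "Non-resident" above.

```r
library(dplyr)

# Same example data as in the question
sample <- data.frame(
  client = c("john", "peter", "hanna", "lisa"),
  id = c(100, 400, 650, 700),
  resident = c("YES", "YES", "YES", "NO")
)

male_index <- 0:499
female_index <- 500:999

# `result` is a hypothetical name, used so the input is not overwritten
result <- sample %>%
  mutate(gender = case_when(
    resident == "YES" & id %in% male_index ~ "Male",
    resident == "YES" & id %in% female_index ~ "Female",
    TRUE ~ "Non-resident"  # everything that matched neither condition
  ))

result
```

Each condition sits on its own line, so adding a fourth category later means adding one more line rather than another level of nesting.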
Andrew Brēza
  • The problem with this approach is that you flag the 4th example (lisa), who is a non-resident, with a gender value. The purpose of the second mutate statement is to flag three values: "male", "female", "non-resident". – Prometheus Apr 05 '17 at 12:25
  • I found a second issue with the code that I just edited in my answer. Your second `ifelse` condition was "Female" but that was the same as your first condition. You're basically asking "Are you male? Then say 'Male'. Otherwise are you female? Then say 'Female'. Match neither of those two? Then go with 'Female'." I changed the third option to "Non-resident" but you can make it whatever you want. – Andrew Brēza Apr 05 '17 at 12:29
  • Always happy to help – Andrew Brēza Apr 05 '17 at 12:36