
I am working with a column of employment data. I want to end with the following values:

  • Unemployed
  • Retired
  • Self-employed
  • Disabled
  • Employed

I have cleaned up all the variants of every value except "Employed". I am trying to write a statement that does something along the lines of:

If not in this list "Unemployed | Retired | Self-Employed | Disabled" change value to "Employed".

I have been trying to use a `%notin%` operator together with the replace() function, but I am missing something. Any help pointing me in the right direction would be greatly appreciated.

UPDATE/EDIT:

I got the code to work based on the suggestion from @Rui Barradas, but while cleaning up and annotating it I broke something, and I can't figure out what I am doing wrong. The code below does not throw an error, but the values are not changed to 'Employed' when I verify with table(df7$patient_employment):

`%notin%` <- Negate(`%in%`)
x <- c(df7$patient_employment, "Unemployed", "Retired", "Self-Employed", "Disabled")
x[x %notin% df7$patient_employment] <- "Employed"

RESOLVED:

After some additional help, it was pointed out that I was using x from the example when I should have been using the names from my own data. I have been working on this for too long. Time to stretch my legs. Thank you @Rui Barradas
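A minimal sketch of why the attempt in the update left the column untouched. The small `df7` here is hypothetical toy data, not the real data set, and `status` is just a helper name for the four labels. The assignment only modifies the copy `x`, and the test `x %notin% df7$patient_employment` flags the status labels that are absent from the column rather than the column values that are absent from the status list.

```r
`%notin%` <- Negate(`%in%`)

# Hypothetical toy data standing in for the real df7
df7 <- data.frame(
  patient_employment = c("Retired", "Full time", "Disabled", "Part time"),
  stringsAsFactors = FALSE
)
status <- c("Unemployed", "Retired", "Self-Employed", "Disabled")

# The attempt from the update: x is a new vector (column values plus the labels)
x <- c(df7$patient_employment, status)
x[x %notin% df7$patient_employment] <- "Employed"

# Only x was modified; the data frame column still holds its original values
x
table(df7$patient_employment)
```

Indexing and assigning on `df7$patient_employment` itself, with the status list on the right-hand side of `%notin%`, avoids both problems.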

  • `dat$status[! dat$status %in% c("Unemployed", "Retired", "Self-Employed", "Disabled")] <- "Employed"`. (Replace `dat` with your data.frame name, and `status` with the column name.) (If this doesn't work, please [edit] your question and provide the output of `dput(head(dat))`. Thanks!) – r2evans Jan 30 '21 at 20:54
  • @r2evans While I could not get this to work, I am confident that is due to my inexperience, and I appreciate you taking the time to respond. I was able to get it to work with a different solution. Thank you – Retep Yarrum Jan 30 '21 at 22:48

1 Answer

See if the following answers the question.

`%notin%` <- Negate(`%in%`)

set.seed(2020)
status <- c("Unemployed", "Retired", "Self-Employed", "Disabled")
x <- sample(c(status, "Employed", "ABC"), 20, TRUE)

i <- x %notin% status
x[i]
#[1] "ABC"      "ABC"      "Employed" "ABC"      "Employed"
#[6] "Employed"

x[i] <- "Employed"
x[i]
#[1] "Employed" "Employed" "Employed" "Employed" "Employed"
#[6] "Employed"

The code above is simple enough that the logical index vector i is not strictly needed. It was created only to make the code more readable; the following single line is equivalent.

x[x %notin% status] <- "Employed"

Following the OP's comment: use df7$patient_employment instead of x and it should work.

df7$patient_employment[df7$patient_employment %notin% c("Unemployed", "Retired", "Self-Employed", "Disabled")] <- "Employed"
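As a hedged, self-contained sketch of the same replacement, the toy `df7` below is made-up data and `to_fix` is a hypothetical helper name. It also assumes that missing values should stay missing: because `NA %in% status` is `FALSE`, a plain `%notin%` test would relabel `NA` as "Employed", so the sketch excludes `NA` explicitly.

```r
`%notin%` <- Negate(`%in%`)

# Hypothetical toy data; the real df7 will have more rows and values
df7 <- data.frame(
  patient_employment = c("Retired", "Full time", "Disabled", "Part time", NA, "Unemployed"),
  stringsAsFactors = FALSE
)
status <- c("Unemployed", "Retired", "Self-Employed", "Disabled")

# Rows to relabel: not missing and not one of the four known statuses
to_fix <- !is.na(df7$patient_employment) & df7$patient_employment %notin% status
df7$patient_employment[to_fix] <- "Employed"

# Check the result, including any remaining missing values
table(df7$patient_employment, useNA = "ifany")
```

If the column is a factor rather than character, assigning "Employed" produces `NA` with a warning unless that level already exists, so converting with `as.character()` first may be needed.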
Rui Barradas
  • Can you explain your comment about not needing the logical index vector? Or perhaps edit your answer to include only what is needed? I took it to mean that this should work: `%notin%` <- Negate(`%in%`); x <- c(df7$patient_employment, "Unemployed", "Retired", "Self-Employed", "Disabled"); x[x %notin% df7$patient_employment] <- "Employed" – Retep Yarrum Jan 30 '21 at 22:09
  • @RetepYarrum I first present a version in 2 code lines to make it more understandable. See the edit. – Rui Barradas Jan 30 '21 at 22:23
  • I got it to work once and then when I was noting and cleaning up my code I messed something up and haven't been able to get it to work again... I will update my original question with more info – Retep Yarrum Jan 30 '21 at 22:28