I have a dataset with a categorical variable hospital_code which has 10 levels.

My program loops over the data and, on each pass, takes a subset in which the variable compLbl contains exactly 2 of the 10 hospital codes, so that the two can be compared to each other. In each pass I now need compLbl to be binary coded (1s and 0s).

If I just take the subset from the first pass, in which the possible values for compLbl are AMH and BHC, I can easily do this as follows:

nData$compLbl2 = with(nData,(ifelse(compLbl == "AMH", 1,0)))

And get data that looks like this:

head(nData)
  compLbl outLbl Race_Code Age Complexity_Subclass_Code compLbl2
1     AMH      0         W  63                        1        1
2     AMH      0         W  44                        2        1
3     AMH      0         W  88                        3        1
4     BHC      0         W  64                        1        0
5     BHC      0         W  61                        2        0
6     BHC      0         W  61                        1        0

How can I generalize this so that no matter what two values are in compLbl it will binary code them? My thought was to possibly do this by referencing factor level 1 for whatever two values are present in the factor variable compLbl. Like this:

nData$compLbl2 = with(nData,(ifelse(FACTORLEVEL(compLbl) == 1, 1,0)))

Where in my above example FACTORLEVEL(compLbl) would return a 1 for AMH and a 2 for BHC since those are the factor levels that R would automatically assign. However, I'm not sure how to do this, or if it is possible.

tonytonov
Skye
1 Answer

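I would use this command:

nData <- within(nData, compLbl2 <- 2 - as.numeric(compLbl[drop = TRUE]))

Here compLbl[drop = TRUE] drops the unused factor levels, so the two hospital codes that remain get the integer codes 1 and 2. Subtracting from 2 then maps the first level to 1 and the second to 0. Note that within() needs an assignment with <- inside the expression, and that rev() would reverse the order of the rows rather than swap the codes.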
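
To check that coding by factor level works for any pair of hospitals, here is a minimal sketch on a made-up subset. The toy data frame and the intermediate name codes are assumptions for illustration only. It assumes compLbl is a factor that still carries levels that do not occur in the subset.

```r
# Hypothetical toy subset: the factor still carries three levels,
# but only two of them ("BHC" and "BJH") occur in the data.
nData <- data.frame(
  compLbl = factor(c("BJH", "BHC", "BHC", "BJH"),
                   levels = c("AMH", "BHC", "BJH"))
)

# Drop the unused levels so the two remaining codes are numbered 1 and 2.
codes <- as.integer(droplevels(nData$compLbl))

# First remaining level -> 1, second remaining level -> 0.
nData$compLbl2 <- as.numeric(codes == 1)

print(nData)
```

Because the levels are dropped before the comparison, the same lines should work on every pass of the loop without naming a hospital code explicitly.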
Sven Hohenstein