
I want to convert some variable types in R from factors to binary asymmetric variable types.

I successfully converted some of my data from factors to ordered factors using `mydata[,200] <- as.ordered(mydata[,200])`, but when I tried something similar to convert them to asymmetric binary variables, I was unsuccessful, and I have had trouble finding any information about how to do this online or in the book I have. I need them to be specified as asymmetric because I'm going to use the `daisy` function to look at dissimilarities. If anyone could tell me how to convert from factors to asymmetric binary variables, I would be incredibly grateful.

Edit: To answer the question about asymmetric vs. symmetric variables: in a symmetric binary variable, both outcomes carry the same weight (is that person male or female?), while in an asymmetric binary variable one outcome is more important than the other. It doesn't matter if people share the absence of a characteristic; it only matters if they share its presence. So, for example, people who are color-blind have something in common, but people who are not color-blind do not.

So what I'm looking to do is set it up so that essentially 0 = unimportant and 1 = important. From what I've read (Kaufman & Rousseeuw 1990), it is important to declare these variables as asymmetric when computing dissimilarities.

structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, NA, 1L, NA, 
1L, 1L, 1L, NA, NA, 1L, 1L, 1L, NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor")

Further edits: I don't believe I need a dummy variable, as it is already technically a binary (everything is either 1, 0, or NA) - I just don't know how to make R change the variable into an asymmetric binary variable rather than a factor.

extragum01
  • Can you post a data example? And the criterion for the transformation? To post data edit the **question** with the output of `dput(head(mydata[, cols you want], 20))`. – Rui Barradas Apr 26 '18 at 16:20
    I agree with the accepted answer to the Meta [Downvoting of new user questions](https://meta.stackexchange.com/questions/3515/downvoting-of-new-user-questions/3524) – Rui Barradas Apr 26 '18 at 16:26
  • What is the difference between a binary variable and a binary asymmetric variable? The only thing I have found online seems to be one of intent. The mechanics of converting to a binary variable shouldn't depend on the semantics of what the resulting variable means. In any event, `ifelse(condition,0,1)` with the appropriate choice of `condition` is a natural way to proceed. – John Coleman Apr 26 '18 at 17:29

2 Answers


It would help to add an example of the input and the desired output. I believe you are looking for something called dummy variables.

    col1
row1  a   
row2  b
row3  a 

transformed into

      a b
row1  1 0
row2  0 1
row3  1 0

If that is what you mean by converting factor variables to binary asymmetric variables, please check out the `dummies` package, which does that in R.
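For reference, the same transformation can be sketched in base R with `model.matrix()` instead of a package. This is only an illustration on the toy data above; the names `df` and `ind` are made up here.

```r
# Toy data matching the example above; `df` is a hypothetical name.
df <- data.frame(col1 = factor(c("a", "b", "a")),
                 row.names = c("row1", "row2", "row3"))

# "- 1" drops the intercept so that every level gets its own 0/1 column.
ind <- model.matrix(~ col1 - 1, data = df)

# model.matrix() prefixes the variable name ("col1a", "col1b");
# keep just the level names to match the table above.
colnames(ind) <- levels(df$col1)
ind
```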

penguin

One way is to first use `as.vector()` to convert your factor into a character vector, replacing the integer codes 1 and 2 with their labels "0" and "1", and then use `as.numeric()` to convert the result to the numbers 0 and 1:

v <- structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, NA, 1L, NA, 
                 1L, 1L, 1L, NA, NA, 1L, 1L, 1L, NA, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
                 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1"), class = "factor")

v <- as.numeric(as.vector(v))

Equivalently, though perhaps more obscurely, you could replace the last line with

v <- as.numeric(v) - 1

Applied to a factor, `as.numeric()` returns the underlying integer codes (here 1 and 2), and subtracting 1 brings them down to 0 and 1. This shortcut relies on the levels being "0" and "1" in that order. In either case, you get a binary numeric vector:

> v
 [1]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0
[27] NA  0 NA  0  0  0 NA NA  0  0  0 NA  0  0  0  0  0  0  0  0  0  0  0  0
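As far as I know, R has no separate class for asymmetric binary variables; in `daisy()` from the cluster package the distinction is declared through the `type` argument instead. A minimal sketch, assuming numeric 0/1 columns like the `v` produced above (the data frame `d` and its columns `a`, `b`, `c` are made up for illustration):

```r
library(cluster)  # provides daisy()

# Hypothetical 0/1 data: three observations, three binary traits.
d <- data.frame(a = c(1, 1, 0),
                b = c(1, 0, 1),
                c = c(0, 0, 1))

# Declare every column asymmetric binary: a shared 0 is not counted as agreement.
d_asym <- daisy(d, metric = "gower", type = list(asymm = c("a", "b", "c")))

# Same data treated as symmetric binary, for comparison.
d_symm <- daisy(d, metric = "gower", type = list(symm = c("a", "b", "c")))

as.matrix(d_asym)
as.matrix(d_symm)
```

Rows 1 and 2 share a 0 in column `c`. Under `asymm` that column is skipped for the pair, so they differ on one of two informative columns (0.5); under `symm` they differ on one of three.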
John Coleman