In R, how can I test if two factors are equivalent?

Question

I am generating a big list of factors with different levels, and I want to be able to detect when two of them define the same partition. For example, I want to detect all of the following as equivalent to each other:

x1 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"))
x2 <- factor(c("c", "c", "b", "b", "a", "a", "c", "c"))
x3 <- factor(c("x", "x", "y", "y", "z", "z", "x", "x"))
x4 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"), levels=c("b", "c", "a"))

What is the best way to do this?

score 5 · Accepted Answer · answered Sep 28 '12 at 23:23

I guess you want to establish that a two-way tabulation has the same number of populated levels as a one-way classification. By default, `interaction` keeps every combination of levels, even those that never occur, but setting `drop=TRUE` keeps only the populated ones, which suits your purpose:

> levels(interaction(x1, x2, drop=TRUE))
[1] "c.a" "b.b" "a.c"
> length(levels(x1)) == length(levels(interaction(x1, x2, drop=TRUE)))
[1] TRUE

The generalization would look at all( <the 3 necessary logical comparisons> ):

 all(length(levels(x1)) == length(levels(interaction(x1, x2, drop=TRUE))),
     length(levels(x1)) == length(levels(interaction(x1, x3, drop=TRUE))),
     length(levels(x1)) == length(levels(interaction(x1, x4, drop=TRUE))))
#[1] TRUE
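
One caveat worth noting: the comparison above only measures the interaction against the level count of `x1`. If the second factor is a coarser grouping (it merges two of `x1`'s groups), the interaction still has exactly as many populated levels as `x1`, so the one-sided check would return TRUE. A minimal sketch of a symmetric check follows. The helper name `same_partition` and the vector `coarse` are hypothetical, introduced here for illustration only, and the sketch assumes neither factor contains `NA`.

```r
# Hypothetical helper (not part of the original answer): TRUE when f and g
# split the observations into exactly the same groups, whatever the labels.
same_partition <- function(f, g) {
  stopifnot(length(f) == length(g))
  # Drop unused levels so they do not inflate the level counts.
  f <- droplevels(as.factor(f))
  g <- droplevels(as.factor(g))
  # Number of (f, g) label pairs that actually occur in the data.
  n_joint <- nlevels(interaction(f, g, drop = TRUE))
  # Same partition only if the joint count matches BOTH level counts.
  nlevels(f) == n_joint && nlevels(g) == n_joint
}

x1 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"))
x2 <- factor(c("c", "c", "b", "b", "a", "a", "c", "c"))
# Illustrative coarser grouping: merges x1's "a" and "b" groups.
coarse <- factor(c("p", "p", "p", "p", "q", "q", "p", "p"))

same_partition(x1, x2)      # TRUE
same_partition(x1, coarse)  # FALSE
```

For `coarse`, the interaction has three populated levels, which equals `nlevels(x1)` but not `nlevels(coarse)`, so only the two-sided comparison rejects it.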

IRTFM

I find it useful to visualize this method with `table(x1, x2)`. You can see that each column (and row) has only a single non-zero entry. – bdemarest Sep 29 '12 at 00:28
To use `table(x1,x2)` in a programmatic fashion you would need something like `sum(table(x1,x2) != 0 )`. – IRTFM Sep 29 '12 at 00:51
`interaction` can be slow for large vectors, which can be sped up by using `paste` instead. – Empiromancer Mar 02 '17 at 22:28
I'm always willing to learn new things, but I do so better by seeing well constructed demonstrations. – IRTFM Mar 03 '17 at 01:15
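
The two alternatives raised in the comments can be sketched as follows. This is only an illustration of the idea, not a benchmark: whether `paste` is actually faster than `interaction` is the commenter's claim and is not measured here. The tab separator is an assumption; it must not occur inside any level label, or distinct pairs could collide.

```r
x1 <- factor(c("a", "a", "b", "b", "c", "c", "a", "a"))
x2 <- factor(c("c", "c", "b", "b", "a", "a", "c", "c"))

# table() route: count the populated cells of the cross-tabulation.
tab <- table(x1, x2)
n_cells_table <- sum(tab != 0)

# paste() route: count the distinct (x1, x2) label pairs.
# Assumption: "\t" never appears inside a level label.
n_cells_paste <- length(unique(paste(x1, x2, sep = "\t")))

# Both counts play the role of length(levels(interaction(x1, x2, drop=TRUE))).
n_cells_table == nlevels(x1) && n_cells_table == nlevels(x2)
n_cells_paste == n_cells_table
```

Either count can be substituted for the `interaction` level count in the comparison from the answer.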
