
I am qualitatively coding a dataset by theme. Each observation is allowed two themes, so I have two columns that draw on the same list of values. When I run arules, it sees "V1=ALPHA; V2=BETA" as a different itemset than "V1=BETA; V2=ALPHA", as in the table below:

| V1    | V2    |
| ----- | ----- |
| ALPHA | BETA  |
| BETA  | ALPHA |

Here's my code:

    pr_itemset <- apriori(
      pr_trans,
      parameter = list(target = "frequent", support = .001, minlen = 2, maxlen = 4)
    )
ddubs1978

1 Answer


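arules treats these two rows as different because, when the data come from a data frame, each column/value pair becomes its own item (V1=ALPHA is not the same item as V2=ALPHA). If you actually want the items to be just ALPHA and BETA, without the V1 and V2 prefixes, because each row represents a set of items, then you should start with a list of sets (represented as character vectors). The code would look like this: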

    library("arules")

    # Each row is an unordered set of themes
    mysets <- list(
      c('ALPHA', 'BETA'),
      c('BETA', 'ALPHA')
    )
    trans <- transactions(mysets)

    inspect(trans)
    #     items
    # [1] {ALPHA, BETA}
    # [2] {ALPHA, BETA}

    identical(trans[1], trans[2])
    # [1] TRUE
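If the themes already sit in a two-column data frame, as in the question, the list of sets can be built from its rows. The following is only a sketch under assumptions: the data frame `coded` and its contents are hypothetical, the column names `V1` and `V2` are taken from the question's table, and a missing second theme is assumed to be stored as `NA`. It relies on the same `transactions()` constructor used above and reuses the `apriori()` parameters from the question.

```r
library("arules")

# Hypothetical coding sheet; V1/V2 mirror the columns in the question.
coded <- data.frame(
  V1 = c("ALPHA", "BETA", "ALPHA"),
  V2 = c("BETA", "ALPHA", NA),
  stringsAsFactors = FALSE
)

# One character vector per row: drop missing themes and repeated themes,
# so the column a theme came from no longer matters.
mysets <- lapply(seq_len(nrow(coded)), function(i) {
  themes <- c(coded$V1[i], coded$V2[i])
  unique(themes[!is.na(themes)])
})

trans <- transactions(mysets)

# Same mining parameters as in the question, now on order-free transactions.
pr_itemset <- apriori(
  trans,
  parameter = list(target = "frequent", support = 0.001, minlen = 2, maxlen = 4)
)
inspect(pr_itemset)
```

Because the items are now plain theme names, rows 1 and 2 count toward the same itemset regardless of which column each theme was entered in.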
Michael Hahsler