
I have transaction data and I'm trying to count every possible item combination. The problem is that the combinations seem to be overcounted. For example, given the following item sets:

A {1,2,3}

B {1,2,3,4}

If I want to count the number of times that {1,2,3} occurs together, the result is 2, not 1 as I want.

I created dummy data below as an example:

library(arules)

t1 <- data.frame(ID = c("A","A", "A", "B", "B", "B", "B"), num = c(1,2,3,1,2,3,4))
# one vector of items per transaction ID
transactions <- split(t1[,"num"], t1[,"ID"])
test <- apriori(transactions, parameter = list(support =.0000000001, minlen=3, maxlen = 3, target = 'frequent'))
inspect(test)

For this example, I expect {1,2,3} to have a count of 1 (the number of times that exactly 1, 2 and 3 are purchased together), but I'm not sure why it's giving me all the other numbers.

semidevil

2 Answers


Here is one way using base R. Collapse the num column into one comma-separated string per ID, then count how often each string occurs using table:

df1 <- aggregate(num~ID, t1, function(x) toString(unique(x)))
table(df1$num)

#   1, 2, 3 1, 2, 3, 4 
#         1          1 

We can also use dplyr to do this:

library(dplyr)

t1 %>%
  group_by(ID) %>%
  summarise(num = toString(unique(num))) %>%
  count(num)
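
One caveat with the string-key approach: toString(unique(x)) keeps items in the order they appear, so the same item set listed in a different order would produce a different key. A minimal base R sketch, assuming order should not matter; the data frame t2 and its ID "C" are hypothetical and only there to illustrate the point:

```r
# Hypothetical data: A and C hold the same items, listed in different orders.
t2 <- data.frame(ID  = c("A", "A", "A", "C", "C", "C"),
                 num = c(1, 2, 3, 3, 1, 2))

items_by_id <- split(t2$num, t2$ID)

# Unsorted keys keep the row order, so A and C get different strings.
unsorted_keys <- sapply(items_by_id, function(x) toString(unique(x)))

# Sorting before collapsing makes the key independent of row order.
sorted_keys <- sapply(items_by_id, function(x) toString(sort(unique(x))))

table(unsorted_keys)
table(sorted_keys)
```

With the sorted keys, both IDs fall under the single key "1, 2, 3".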
Ronak Shah

Frequent itemset mining counts subsets. {1,2,3} is a subset of both transactions, and that is why you get a count of 2. It seems you want something else: a count of the transactions whose item set is exactly {1,2,3}.
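
The difference between the two counts can be sketched in base R on the data from the question. This only illustrates the subset idea and is not how apriori computes support internally; the names target, subset_count and exact_count are made up for the example:

```r
# Toy data from the question: one row per (transaction ID, item).
t1 <- data.frame(ID  = c("A", "A", "A", "B", "B", "B", "B"),
                 num = c(1, 2, 3, 1, 2, 3, 4))
transactions <- split(t1$num, t1$ID)

target <- c(1, 2, 3)

# Support-style count: transactions that contain the target as a subset.
subset_count <- sum(sapply(transactions, function(items) all(target %in% items)))

# Exact-match count: transactions whose item set equals the target.
exact_count <- sum(sapply(transactions, function(items) setequal(items, target)))

subset_count  # 2: both A and B contain 1, 2 and 3
exact_count   # 1: only A is exactly {1,2,3}
```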

Michael Hahsler