- I am using apriori from the R package arules (library(arules)) to mine rules. I want all rules whose rhs (right-hand side) is one of a list of products, not just a single product.
I put the customer segmentation info and the product names together as items, build transactions from them, and mine rules. I want rules whose rhs contains only product names, while segmentation items may still appear in the lhs.
I have not been able to achieve this with apriori(, appearance = list(rhs = ...)) and would appreciate suggestions.
I also found a strange result, described in problem 2 below.
- When I give a longer list, e.g. c("whole milk", "cereals", "other vegetables", "rice", "specialty cheese", "jam"), I get fewer rules (3042) than with only c("whole milk", "cereals", "other vegetables") (3077). I do not understand why, or how appearance = list(rhs = ...) is supposed to work.
My original problem is not easy to turn into reproducible code (the data is read directly from a database), but here is a description.
I have a data set with one transaction per customer (a basket of product names):
cust1 apple pear chips
cust2 milk apple wine
...
Each customer also has segmentation (profile) tags:
cust1 20-30 female gold
cust2 30-40 male silver
...
I then put the segmentation tags and the products together:
20-30 female gold apple pear chips
30-40 male silver milk apple wine
...
I convert this data to "transactions" for the arules apriori function and get some rules, e.g.
{20-30, female, apple} => {wine}
{male, wine} => {30-40}
But I am only interested in rules whose rhs is a product name (pear, chips, apple), not a segmentation tag (30-40, male, gold).
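To make the setup concrete, here is a minimal sketch with toy data based on the description above. It is not my real data: the two baskets, the product_items vector, and the supp/conf thresholds are made up for illustration.

```r
library(arules)

# Toy baskets: segmentation tags and product names mixed as plain items
# (hypothetical data, modelled on the description above).
baskets <- list(
  cust1 = c("20-30", "female", "gold", "apple", "pear", "chips"),
  cust2 = c("30-40", "male", "silver", "milk", "apple", "wine")
)
trans <- as(baskets, "transactions")

# The items that are product names (everything else is a segmentation tag).
product_items <- c("apple", "pear", "chips", "milk", "wine")

# Mine rules with no appearance restriction; minlen = 2 skips empty-lhs rules.
rules <- apriori(trans,
                 parameter = list(supp = 0.5, conf = 0.5, minlen = 2),
                 control = list(verbose = FALSE))

# Collect the rhs item of every rule to see which kinds of items occur there.
rhs_items <- unlist(as(rhs(rules), "list"))
table(rhs_items %in% product_items)
```

In this toy run the rhs contains both product names and segmentation tags, which is exactly what I want to avoid.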
So I tried to achieve this with apriori(, appearance = list(rhs = product_items)).
I created a character vector: product_items <- data[tag == 'product_name']
However, I got 0 rules. After trying for a while, I found something strange: if I use only
product_items[1:10]
I get 30000 rules, but if I increase it to
product_items[1:80]
the number of rules drops to 200...
We can use data(Groceries) as an example:
library(arules)
data(Groceries)
# longer rhs list
pro <- c("whole milk", "cereals", "other vegetables", "rice", "specialty cheese", "jam")
# shorter rhs list
lp <- c("whole milk", "cereals", "other vegetables")
r2 <- apriori(Groceries,
              parameter = list(supp = 0.001, conf = 0.5),
              appearance = list(rhs = pro))
r1 <- apriori(Groceries,
              parameter = list(supp = 0.001, conf = 0.5),
              appearance = list(rhs = lp))
summary(r2) then reports 3042 rules, but summary(r1) reports 3077 rules.
This makes me question my understanding of apriori(, appearance = list(rhs = ...)). I thought that if the character vector product_name contained more products, I would get more rules, because I would get every rule whose output (rhs) matches any of the products I pass in.
But increasing the number of items in product_name reduces the number of rules instead.
An explanation would be much appreciated.
I also still need to find all rules whose rhs is any of the items in product_name. Any suggestions on how to do that?
Say I want all rules that meet my supp and conf requirements and whose rhs contains any of c("whole milk", "cereals", "other vegetables", "rice", "specialty cheese", "jam").
How should I do that? Thank you!
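For reference, here is a sketch of a workaround I am considering: mine the rules without any appearance restriction and filter on the rhs afterwards with subset() and %in%. This sidesteps appearance entirely, so it does not explain the counts above, and I am not sure it is the intended way. The names all_rules and pro_rules are just placeholders.

```r
library(arules)
data(Groceries)

pro <- c("whole milk", "cereals", "other vegetables",
         "rice", "specialty cheese", "jam")

# Mine all rules that meet the support and confidence thresholds.
all_rules <- apriori(Groceries,
                     parameter = list(supp = 0.001, conf = 0.5),
                     control = list(verbose = FALSE))

# Keep only the rules whose rhs contains any of the items in pro.
pro_rules <- subset(all_rules, subset = rhs %in% pro)

# Number of rules kept, and how they spread over the rhs items.
length(pro_rules)
table(unlist(as(rhs(pro_rules), "list")))
```

With this approach the items in pro can still appear in the lhs, and any other items (such as segmentation tags in my real data) stay available for the lhs as well.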