
I have a CSV file with twelve fields: the first six represent events, the other six actions. For example:

q,w,e, , , ,a,s,d,f, ,
q,t,y,i, , ,s,f,g, , ,
w,r, , , , ,d,f,g,j,k,l

...and so on (I inserted the blank spaces only for ease of reading, but in the original file there are no spaces).

The order of events in the first six positions is not important (q,w is the same as w,q). The same applies to the actions in the last six positions.

I need to find all rules of the form:

single event => list of actions (one or more)

...with a given support and a given confidence. How can I achieve this using the R implementation of the "Apriori" or "FP-Growth" algorithm?
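For reference, the two thresholds can be written out with the usual definitions, where $T$ is the set of transactions (rows), $X$ is the antecedent (here a single event) and $Y$ is the consequent (here a set of actions):

$$
\operatorname{supp}(X \Rightarrow Y) = \frac{|\{\, t \in T : X \cup Y \subseteq t \,\}|}{|T|},
\qquad
\operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)}
$$

As a worked example on the three sample rows above, take the rule $\{q\} \Rightarrow \{s, f\}$. Event q occurs in rows 1 and 2, and both of those rows also contain actions s and f, so:

$$
\operatorname{supp}(\{q\} \Rightarrow \{s, f\}) = \frac{2}{3},
\qquad
\operatorname{conf}(\{q\} \Rightarrow \{s, f\}) = \frac{2/3}{2/3} = 1
$$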

Thanks in advance, Tony

Antonio
  • Have you converted your CSV file to a transactional data set? When you refer to fields, do you mean columns? – Hansel Palencia Nov 22 '19 at 17:15
  • I tried importing the CSV file into a data frame, converting all the columns to factors, and turning the data frame into transactions, but the resulting items are of the form "column_name=value". This makes it seem that the positions of events and actions matter. – Antonio Nov 22 '19 at 17:32
  • This is true: the apriori algorithm automatically takes position in the basket into account. However, the lift of q,w would still be the same as that of w,q, so it's not that big of a deal in my opinion. If you turn your dataset into a transactional data set and then run the apriori() function with a specific 'lhs' using your events, you should get what you are looking for. You'll have to sort through some of the noise from the event-to-event rules, but you can do that by converting your rules back to a data frame and filtering out what you don't need. – Hansel Palencia Nov 22 '19 at 17:37
  • But this way I don't have "q,w". I have "Event1=q,Event2=w". In other transactions I have "Event2=q,Event3=w". According to Apriori, are these sets equal? I think not. – Antonio Nov 22 '19 at 18:12
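
One possible way around the position problem raised in the comments is to build the baskets by hand instead of coercing the data frame: tag each value only by its role (event or action), not by its column, so that q in column 1 and q in column 2 become the same item. The following is a minimal sketch under that assumption, using the `arules` package. The `event=` / `action=` prefixes, the helper `tag()`, the inline sample data and the threshold values are all illustrative choices, not part of the original question. Note that `apriori()` in `arules` only generates rules with a single item in the consequent, so this sketch yields "single event => single action" rules; consequents with several actions would need extra post-processing of frequent itemsets, which is not shown here.

```r
library(arules)

# The three sample rows from the question, without the padding spaces.
csv_text <- "q,w,e,,,,a,s,d,f,,
q,t,y,i,,,s,f,g,,,
w,r,,,,,d,f,g,j,k,l"

# Read everything as character so no value is silently converted.
raw <- read.csv(text = csv_text, header = FALSE,
                colClasses = "character", na.strings = "")

# Hypothetical helper: drop empty cells and prefix each value with its role.
tag <- function(values, prefix) {
  values <- unique(values[!is.na(values) & nzchar(values)])
  if (length(values) == 0) character(0) else paste0(prefix, values)
}

# One basket per row: columns 1-6 are events, columns 7-12 are actions.
# The column position inside each group is deliberately discarded.
baskets <- lapply(seq_len(nrow(raw)), function(i) {
  row <- unlist(raw[i, ], use.names = FALSE)
  c(tag(row[1:6], "event="), tag(row[7:12], "action="))
})

trans <- as(baskets, "transactions")

event_items  <- grep("^event=",  itemLabels(trans), value = TRUE)
action_items <- grep("^action=", itemLabels(trans), value = TRUE)

# Events may only appear on the left, actions only on the right.
# minlen = maxlen = 2 keeps exactly one event on the left-hand side.
rules <- apriori(
  trans,
  parameter  = list(support = 0.5, confidence = 0.8, minlen = 2, maxlen = 2),
  appearance = list(lhs = event_items, rhs = action_items, default = "none"),
  control    = list(verbose = FALSE)
)

inspect(rules)
```

Because the `appearance` argument already restricts which items may appear on each side, there should be no event-to-event rules left to filter out afterwards. On real data, the support and confidence values above would be replaced by the thresholds you actually need.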
