I am facing a tricky sequence mining problem. Say I have 10 products and millions of records, each containing a user, a product, and the timestamp of the purchase. Each user may have as few as 1 record or as many as 100, for example:

user 1, p1, t1
user 1, p1, t2
user 1, p2, t3
user 1, p3, t4
user 1, p1, t5
user 2, p2, t6.....

Now I need to predict the best time to promote a product to a user.

So far, my solution is to cluster the timestamps into a few categories and then apply Apriori to the data. For example, the records would look like:

user 1, p1T1
user 1, p2T2
user 1, p3T2
user 1, p2T1...

Then I get rules like p1T1->p2T2. Because T3 > T2 > T1, any rule that does not respect this time order is discarded.
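
To make the idea concrete, here is a minimal sketch of the time-ordered rule filter described above. It is a simplification, not a full Apriori run: it only counts rules with one item on each side, and the support of a rule is the number of users who show it. The function names, the `boundaries` cut points used to form the time clusters, and the `min_support` threshold are all assumptions made up for illustration.

```python
from bisect import bisect_right
from collections import Counter, defaultdict
from itertools import permutations


def time_cluster(ts, boundaries):
    """Return the 1-based time cluster of ts, given sorted cut points (hypothetical)."""
    return bisect_right(boundaries, ts) + 1


def ordered_pair_rules(records, boundaries, min_support=2):
    """Count rules 'pA T_i -> pB T_j' with i < j; support = number of users."""
    items_per_user = defaultdict(set)
    for user, product, ts in records:
        items_per_user[user].add((product, time_cluster(ts, boundaries)))

    counts = Counter()
    for items in items_per_user.values():
        for (pa, ta), (pb, tb) in permutations(items, 2):
            if ta < tb:  # keep only rules that respect the time order
                counts[(f"{pa}T{ta}", f"{pb}T{tb}")] += 1
    return {rule: n for rule, n in counts.items() if n >= min_support}
```

With a single cut point, e.g. `boundaries=[10]`, every timestamp below 10 falls into T1 and everything else into T2, so a rule such as p1T1->p2T2 is kept while p2T2->p1T1 can never be produced.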

However, I am not very satisfied with this solution. Any suggestions?

Johnny000
yzhang
  • Your question is not clear. Please explain what t1 and T1 are. Are they date-time or time of day? What is p1T1? How do you cluster? Also, explain the rationale behind your algorithm and goals. – cyborg Dec 10 '11 at 22:27
  • t1 just means time 1; it can be any kind of time. T1 means time cluster 1; it does not matter how you cluster. I just mean that I cluster the times into groups and then use Apriori to find the recommendation, but I think there should be a better solution. – yzhang Dec 12 '11 at 11:26

2 Answers

Instead of applying Apriori, you could apply a sequential pattern mining algorithm (e.g. PrefixSpan, SPAM, GSP) or a sequential rule mining algorithm.

You can check my website for open-source Java source code for these algorithms and some examples:

http://www.philippe-fournier-viger.com/spmf/
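
To give a feel for what a sequential pattern miner does with this kind of data, here is a minimal PrefixSpan-style sketch in Python. It is not the SPMF implementation (which is in Java): it assumes each event is a single product, ignores time gaps, and uses hypothetical names (`build_sequences`, `prefixspan`, `min_support`) chosen only for illustration.

```python
from collections import defaultdict


def build_sequences(records):
    """Group (user, product, timestamp) records into one time-ordered product list per user."""
    per_user = defaultdict(list)
    for user, product, ts in records:
        per_user[user].append((ts, product))
    return [[p for _, p in sorted(events)] for events in per_user.values()]


def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for every frequent sequential pattern."""
    results = {}

    def mine(prefix, projected):
        # Count in how many projected suffixes each item appears at least once.
        support = defaultdict(int)
        for suffix in projected:
            for item in set(suffix):
                support[item] += 1
        for item, count in sorted(support.items()):
            if count < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = count
            # Project: keep what follows the first occurrence of the item.
            new_projected = [
                suffix[suffix.index(item) + 1:]
                for suffix in projected
                if item in suffix
            ]
            mine(pattern, new_projected)

    mine((), sequences)
    return results
```

Unlike Apriori on unordered items, a pattern such as `("p1", "p2")` here only counts users who bought p1 and later bought p2, so the time order is built into the mining step instead of being filtered afterwards.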

Hope this helps,

Phil

Your problem is an application of recommender systems, and you can learn something from the KDD Cup 2011. Although the items recommended there are music, the models can also suit your needs, and most of them take time into account. If you are still not satisfied, you could look into time series analysis and machine learning to make the prediction.
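
Before moving to a full time-aware model, a very simple baseline can already give a rough promotion time. The sketch below is not one of the KDD Cup 2011 models; it is only an assumed starting point that measures, per user, the gap between the most recent purchase of one product and the next purchase of another, and returns the median gap. The function name `median_gap_between` and its parameters are hypothetical, and timestamps are assumed to be numbers (or anything that supports subtraction and sorting).

```python
from collections import defaultdict
from statistics import median


def median_gap_between(records, first, then):
    """Median time from a purchase of `first` to the next purchase of `then` by the same user."""
    per_user = defaultdict(list)
    for user, product, ts in records:
        per_user[user].append((ts, product))

    gaps = []
    for events in per_user.values():
        events.sort()
        pending = None  # timestamp of the most recent unmatched `first` purchase
        for ts, product in events:
            if product == then and pending is not None:
                gaps.append(ts - pending)
                pending = None
            if product == first:
                pending = ts
    return median(gaps) if gaps else None
```

For example, if users who buy p1 typically buy p2 about 20 time units later, promoting p2 shortly before that point is a reasonable first guess to compare more elaborate models against.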

jerry_sjtu