
I would like to restrict the time range of the data used when training a recommendation algorithm. For instance, I only want to train on data from the last 6 months or the last 3 months. Is there a way to configure this, or do I have to implement it myself? Thank you very much.

Le Kim Trang

2 Answers


We are working on a more general method to solve this, but there is an engine in the experimental branch that retains a fixed period of event data in the EventStore and discards the rest. This is the most performant way to do the training. It does mean that you need to run this engine periodically. The engine is here: https://github.com/PredictionIO/PredictionIO/tree/develop/examples/experimental/scala-cleanup-app

Be aware that this removes all old events, including $set events. If you $set a property and that event is later removed, the property is no longer set.

In the future we plan to support a TTL for events so that the event store itself ages events out automatically. We will also store properties in a mutable store, which removes them from the event stream. For now, use the cleanup engine.

pferrel

I found that PredictionIO already supports this: `PEventStore.find` accepts optional `startTime` and `untilTime` arguments.

Here is my pseudocode:

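       import org.joda.time.DateTime
       import org.joda.time.format.DateTimeFormat

       // Parse a "yyyyMMddHHmmss" string. A null or empty string yields None,
       // which leaves that side of the time range unbounded.
       def readEventTime(procTime: String): Option[DateTime] =
         Option(procTime)
           .filter(_.nonEmpty)
           .map(s => DateTimeFormat.forPattern("yyyyMMddHHmmss").parseDateTime(s))

       // dsp holds the data source parameters (appName, startTime, untilTime).
       val itemsViewRDD: RDD[(String, Item)] = PEventStore.find(
         appName = dsp.appName,
         startTime = readEventTime(dsp.startTime),
         untilTime = readEventTime(dsp.untilTime)
       )(sc).map ...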
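
To get a rolling window such as "6 months back", the start time string still has to be computed somewhere. Below is a minimal sketch of one way to do that, using only `java.time` from the JDK. The `TrainingWindow` object and its `monthsBack` helper are hypothetical names, not part of PredictionIO. The sketch assumes the result is passed as the `startTime` parameter read by `readEventTime` above, and that the same JVM default time zone applies where the string is produced and where it is parsed.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical helper, not part of PredictionIO.
object TrainingWindow {
  // Same pattern that readEventTime parses in the answer above.
  private val fmt = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")

  // Timestamp string for the moment `months` months before `now`.
  // `now` is a parameter so the result is reproducible in tests.
  def monthsBack(months: Int, now: LocalDateTime = LocalDateTime.now()): String =
    now.minusMonths(months.toLong).format(fmt)
}
```

For example, `TrainingWindow.monthsBack(6)` gives a value that can be used as the start time, while leaving the until time empty keeps the range open-ended up to the present.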

Regards.

Le Kim Trang