I have a stream of user-item pairs. I keep a model built from the last 6M records and rebuild it every minute. I don't like that important data arriving between rebuilds may go unused: for example, a new user has joined the system, but the model doesn't know about them yet. I've found the class PlusAnonymousConcurrentUserDataModel, which allows adding a few entries to the model to get more accurate recommendations. However, the documentation describes a more constrained usage scenario. I have to:

  • allocate temporary user
  • add extra data
  • get recommendation
  • and then release user and extra data

Is it OK to use this class to collect data incrementally until the model is actually rebuilt by the timer? What is the right way to do this? It seems that PlusAnonymousConcurrentUserDataModel is meant for a somewhat different purpose.
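To make the "collect until the next rebuild" idea concrete, here is a minimal sketch that is independent of Mahout: a small buffer of pending user-item pairs sits next to the last built model, reads merge the two, and the rebuild timer drains the buffer. `PendingInteractions`, `record`, `itemsFor`, and `drain` are hypothetical names invented for illustration; they are not part of Mahout's API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not a Mahout class: buffers pairs seen since the last rebuild. */
final class PendingInteractions {
    // userID -> itemIDs seen since the last rebuild; all access is guarded by "this"
    private final Map<Long, Set<Long>> pending = new HashMap<>();

    /** Record one user-item pair from the stream. */
    synchronized void record(long userId, long itemId) {
        pending.computeIfAbsent(userId, k -> new HashSet<>()).add(itemId);
    }

    /** Items for a user: what the built model already knows plus what is still buffered. */
    synchronized Set<Long> itemsFor(long userId, Set<Long> itemsInBuiltModel) {
        Set<Long> merged = new HashSet<>(itemsInBuiltModel);
        merged.addAll(pending.getOrDefault(userId, Collections.emptySet()));
        return merged;
    }

    /** Called by the rebuild timer: hand over the buffered pairs and start empty again. */
    synchronized Map<Long, Set<Long>> drain() {
        Map<Long, Set<Long>> snapshot = new HashMap<>(pending);
        pending.clear();
        return snapshot;
    }
}
```

The merged item set for a user could then be fed to whatever per-request mechanism is available (for example, temporary preferences), while the drained pairs go into the next full rebuild.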

Stepan Yakovenko
  • It seems like it is only for getting a recommendation for new, unregistered users. The docs say this "is useful in a situation where you wish to recommend to a user that doesn't really exist yet in your actual DataModel." I don't think it fits your scenario. Reference: https://mahout.apache.org/docs/0.13.0/api/docs/mahout-mr/org/apache/mahout/cf/taste/impl/model/PlusAnonymousUserDataModel.html – Brian Clink Nov 06 '18 at 22:00

1 Answer

This part of Mahout is very old and is being deprecated. I don't think it is even in the 0.14.0 build, so you would have to build it from source.

Mahout now uses a whole new technology for recommending. The new algorithm is called Correlated Cross-Occurrence (CCO). The old method you are using does not make use of real-time input in the way you have outlined. CCO can recommend to anonymous users who have not been built into the model, as long as there is behavioral data for them in some form.
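As a rough illustration of the cross-occurrence idea, the sketch below counts how many users performed a primary action on item A and a secondary action on item B, then scores that 2x2 table with a log-likelihood ratio test, which is a common way to decide whether a cross-occurrence is correlated rather than coincidental. This is an assumption-laden toy, not the Mahout or Universal Recommender implementation; `CrossOccurrenceSketch` and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; not the Mahout or Universal Recommender implementation. */
final class CrossOccurrenceSketch {

    private static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    /** Unnormalized Shannon entropy of a list of counts. */
    private static double entropy(long... counts) {
        long sum = 0;
        double result = 0.0;
        for (long c : counts) {
            result += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - result;
    }

    /** Log-likelihood ratio for a 2x2 contingency table of counts. */
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point rounding.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    /** Number of users present in both sets. */
    static long crossOccurrence(Set<Long> usersA, Set<Long> usersB) {
        Set<Long> both = new HashSet<>(usersA);
        both.retainAll(usersB);
        return both.size();
    }

    /**
     * Score how strongly "primary action on A" and "secondary action on B" go together.
     * Assumes totalUsers is at least the size of the union of both sets.
     */
    static double score(Set<Long> primaryOnA, Set<Long> secondaryOnB, long totalUsers) {
        long k11 = crossOccurrence(primaryOnA, secondaryOnB); // did both
        long k12 = primaryOnA.size() - k11;                   // only the primary action on A
        long k21 = secondaryOnB.size() - k11;                 // only the secondary action on B
        long k22 = totalUsers - k11 - k12 - k21;              // did neither
        return logLikelihoodRatio(k11, k12, k21, k22);
    }
}
```

A higher score suggests the two behaviors co-occur more often than chance would explain; a score near zero suggests they are independent. Because the inputs are just sets of user IDs per action, a user with only a handful of recent events can still be matched against such indicators at query time.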

The architecture to implement CCO requires a datastore in a DB and a KNN engine (search engine) to make model queries. These are all packaged together in Apache PredictionIO + the Universal Recommender template.

Community support for the Universal Recommender itself can be found here: https://groups.google.com/forum/#!forum/actionml-user or on the mailing lists of the other projects.

pferrel
  • I tried PredictionIO, but couldn't install it because of critical bugs like this one: https://github.com/actionml/PredictionIO – Stepan Yakovenko Nov 17 '18 at 18:20
  • You should be installing Apache PredictionIO from the Apache repo. That link is an old fork. See instructions here: http://predictionio.apache.org/ and the repo here: https://github.com/apache/predictionio as mentioned in the above answer. – pferrel Nov 18 '18 at 19:15
  • I used that repo, but the setup references a website which is missing. That repo has no issues button, so I reported it to the original one. – Stepan Yakovenko Nov 19 '18 at 07:03
  • The Apache repo is managed by Apache under their OSS rules, meaning you submit bug reports through JIRA, and I have no idea what bug you are talking about. Apache is a big org and this is a top-level project. It sounds to me like you are not following the typical use of Apache software. See the PredictionIO site for installation instructions. Don't expect the repo to document the project; there is a whole site for that here: http://predictionio.apache.org/ – pferrel Nov 20 '18 at 16:35
  • Looks like the bug link was broken by SO, let's try again: https://github.com/actionml/PredictionIO/issues/20 – Stepan Yakovenko Nov 20 '18 at 17:01
  • Yeah, install.prediction.io is dead, so it's not possible to finish the prescribed install procedure. – Stepan Yakovenko Nov 20 '18 at 17:03