I've followed this tutorial:

https://ademsha.com/notes/developing-recommendation-system-with-apache-mahout/

It worked fine with the ml-100k dataset, but when I added more ratings to the u.data file (using exactly the same format: userId, itemId, rating, timestamp), I got the following error:

[main] INFO org.apache.mahout.cf.taste.impl.model.file.FileDataModel - Creating FileDataModel for file u.data
Exception in thread "main" java.lang.IllegalArgumentException: Did not find a delimiter in first line
at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.determineDelimiter(FileDataModel.java:351)
at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:201)
at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:169)
at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:149)
at teste.HelloMaven.SimpleRec.main(SimpleRec.java:20)

The code is this:

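import java.io.File;
import java.util.List;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.neighborhood.ThresholdUserNeighborhood;
import org.apache.mahout.cf.taste.impl.recommender.GenericUserBasedRecommender;
import org.apache.mahout.cf.taste.impl.similarity.PearsonCorrelationSimilarity;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.neighborhood.UserNeighborhood;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.cf.taste.similarity.UserSimilarity;

public class SimpleRec {

    public static void main(String[] args) throws Exception {
        // Load the ratings file; this is the call that throws the exception above
        DataModel model = new FileDataModel(new File("u.data"));
        UserSimilarity similarity = new PearsonCorrelationSimilarity(model);
        // Neighbors are the users whose similarity is at least 0.5
        UserNeighborhood neighborhood = new ThresholdUserNeighborhood(0.5, similarity, model);
        Recommender recommender = new GenericUserBasedRecommender(model, neighborhood, similarity);

        // Ask for the top 5 recommendations for user 943
        List<RecommendedItem> recommendations = recommender.recommend(943, 5);
        for (RecommendedItem recommendation : recommendations) {
            System.out.println(recommendation);
        }
    }
}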
António Ribeiro
Renato Melo
  • How are you delimiting your new values? The values must be delimited using `tabs` not spaces. – António Ribeiro Feb 21 '16 at 20:37
  • I am using the format of the ml dataset: userId " " itemId " " rating " " timestamp "\n" – Renato Melo Feb 21 '16 at 20:38
  • Ok, but to separate your values, what type of delimiter are you using? A `space` or a `tab`? – António Ribeiro Feb 21 '16 at 20:39
  • Sorry, I sent the comment accidentally; I have now edited the previous comment. – Renato Melo Feb 21 '16 at 20:42
  • Ok, I'm assuming that you added more values to the `u.data` file by hand. Therefore, each value of the new lines you've added must be separated by a `tab` character: value1**tab**value2**tab**value3**newline**. – António Ribeiro Feb 21 '16 at 20:46
  • I added the new values using Java file-manipulation functions. I have now tried changing the spaces to tabs by hand, and it did not work :( Edit: I tested another thing: I erased all the new lines I had added, so the file looks exactly like the ml dataset file again, and I still get the error... Is it something like an encoding issue? – Renato Melo Feb 21 '16 at 20:55
  • Oh, my code was rewriting the file without the tabs before accessing it. I edited the code to write tabs and now the error is gone. Thanks aribeiro!!! – Renato Melo Feb 24 '16 at 13:43
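
For anyone landing here with the same exception, the fix described in the comments can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the class `RatingFileWriter` and its methods are hypothetical names, the rating values are made up, and the temporary file stands in for the real `u.data`. It assumes, as the comments and the error message suggest, that the first line must contain a tab (a comma should also be accepted by `FileDataModel`).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mahout: writes ratings in the ml-100k layout.
class RatingFileWriter {

    // One rating per line, four fields separated by tabs (not spaces).
    static String formatRating(long userId, long itemId, int rating, long timestamp) {
        return userId + "\t" + itemId + "\t" + rating + "\t" + timestamp;
    }

    // Rough pre-check: does this line contain a tab or a comma?
    static boolean hasDelimiter(String line) {
        return line != null && (line.indexOf('\t') >= 0 || line.indexOf(',') >= 0);
    }

    // Reads only the first line of the file and applies the check above.
    static boolean firstLineHasDelimiter(Path file) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            return hasDelimiter(reader.readLine());
        }
    }

    // Appends the given lines, each terminated by the platform line separator.
    static void appendRatings(Path file, List<String> lines) throws IOException {
        Files.write(file, lines, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // Temporary file used as a stand-in for u.data; the values are invented.
        Path file = Files.createTempFile("u", ".data");
        appendRatings(file, Arrays.asList(
                formatRating(944, 50, 5, 1456087000L),
                formatRating(944, 181, 4, 1456087060L)));

        System.out.println(firstLineHasDelimiter(file));      // tab-separated first line
        System.out.println(hasDelimiter("944 50 5 1456087000")); // space-separated line
        Files.delete(file);
    }
}
```

Running the check on the file right before constructing `FileDataModel` would have shown that the file had already been rewritten with spaces, which is what the last comment describes.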
