I was trying to use the recommenditembased algorithm in Mahout but got stuck on an error. Your guidance would help a lot. I downloaded the Cloudera CDH 5.4 VM and ran the job on it. I created the following sample file in the directory 'input' and tried to run the built-in algorithm. Below is the sample data from the .txt file I used:
1,3,3
2,4,2
3,4,1
5,2,4
5,4,1
3,3,1
2,2,3
1,5,5
6,2,1
6,4,4
But the job failed with a java.lang.ArrayIndexOutOfBoundsException.
I followed the usage of this algorithm described at http://mahout.apache.org/users/recommender/intro-itembased-hadoop.html
The job output with the error is provided below:
15/10/12 06:57:54 INFO mapreduce.Job: The url to track the job: http://quickstart.cloudera:8088/proxy/application_1444042848266_0006/
15/10/12 06:57:54 INFO mapreduce.Job: Running job: job_1444042848266_0006
15/10/12 06:58:56 INFO mapreduce.Job: Job job_1444042848266_0006 running in uber mode : false
15/10/12 06:58:56 INFO mapreduce.Job: map 0% reduce 0%
15/10/12 07:06:06 INFO mapreduce.Job: map 50% reduce 0%
15/10/12 07:06:14 INFO mapreduce.Job: map 100% reduce 0%
15/10/12 07:06:20 INFO mapreduce.Job: map 0% reduce 0%
15/10/12 07:06:32 INFO mapreduce.Job: Task Id : attempt_1444042848266_0006_m_000000_0, Status : FAILED
Error: java.lang.ArrayIndexOutOfBoundsException: 1
at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:50)
at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
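
One possible reading of the trace: the index 1 in the exception suggests that ItemIDIndexMapper split some input line into fewer than two fields and then tried to read the second one. A blank line or a line using a different separator could cause that. Below is a minimal sketch for checking a local copy of the input file for such lines. It assumes the fields are expected to be separated by a comma or a tab, which is an assumption about Mahout's parsing and is not confirmed by anything in this post; the function name find_bad_lines and the script name used afterwards are made up for illustration.

```python
import re
import sys

# Assumption: fields are separated by a comma or a tab.
# This is a guess at how the input is tokenized, not taken from the post.
DELIMITER = re.compile(r"[\t,]")


def find_bad_lines(lines, min_fields=2):
    """Return (line_number, text) pairs for lines with fewer than min_fields fields."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # A blank line, or a line with no comma/tab, splits into a single field.
        if len(DELIMITER.split(text)) < min_fields:
            bad.append((number, text))
    return bad


if __name__ == "__main__":
    # Usage: pass the path of a local copy of the input file.
    with open(sys.argv[1]) as handle:
        for number, text in find_bad_lines(handle):
            print(f"line {number}: {text!r}")
```

Running it as `python check_input.py input.txt` (both names hypothetical) prints each suspicious line with its line number; no output means every line has at least two comma- or tab-separated fields, in which case the cause is probably elsewhere.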