I'm running Mahout 0.7 on Hadoop 1.0.4. I want to inspect the topic modeling results for the Reuters dataset, but the output I get from Mahout's vectordump tool looks useless.
I followed this set of instructions for the example:
Run cvb in mahout 0.8.
After running vectordump, however, I get a huge output file containing lines like the following: {0.01:5.726429339702471E-12,0.05:6.196569958376538E-9,...}
I'm not sure whether this is the output we are supposed to see for the Reuters dataset.
Yaser Kenesh
2 Answers
The same thing happened to me, and the solution is simple: get the latest version from the Mahout SVN repository: http://svn.apache.org/repos/asf/mahout/trunk
This happens because of a bug related to vectorSize in Mahout 0.7.

mystik
I don't think Mahout provides the type of output you are looking for; see https://issues.apache.org/jira/browse/MAHOUT-1470
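If you are stuck with the dump as it is, one workaround is to post-process the lines yourself. The following is only a minimal sketch, not part of Mahout. It assumes each line has the `{key:value,...}` shape shown in the question, and that the keys are dictionary terms (numeric tokens such as `0.01` do occur in Reuters) while the values are their weights for one topic. The helper names `parse_dump_line` and `top_terms` are made up for illustration.

```python
def parse_dump_line(line):
    """Parse one `{term:weight,...}` line into a list of (term, weight) pairs.

    Assumption: terms contain no commas. Items that do not look like
    `term:weight` (for example a literal `...`) are skipped.
    """
    start, end = line.find("{"), line.rfind("}")
    if start == -1 or end == -1:
        return []
    pairs = []
    for item in line[start + 1:end].split(","):
        # Split on the LAST colon so a term that itself contains ':' survives.
        term, sep, weight = item.rpartition(":")
        if not sep:
            continue
        try:
            pairs.append((term.strip(), float(weight)))
        except ValueError:
            continue
    return pairs


def top_terms(line, n=10):
    """Return the n highest-weighted (term, weight) pairs of one dumped line."""
    return sorted(parse_dump_line(line), key=lambda p: p[1], reverse=True)[:n]


if __name__ == "__main__":
    # Sample line copied from the question; a real dump has one line per topic.
    sample = "{0.01:5.726429339702471E-12,0.05:6.196569958376538E-9,...}"
    for term, weight in top_terms(sample, n=5):
        print(term, weight)
```

Sorting by weight this way only makes the existing dump easier to read; it does not work around the Mahout 0.7 bug mentioned in the other answer.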