I want to create a sequence file from an array list of file paths, using Hadoop's API on a single machine. I then want to pass this output to the sparse-vector step and then to clustering. For the sparse-vector and clustering steps, I took the code from here: ReutersToSparseVectors and KMeansClustering.
Here is the code for writing the sequence file. However, when I pass the sequence file output to the sparse-vector step, the program gives an error.
Arash Hosseinabady
-
The code there should work for any file system for which Hadoop has an implementation. The URI determines which file system to use. Make sure the URI says "file:///path/to/file.seq" and it will use the local filesystem instead of HDFS. – Judge Mental Jul 31 '12 at 07:41
-
So why is the structure of the sequence file different between this code and the command-line version? – Arash Hosseinabady Jul 31 '12 at 09:31
-
Where is the structure different? A sequence file is still a sequence file; it is just a matter of where it is stored. – Thomas Jungblut Jul 31 '12 at 10:11
-
Because when I pass this generated sequence file as the input path for the `sparse vectors` operation, the program gives me an error. What do you mean by `where it is stored` in your comment? – Arash Hosseinabady Jul 31 '12 at 10:45
-
Unless a text file isn't valid input for creating the sequence file. – Arash Hosseinabady Jul 31 '12 at 11:01
-
What error does the sparse vectors program give, exactly? That would clear up a lot. – Judge Mental Aug 01 '12 at 20:48
-
I forgot to tell you that I give an array list as the input for the sequence file (I have edited my question). The error is that it cannot access the input file. – Arash Hosseinabady Aug 02 '12 at 08:34
-
Do you think merging the files in the array list into one file could be a way to solve this problem? And after the K-means clustering operation, can I find out which cluster each file was assigned to? – Arash Hosseinabady Aug 02 '12 at 08:50
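
To illustrate the point made in the first comment, that the URI decides which file system is used, here is a minimal sketch using only `java.net.URI`. It does not call Hadoop at all: the class `UriSchemeDemo`, the helper `schemeOf`, and the `namenode:8020` address are hypothetical names made up for the example. In Hadoop, `FileSystem.get(URI, Configuration)` picks the implementation from the scheme shown here, and a path with no scheme falls back to the default file system from the configuration.

```java
import java.net.URI;

// Hypothetical helper class for illustration; not part of Hadoop or Mahout.
final class UriSchemeDemo {

    // Returns the URI scheme, or "(default)" when the location has none.
    static String schemeOf(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme == null ? "(default)" : scheme;
    }

    public static void main(String[] args) {
        String[] locations = {
            "file:///path/to/file.seq",              // local file system
            "hdfs://namenode:8020/path/to/file.seq", // hypothetical HDFS host and port
            "/path/to/file.seq"                      // no scheme: configured default applies
        };
        for (String location : locations) {
            System.out.println(location + " -> " + schemeOf(location));
        }
    }
}
```

If the sequence file is written with one scheme and the sparse-vector step is given a bare path that resolves against a different default file system, an "input not found" style error would be a plausible result, so it is worth printing the scheme of both paths when debugging.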