
I want to create a sequence file from an array list of file paths with Hadoop's API on a single machine, then pass the output to the sparse-vector step and then to clustering. For the sparse-vector and clustering steps, I took the code from here: ReutersToSparseVectors and KMeansClustering. The code for writing the sequence file is here. However, when I pass the sequence file output to the sparse-vector step, the program gives an error.

  • The code there should work for any file system for which Hadoop has an implementation. The URI determines which file system to use. Make sure the URI says "file:///path/to/file.seq" and it will use the local filesystem instead of HDFS. – Judge Mental Jul 31 '12 at 07:41
  • So why is the structure of the sequence file different between this code and the command-line version? – Arash Hosseinabady Jul 31 '12 at 09:31
  • Where is the structure different? A sequence file is still a sequence file; it is just a matter of where it is stored. – Thomas Jungblut Jul 31 '12 at 10:11
  • Because when I pass the generated sequence file as the input path for the `sparse vectors` operation, the program gives me an error. What do you mean by `where it is stored` in your comment? – Arash Hosseinabady Jul 31 '12 at 10:45
  • Unless a text file is not valid input for creating the sequence file. – Arash Hosseinabady Jul 31 '12 at 11:01
  • What error does the sparse vectors program give, exactly? That would clear up a lot. – Judge Mental Aug 01 '12 at 20:48
  • I forgot to tell you that I pass an array list as the input for the sequence file (I have edited my question). The error is that it cannot access the input file. – Arash Hosseinabady Aug 02 '12 at 08:34
  • Do you think that merging the files in the array list into one file could be a way to solve this problem? And after the K-means clustering operation, can I find out which cluster each file was assigned to? – Arash Hosseinabady Aug 02 '12 at 08:50
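
To make the point about URIs in the comments above concrete, here is a minimal sketch that uses only the JDK (no Hadoop dependency) to check which scheme each path in an array list carries. The class name `LocalSequenceFileUris`, its method names, and the sample paths are hypothetical, chosen for illustration. The assumption is that the resulting `file:` URIs would then be handed to Hadoop's `FileSystem.get(URI, Configuration)`, which picks the file system implementation from the scheme; a URI with no scheme would fall back to whatever default file system is configured.

```java
import java.net.URI;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: inspects and normalizes paths before they are
// handed to Hadoop. It does not call Hadoop itself.
class LocalSequenceFileUris {

    // Returns the URI scheme ("file", "hdfs", ...) or null if there is none.
    static String schemeOf(String uri) {
        return URI.create(uri).getScheme();
    }

    // True only when the URI explicitly names the local file system.
    static boolean targetsLocalFileSystem(String uri) {
        return "file".equals(schemeOf(uri));
    }

    // Converts plain local paths into absolute URIs with the "file" scheme.
    static List<String> toLocalUris(List<String> paths) {
        List<String> uris = new ArrayList<>();
        for (String p : paths) {
            uris.add(Paths.get(p).toAbsolutePath().toUri().toString());
        }
        return uris;
    }

    public static void main(String[] args) {
        // Sample input paths (assumed; replace with the real array list).
        List<String> inputs = Arrays.asList("docs/a.txt", "docs/b.txt");
        for (String uri : toLocalUris(inputs)) {
            System.out.println(uri + " local=" + targetsLocalFileSystem(uri));
        }
        // A path without a scheme is ambiguous: it depends on the default
        // file system configured for the job.
        System.out.println(schemeOf("/path/to/file.seq"));
    }
}
```

Running the same check on the output path of the sequence file writer and on the input path given to the sparse-vector step would show whether both refer to the same file system, which is one possible cause of a "cannot access input file" error.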

0 Answers