I have M, U, and userRatings part-files as an intermediate result of an ALS matrix factorization process.

The header is:

SEQ. org.apache.hadoop.io.IntWritable%org.apache.mahout.math.VectorWritable

I need to operate on those vectors/features to find an explanation for the ALS recommendations (this is just a guess at an approach). It needs to be done in Pig.

Thanks, Er

fetnelio

1 Answer

Try the link below; it has many examples of how to load, store, and process SequenceFiles using Elephant Bird.

Example:

     pair = LOAD '$data' USING com.twitter.elephantbird.pig.load.SequenceFileLoader (
       '-c com.twitter.elephantbird.pig.util.IntWritableConverter', 
       '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
     ) AS (key: int, val: (f1: double, f2: double, f3: double));

http://grepcode.com/file/repo1.maven.org/maven2/com.twitter.elephantbird/elephant-bird-mahout/3.0.1/com/twitter/elephantbird/pig/mahout/VectorWritableConverter.java
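
Once the vectors are loaded, one way to look for an explanation is to use the fact that ALS predicts a preference as the dot product of a user's feature vector and an item's feature vector, so the per-factor products show which latent factors drive a score. The following is only a sketch: it assumes U holds the user features and M the item features, that the loader yields the dense three-field tuple from the example above (a rank of 3), and the paths `als/U` and `als/M`, the relation names, and user id 1 are all hypothetical. The Elephant Bird and Mahout jars must be registered beforehand.

```pig
-- Sketch only: paths, relation names, rank (3), and user id are assumptions.
u = LOAD 'als/U' USING com.twitter.elephantbird.pig.load.SequenceFileLoader (
      '-c com.twitter.elephantbird.pig.util.IntWritableConverter',
      '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
    ) AS (user_id: int, uf: (f1: double, f2: double, f3: double));

m = LOAD 'als/M' USING com.twitter.elephantbird.pig.load.SequenceFileLoader (
      '-c com.twitter.elephantbird.pig.util.IntWritableConverter',
      '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
    ) AS (item_id: int, mf: (f1: double, f2: double, f3: double));

-- Keep a single user so the CROSS stays small.
one_user = FILTER u BY user_id == 1;
pairs    = CROSS one_user, m;

-- c1..c3 are the per-factor contributions; score is their sum (the dot product).
explained = FOREACH pairs GENERATE
    user_id,
    item_id,
    uf.f1 * mf.f1 AS c1,
    uf.f2 * mf.f2 AS c2,
    uf.f3 * mf.f3 AS c3,
    uf.f1 * mf.f1 + uf.f2 * mf.f2 + uf.f3 * mf.f3 AS score;

ranked = ORDER explained BY score DESC;
DUMP ranked;
```

For a different rank, the tuple schema and the sum would need one field per factor. CROSS is expensive, so filtering to a few users or items first is probably necessary on real data.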

Sivasakthi Jayaraman