
I am trying to perform matrix multiplication in Pig Latin. Here's my attempt so far:

matrix1 = LOAD 'mat1' AS (row,col,value);
matrix2 = LOAD 'mat2' AS (row,col,value);

mult_mat = COGROUP matrix1 BY row, matrix2 BY col;
mult_mat = FOREACH mult_mat {
    A = COGROUP matrix1 BY col, matrix2 BY row;
    B = FOREACH A GENERATE group AS col, matrix1.value*matrix2.value AS prod;
    GENERATE group AS row, B.col AS col, SUM(B.prod) AS value;}

However, this doesn't work. I get stopped at

A = COGROUP matrix1...

with

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <line 14, column 37>  mismatched input 'matrix1' expecting LEFT_PAREN
Fortunato
    COGROUP is not a valid operator inside a nested FOREACH; see https://pig.apache.org/docs/r0.7.0/piglatin_ref2.html#FOREACH. Extract: "Allowed operations are DISTINCT, FILTER, LIMIT, ORDER and SAMPLE." – Murali Rao Oct 06 '15 at 18:22

1 Answer

After some playing around, I figured it out:

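-- Load both matrices as sparse (row, col, value) triples.
matrix1 = LOAD 'mat1' AS (row,col,value);
matrix2 = LOAD 'mat2' AS (row,col,value);

-- Pair each matrix1 entry with the matrix2 entries whose row equals its col.
A = JOIN matrix1 BY col FULL OUTER, matrix2 BY row;

B = FOREACH A GENERATE matrix1::row AS m1r, matrix2::col AS m2c, (matrix1::value)*(matrix2::value) AS value;

-- Sum the partial products for each (row, col) cell of the result.
C = GROUP B BY (m1r, m2c);

multiplied_matrices = FOREACH C GENERATE group.$0 AS row, group.$1 AS col, SUM(B.value) AS value;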

multiplied_matrices should then hold the product matrix1*matrix2 in the same format in which the two matrices were entered: (row, col, value).
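To sanity-check the result on a tiny input, the same join, multiply, group, and sum dataflow can be mirrored outside Pig. The following is only a rough Python sketch, not part of the Pig script: the function name `multiply_sparse` and the sample matrices are made up for illustration, and it assumes each matrix is a list of (row, col, value) triples. Unlike the FULL OUTER join above, it simply skips entries with no join partner, since those contribute nothing to the product.

```python
from collections import defaultdict

def multiply_sparse(matrix1, matrix2):
    """Multiply two sparse matrices given as (row, col, value) triples."""
    # Index matrix2 by row so each matrix1 entry can find its join partners.
    by_row = defaultdict(list)
    for r, c, v in matrix2:
        by_row[r].append((c, v))

    # Join on matrix1.col == matrix2.row, then group by
    # (matrix1 row, matrix2 col) and sum the partial products.
    sums = defaultdict(float)
    for r, k, v1 in matrix1:
        for c, v2 in by_row.get(k, []):
            sums[(r, c)] += v1 * v2

    return sorted((r, c, v) for (r, c), v in sums.items())

# Hypothetical sample data: [[1, 2], [3, 4]] times [[5, 6], [7, 8]].
m1 = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
m2 = [(0, 0, 5), (0, 1, 6), (1, 0, 7), (1, 1, 8)]
print(multiply_sparse(m1, m2))
# prints [(0, 0, 19.0), (0, 1, 22.0), (1, 0, 43.0), (1, 1, 50.0)]
```

Comparing this output with what the Pig script writes for the same two small files is a quick way to confirm that the join keys are the right way round.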

Fortunato