I have two matrices, {1,2,3;4,5,6;7,8,9} and {1,4;2,5;3,6}. The following code multiplies them in Apache Spark, but it gives the wrong output: {15.0,29.0;36.0,71.0;57.0,113.0}. Where did I make a mistake?

    JavaRDD<String> lines = ctx
            .textFile(
                    "/home/hduser/Desktop/interpolation/Kriging/MatrixMultiplication/MatrixA.csv")
            .cache();

    JavaRDD<String> lines1 = ctx
            .textFile(
                    "/home/hduser/Desktop/interpolation/Kriging/MatrixMultiplication/MatrixB.csv")
            .cache();

    JavaRDD<Vector> rows = lines.map(new Function<String, Vector>() {

        @Override
        public Vector call(String line) throws Exception {

            String[] lineSplit = line.split(",");
            double[] arr = new double[lineSplit.length];
            for (int i = 0; i < lineSplit.length; i++) {
                arr[i] = Double.parseDouble(lineSplit[i]);
            }
            Vector dv = Vectors.dense(arr);
            return dv;
        }

    });

    //rows.saveAsTextFile("/home/hduser/Desktop/interpolation/Kriging/MatrixMultiplication/MatrixA_output");

    RowMatrix A = new RowMatrix(rows.rdd());



    JavaRDD<Vector> rows1 = lines1.map(new Function<String, Vector>() {

        @Override
        public Vector call(String line) throws Exception {

            String[] lineSplit = line.split(",");
            double[] arr = new double[lineSplit.length];
            for (int i = 0; i < lineSplit.length; i++) {
                arr[i] = Double.parseDouble(lineSplit[i]);
            }
            Vector dv = Vectors.dense(arr);
            return dv;
        }

    });

    List<Vector> arrList = new ArrayList<Vector>();
    arrList = rows1.toArray();


    double[] arr1 = new double[(int) rows1.count() * arrList.get(0).size()];
    int k=0;
    for (int i = 0; i < arrList.size(); i++) {
        for (int j = 0; j < arrList.get(i).size(); j++) {
            arr1[k] = arrList.get(i).apply(j);
            //System.out.println(arr1[k]);
            k++;
        }
    }
    Matrix B = Matrices.dense((int) rows1.count(), arrList.get(0)
            .size(), arr1);

    RowMatrix C = A.multiply(B);


    RDD<Vector> rows2 = C.rows();
    rows2.saveAsTextFile("/home/hduser/Desktop/interpolation/Kriging/MatrixMultiplication/Result");

Thanks in advance...

Chandan
  • If you run your code in a debugger, what does `arr1` look like? Please give a more self-contained example. The order of elements in your array may just be wrong. You can use an array literal in your example. – stholzm May 21 '15 at 06:56

1 Answer

Matrices.dense constructs a column-major matrix (API doc), and you are traversing the array of rows in the wrong order.

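I cannot look into your CSV files, but the wrong output is consistent with MatrixB.csv containing exactly the B you posted. Why?

Reading the rows of B one after another fills the array as {1,4,2,5,3,6}. Matrices.dense(3, 2, ...) reads those values column by column, so B effectively becomes [1 5; 4 3; 2 6], and multiplying A by that matrix gives exactly your output {15,29;36,71;57,113}.

To get the intended B = [1 4; 2 5; 3 6], fill the array column by column (outer loop over columns, inner loop over rows) so that it is {1,2,3,4,5,6}. The product is then {14,32;32,77;50,122}.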
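
To see the effect of column-major storage without a Spark cluster, here is a minimal plain-Java sketch. It does not call Spark at all: the class `ColumnMajorDemo` and its helpers `toColumnMajor` and `multiply` are hypothetical names introduced only for illustration, and `multiply` merely imitates how a column-major value array is read. It assumes MatrixB.csv holds the rows 1,4 / 2,5 / 3,6 as stated in the question.

```java
import java.util.Arrays;

public class ColumnMajorDemo {

    // Flatten a matrix given as rows into column-major order:
    // all values of column 0 first, then column 1, and so on.
    public static double[] toColumnMajor(double[][] rows) {
        int numRows = rows.length;
        int numCols = rows[0].length;
        double[] values = new double[numRows * numCols];
        for (int j = 0; j < numCols; j++) {
            for (int i = 0; i < numRows; i++) {
                values[j * numRows + i] = rows[i][j];
            }
        }
        return values;
    }

    // Multiply a (given as rows) by b, where b is a flat array that is
    // interpreted as a column-major bRows x bCols matrix.
    public static double[][] multiply(double[][] a, double[] b, int bRows, int bCols) {
        double[][] c = new double[a.length][bCols];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < bCols; j++) {
                double sum = 0.0;
                for (int k = 0; k < bRows; k++) {
                    sum += a[i][k] * b[j * bRows + k];
                }
                c[i][j] = sum;
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        double[][] bRows = {{1, 4}, {2, 5}, {3, 6}};

        // Values in the order the question's nested loops produce them (row by row).
        double[] rowByRow = {1, 4, 2, 5, 3, 6};
        // prints [[15.0, 29.0], [36.0, 71.0], [57.0, 113.0]]
        System.out.println(Arrays.deepToString(multiply(a, rowByRow, 3, 2)));

        // Values rearranged column by column: {1, 2, 3, 4, 5, 6}.
        double[] columnByColumn = toColumnMajor(bRows);
        // prints [[14.0, 32.0], [32.0, 77.0], [50.0, 122.0]]
        System.out.println(Arrays.deepToString(multiply(a, columnByColumn, 3, 2)));
    }
}
```

In the question's code, the equivalent change would be to fill `arr1` with the column index in the outer loop and the row index in the inner loop before passing it to `Matrices.dense`.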

stholzm