
I implemented matrix factorization using the SGD algorithm, but I frequently get NaN values in the predicted matrix when I run it. When I run the algorithm on a very tiny (6 x 7) matrix, the error appears only a small number of times. Since I moved to the MovieLens data set, I get the error in all cells every time I run the algorithm. The only case in which the error disappears from at least some of the cells is when I set the number of optimization steps (iterations) to 1.

    private static Matrix matrixFactorizationLarge (Matrix realRatingMatrix, Matrix factor_1, Matrix factor_2)
    {
        int features = (int) factor_1.getColumnCount();
        double learningRate = 0.02;
        double regularization = 0.02;
        int optimizationSteps = 10;
        Matrix predictedRatingMatrix = SparseMatrix.Factory.zeros(realRatingMatrix.getRowCount(), realRatingMatrix.getColumnCount());

        for (int step = 0; step < optimizationSteps; step++)
        {   
            for (int row = 0; row < predictedRatingMatrix.getRowCount(); row++)
            {
                for (int col = 0; col < predictedRatingMatrix.getColumnCount(); col++)
                {
                    if (realRatingMatrix.getAsInt(row, col) > 0)
                    {
                        Matrix vector_1 = getRow(factor_1, row);
                        Matrix vector_2 = getColumn(factor_2, col);
                        predictedRatingMatrix.setAsDouble( ( Math.floor ( dotProduct(vector_1, vector_2) * 100 ) ) / 100, row, col);

                        for (int f = 0; f < features; f++)
                        {
                            double realValue = realRatingMatrix.getAsDouble(row, col);
                            double predictedValue = predictedRatingMatrix.getAsDouble(row, col);

                            double updatedFactor_1 = factor_1.getAsDouble(row, f)
                                    + learningRate * ( calculateDerivative(realValue, predictedValue, factor_2.getAsDouble(f, col))
                                                       - regularization * factor_1.getAsDouble(row, f) );
                            factor_1.setAsDouble(Math.floor(updatedFactor_1 * 100) / 100, row, f);

                            // factor_2 is updated using the factor_1 value that was just written above
                            double updatedFactor_2 = factor_2.getAsDouble(f, col)
                                    + learningRate * ( calculateDerivative(realValue, predictedValue, factor_1.getAsDouble(row, f))
                                                       - regularization * factor_2.getAsDouble(f, col) );
                            factor_2.setAsDouble(Math.floor(updatedFactor_2 * 100) / 100, f, col);
                        }
                    }
                }
            }
        }

        return predictedRatingMatrix;
    }

The related methods are as follows:


    private static double dotProduct (Matrix vector_A, Matrix vector_B)
    {
        double dotProduct = 0.0;

        for (int index = 0; index < vector_A.getColumnCount(); index++)
        {
            dotProduct =  dotProduct + ( vector_A.getAsDouble(0, index) * vector_B.getAsDouble(0, index) );
        }

        return dotProduct;
    }

    private static double errorOfDotProduct (double original, double dotProduct)
    {
        double error = 0.0;

        error = Math.pow( ( original - dotProduct ), 2 );

        return error;
    }

    private static double calculateDerivative(double realValue, double predictedValue, double value)
    {
        return ( 2 * (realValue - predictedValue) * (value) );
    }

    private static double calculateRMSE (Matrix realRatingMatrix, Matrix predictedRatingMatrix)
    {
        double rmse = 0.0;
        double summation = 0.0;

        for (int row = 0; row < realRatingMatrix.getRowCount(); row++)
        {
            for (int col = 0; col < realRatingMatrix.getColumnCount(); col++)
            {
                if (realRatingMatrix.getAsDouble(row, col) != 0)
                {
                    summation = summation + errorOfDotProduct(realRatingMatrix.getAsDouble(row, col), predictedRatingMatrix.getAsDouble(row, col));
                }
            }
        }

        rmse = Math.sqrt(summation);

        return rmse;
    }

    private static Matrix csvToMatrixLarge (File csvFile) 
    {

        Scanner inputStream;
        Matrix realRatingMatrix = SparseMatrix.Factory.zeros(610, 17000);
//      Matrix realRatingMatrix = SparseMatrix.Factory.zeros(6, 7);

        try     
        {
            inputStream = new Scanner(csvFile);

            while (inputStream.hasNext()) {
                String ln = inputStream.next();
                String[] values = ln.split(",");

                double rating = Double.parseDouble(values[2]);
                int row = Integer.parseInt(values[0])-1;
                int col = Integer.parseInt(values[1])-1;

                if (col < 1000)
                {
                    realRatingMatrix.setAsDouble(rating, row, col);
                }
            }

            inputStream.close();
        } 

        catch (FileNotFoundException e) 
        {
            e.printStackTrace();
        }

        return realRatingMatrix;
    }

    private static Matrix createFactorLarge (long rows, long features)
    {
        Matrix factor = DenseMatrix.Factory.zeros(rows, features);

        return factor;
    }

    private static void fillInMatrixLarge (Matrix matrix)
    {
        for (int row = 0; row < matrix.getRowCount() ; row++)
        {
            for (int col = 0; col < matrix.getColumnCount(); col++)
            {
                double random = ThreadLocalRandom.current().nextDouble(5.1);
                matrix.setAsDouble( (Math.floor (random * 10 ) / 10), row, col);
            }
        }

//      return matrix;
    }

    private static Matrix getRow (Matrix matrix, int rowOfIntresst)
    {
        Matrix row = Matrix.Factory.zeros(1, matrix.getColumnCount());

        for (int col = 0; col < matrix.getColumnCount(); col++)
        {
            row.setAsDouble(matrix.getAsDouble(rowOfIntresst, col), 0, col);
        }

        return row;
    }

    private static Matrix getColumn (Matrix matrix, int colOfInteresst)
    {
        Matrix column = Matrix.Factory.zeros(1, matrix.getRowCount());

        for (int index = 0; index < matrix.getRowCount(); index++)
        {
            column.setAsDouble(matrix.getAsDouble(index, colOfInteresst), 0, index);   //column[row] = matrix[row][colOfInteresst];

        }

        return column;
    }

What is causing the error, given that I don't divide by zero anywhere in the algorithm? And how can I solve it?

P.S. I'm using the Universal Java Matrix Package (UJMP).

  • When you **debugged** the code, at what point did the `NaN` value appear? You did debug the code, right? – Andreas Jan 29 '20 at 07:31
  • It's hard to see exactly what you're trying to do on my phone... But just a thought: are any of the rows or columns of the matrix all zero? – Andy Turner Jan 29 '20 at 07:33
  • [How to create a Minimal, Reproducible Example](https://stackoverflow.com/help/minimal-reproducible-example) --- *Reproducible* includes providing the **data** needed to reproduce the problem. – Andreas Jan 29 '20 at 07:33
  • *"What is causing the error as I don't divide with zero in the algorithm?"* With `double` math, division by zero does not result in `NaN` value, it results in `Infinity` or `-Infinity`, except that `0d / 0d` causes `ArithmeticException: / by zero` and that `NaN / 0` results in `NaN`. As you can see, division never *causes* `NaN`, so look elsewhere for source of `NaN`. – Andreas Jan 29 '20 at 07:37
  • Potential causes of `NaN` in your code: [`Math.pow()`](https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#pow-double-double-) and [`Math.sqrt()`](https://docs.oracle.com/javase/8/docs/api/java/lang/Math.html#sqrt-double-), which you could have found out for yourself, if you'd read the javadoc of the methods. – Andreas Jan 29 '20 at 07:43
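
As the comments suggest, the quickest way to narrow this down is to find the first intermediate value that is no longer finite. A minimal sketch of such a guard (the helper name and the call site are illustrative only, not part of the original code):

    // Hypothetical helper: fail fast as soon as an intermediate result stops
    // being a finite double, so the exact step, row and column are known.
    private static double checkFinite(double value, String where)
    {
        if (!Double.isFinite(value))
        {
            throw new IllegalStateException("Non-finite value " + value + " at " + where);
        }
        return value;
    }

    // Possible use inside the optimization loop shown in the question:
    // predictedRatingMatrix.setAsDouble(
    //         checkFinite(dotProduct(vector_1, vector_2), "prediction at " + row + "," + col),
    //         row, col);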

1 Answer


The key to avoiding the Not a Number (NaN) error in matrix factorization is to choose the right learning rate. It is important to note that the right learning rate is always determined in relation to the number of iterations. Below is an example that clarifies the problem:

No. Of Iterations: 3
Learning Rate: 0.02
Regularization Rate: 0.02

We have the following factors as an example in iteration 1 before optimization:
[figure: the two factor matrices before the first optimization step]

Predicted Rating, row 4 col 2 : ( 4.96 * 1.26 ) + ( 4.9 * 2.25 ) = 17.27

After optimizing the factors we will get:

[figure: the factor matrices after this optimization step]

The factors for row 4 and column 2 keep being updated (through the other rated cells) until we come back to this cell in iteration 2:

[figure: the factor matrices when the cell at row 4, column 2 is reached again in iteration 2]

Predicted Rating, row 4 col 2: ( -2.31 * 233089.24 ) + ( -1.67 * -888.59 ) = -536952.2

Each factor entry belonging to row 4 and column 2 gets optimized. I will show the optimization step for the first entry (the value -2.31):

-2.31 + 0.02 [ ( 2 ( 4 + 536952.2 ) ( 233089.24 ) ) - ( 0.02 * -2.31 ) ] =
-2.31 + 0.02 [ ( 2 * 536956.2 * 233089.24 ) + 0.05 ] =
-2.31 + 0.02 [ 250317425142.57 ] =

As we can see, at this step we get a very large number for the derivative. The key point here is to choose the right learning rate. The learning rate determines how fast we approach the minimum. If we make it too large, we may overshoot the minimum by jumping over it and diverge towards infinity.

[figure: a learning rate that is too large overshoots the minimum and diverges]

-2.31 + 0.02 [ 250317425142.57 ] =
-2.31 + 5006348502.85 =
5006348500.54

As the optimization continues, we get Infinity for this cell in the next iteration, and once Infinity enters the calculations, operations such as Infinity - Infinity or 0 * Infinity produce the NaN values that then spread through the matrix.
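
A small, self-contained demonstration of the underlying double semantics (plain Java, independent of the matrix library) shows how an overflow first becomes Infinity and then NaN:

    public class InfinityDemo
    {
        public static void main(String[] args)
        {
            double huge = Double.MAX_VALUE * 2;   // overflow -> Infinity

            System.out.println(huge);             // Infinity
            System.out.println(huge + 123.45);    // still Infinity
            System.out.println(huge - huge);      // NaN (Infinity - Infinity)
            System.out.println(0.0 * huge);       // NaN (0 * Infinity)
        }
    }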

By choosing a sufficiently small learning rate we avoid this divergence and steadily approach the minimum.
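
To see the effect of the learning rate in isolation, here is a toy scalar version of the same kind of update rule (one user factor, one item factor, one rating; the Math.floor rounding from the question is left out). The starting values and rates are made up and do not reproduce the MovieLens run, but the pattern is the same:

    public class LearningRateDemo
    {
        public static void main(String[] args)
        {
            double rating = 4.0;
            double regularization = 0.02;

            for (double learningRate : new double[] { 0.3, 0.01 })
            {
                double p = 4.9;   // user factor
                double q = 2.2;   // item factor

                for (int step = 0; step < 100; step++)
                {
                    double error = rating - p * q;

                    // same form as the updates in the question:
                    // factor += learningRate * (2 * error * otherFactor - regularization * factor)
                    double pNew = p + learningRate * (2 * error * q - regularization * p);
                    double qNew = q + learningRate * (2 * error * p - regularization * q);

                    p = pNew;
                    q = qNew;
                }

                System.out.println("learning rate " + learningRate + " -> predicted rating = " + (p * q));
            }
        }
    }

With the larger rate the updates oscillate with growing amplitude, overflow to Infinity and end up as NaN; with the smaller rate the prediction settles close to the real rating of 4.0.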
