
For reference, I'm using this page. I understand the original PageRank equation

r = beta * M * r + (1 - beta)/n * e

but I'm failing to understand why the sparse-matrix implementation is correct. Below is their code reproduced:

import numpy as np

def compute_PageRank(G, beta=0.85, epsilon=10**-4):
    '''
    Efficient computation of the PageRank values using a sparse adjacency 
    matrix and the iterative power method.

    Parameters
    ----------
    G : boolean adjacency matrix. np.bool8
        If element (j, i) is True, there is a link from node i to node j.
    beta: 1-teleportation probability.
    epsilon: stop condition. Minimum allowed amount of change in the PageRanks
        between iterations.

    Returns
    -------
    output : tuple
        PageRank array, normalized to sum to one.
        Number of iterations.

    '''    
    #Test adjacency matrix is OK
    n,_ = G.shape
    assert(G.shape==(n,n))
    #Constants Speed-UP
    deg_out_beta = G.sum(axis=0).T/beta #vector
    #Initialize
    ranks = np.ones((n,1))/n #vector
    time = 0
    flag = True
    while flag:        
        time +=1
        with np.errstate(divide='ignore'): # Ignore division by 0 on ranks/deg_out_beta
            new_ranks = G.dot((ranks/deg_out_beta)) #vector
        #Leaked PageRank
        new_ranks += (1-new_ranks.sum())/n
        #Stop condition
        if np.linalg.norm(ranks-new_ranks,ord=1)<=epsilon:
            flag = False        
        ranks = new_ranks
    return(ranks, time)
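
To make sure I'm running it the way they intend, here is the toy setup I'm using (my own example, not from the page; I'm assuming G is a scipy.sparse boolean matrix, since the post describes a sparse implementation):

import numpy as np
from scipy.sparse import csr_matrix

# 4-node toy graph; entry (j, i) == True means a link from i to j.
# Edges: 0->1, 0->2, 1->2, 2->0; node 3 has no outgoing links (dangling).
rows, cols = [1, 2, 2, 0], [0, 0, 1, 2]
G = csr_matrix((np.ones(4, dtype=bool), (rows, cols)), shape=(4, 4))

ranks, iterations = compute_PageRank(G, beta=0.85, epsilon=10**-4)
print(np.asarray(ranks).ravel(), iterations)   # the returned ranks sum to 1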

To start, I'm trying to trace the code and understand how it relates to the PageRank equation. The line under the with statement (new_ranks = G.dot((ranks/deg_out_beta))) looks like the first part of the equation (the beta times M term), but it seems to ignore all divisions by zero. This confuses me because the PageRank algorithm requires us to replace zero columns with ones (except along the diagonal), and I'm not sure how that is accounted for here.

The next line, new_ranks += (1-new_ranks.sum())/n, is what I presume to be the second part of the equation. I can understand what this does, but I can't see how it translates to the original equation. I would've thought we would do something like new_ranks += (1-beta)*ranks.sum()/n.
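
To make the comparison concrete, this is the dense reference formulation I have in mind (my own sketch, not from the linked page; it uses the common textbook variant that replaces zero, i.e. dangling, columns with the uniform column 1/n):

import numpy as np

def dense_pagerank(G_dense, beta=0.85, epsilon=10**-4):
    # G_dense[j, i] == True means a link from i to j, same convention as above.
    A = G_dense.astype(float)
    n = A.shape[0]
    out_deg = A.sum(axis=0)
    # Column-stochastic M: ordinary columns are G[:, i]/out_deg[i],
    # dangling (all-zero) columns are replaced by the uniform vector 1/n.
    M = np.where(out_deg > 0, A / np.where(out_deg > 0, out_deg, 1.0), 1.0 / n)
    r = np.ones(n) / n
    while True:
        r_new = beta * M.dot(r) + (1 - beta) / n   # r = beta*M*r + (1-beta)/n * e
        if np.abs(r_new - r).sum() <= epsilon:
            return r_new
        r = r_new

If my reading of the sparse code is right, dense_pagerank(G.toarray()) on the toy graph above should agree with compute_PageRank(G) to within the tolerance, but I can't see why from the equation alone.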

Kiwi breeder

1 Answer


This happens because of the identity

e.T * M * r = e.T * r

which holds by the column-sum construction of M: every column of M sums to 1, so e.T * M = e.T. The convex combination with coefficient beta then guarantees that the entries of the new r vector again sum to 1. What the algorithm does is first take the matrix-vector product b = beta*M*r and then find a constant c so that the entries of r_new = b + c*e sum to one. In theory this is the same as what the formula says, but in floating-point practice this approach corrects for, and prevents the accumulation of, rounding errors in the sum of r.

Computing it this way also makes it possible to ignore zero columns, since the compensation for them is computed automatically.
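
Here is a quick numeric sketch of that compensation (an illustrative example, reusing the question's conventions on a small graph with one dangling node):

import numpy as np
from scipy.sparse import csr_matrix

n, beta = 4, 0.85
# Node 3 has no outgoing links, so its column in G is all zero.
rows, cols = [1, 2, 2, 0], [0, 0, 1, 2]      # G[j, i] = True: link from i to j
G = csr_matrix((np.ones(4, dtype=bool), (rows, cols)), shape=(n, n))

r = np.ones((n, 1)) / n                      # current ranks, entries sum to 1
deg_out_beta = G.sum(axis=0).T / beta
with np.errstate(divide='ignore'):
    b = G.dot(r / deg_out_beta)              # b = beta*M*r; the dangling column contributes nothing

c = (1 - b.sum()) / n                        # the "leaked PageRank" constant from the code
dangling_mass = r[3, 0]                      # rank currently sitting on the dangling node
# c covers BOTH the teleportation term and the uniform redistribution of the dangling mass:
print(np.isclose(c, (1 - beta) / n + beta * dangling_mass / n))   # expected: True

In exact arithmetic c = (1-beta)/n + beta*(dangling mass)/n, which is exactly the uniform redistribution that explicitly replacing the zero columns would produce.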

Lutz Lehmann
  • Thanks for the comment, but I'm still confused by your answer. I understand the relationship you highlighted, but I can't connect the rest of your comment. Could you elaborate more? – Kiwi breeder May 22 '20 at 06:35
  • You want to compute `c` so that in `r=b+c*e` you have the sum relation `e.T*r=1`. You can compute `c` from theory, or directly from this relation as `1=e.T*r+n*c`, that is, `c=(1-r.sum())/n`. With the first approach you can *expect* `c` to be correct, with the latter you *know* that `c` is correct and, as bonus, that floating point errors in the sum are minimized. – Lutz Lehmann May 22 '20 at 06:49
  • So why do I want to enforce that `e.T * r = 1`? Is this because I expect that `||r||_1 = 1` for `r = Ax`? If so, is `c*e` some sort of correction vector? How do you know that this correction should be proportional to `e`? – Kiwi breeder May 22 '20 at 07:06
  • The aim is to apply power iteration on `A` to find the largest eigenvalue and its right eigenvector. By the construction of `A` one even knows that `1` is its maximal eigenvalue and that `e.T` is a left eigenvector. Part of the iteration is a normalization of the iteration vector by scaling, for instance to set the maximal component to 1 or to set the sum of components to 1, as is done here. The whole effort here is to avoid computing `A` in its matrix elements, that is, to compute the matrix-vector product with `A` using the properties of the terms on the right-hand side. – Lutz Lehmann May 22 '20 at 07:13