
I found this code for spectral clustering:

https://github.com/BirdYin/scllc/blob/master/scllc.py

It implements landmark-based spectral clustering.

What does the __locality_linear_coding function do in this code?

class Scllc:    
    def __locality_linear_coding(self, data, neighbors):
        indicator = np.ones([neighbors.shape[0], 1])
        penalty = np.eye(self.n_neighbors)

        # Get the weight of each neighbor
        z = neighbors - indicator.dot(data.reshape(-1,1).T)
        local_variance = z.dot(z.T)
        local_variance = local_variance + self.lambda_val * penalty
        weights = scipy.linalg.solve(local_variance, indicator)
      
        weights = weights / np.sum(weights)
        weights = weights / np.sum(np.abs(weights))
        weights = np.abs(weights)

        return weights.reshape(self.n_neighbors)
    
    def fit(self, X):
        [n_data, n_dim] = X.shape
        # Select landmarks
        if self.func_landmark == 'kmeans':
            landmarks, centers, unknown = k_means(X, self.n_landmarks, n_init=1, max_iter=100)
        nbrs = NearestNeighbors(metric='euclidean').fit(landmarks)
        
        # Create properties of the sparse matrix Z
        [dist, indy] = nbrs.kneighbors(X, n_neighbors = self.n_neighbors)
        indx = np.ones([n_data, self.n_neighbors]) * np.asarray(range(n_data))[:, None]
        valx = np.zeros([n_data, self.n_neighbors])
        self.delta = np.mean(valx)
        
        # Compute all the coded data 
        for index in range(n_data):
            # Compute the weights of its neighbors
            localmarks = landmarks[indy[index,:], :]
            weights = self.__locality_linear_coding(X[index,:], localmarks)
            # Compute the coded data
            valx[index] = weights
        
        # Construct sparse matrix 
        indx = indx.reshape(n_data * self.n_neighbors)
        indy = indy.reshape(n_data * self.n_neighbors)
        valx = valx.reshape(n_data * self.n_neighbors)

        Z = sparse.coo_matrix((valx,(indx,indy)),shape=(n_data,self.n_landmarks)) 
        Z = Z / np.sqrt(np.sum(Z, 0))

        # Get first k eigenvectors
        [U, Sigma, V] = svds(Z, k = self.n_clusters + 1)
        U = U[:, 0:self.n_clusters]
        embedded_data = U / np.sqrt(np.sum(U * U, 0))     

1 Answer


See the NumPy documentation for how to work with n-dimensional arrays. For example, the dot method computes the matrix product of two arrays.
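As a small illustration of dot, here is a sketch of the first lines of the function run on made-up values (the three landmarks and the data point are hypothetical, not taken from the repository). It suggests that multiplying the column of ones by the row vector simply repeats the data point once per neighbor, so z holds each landmark minus the point:

```python
import numpy as np

# Hypothetical toy values: 3 neighboring landmarks in 2-D and one data point.
neighbors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
data = np.array([0.5, 0.5])

indicator = np.ones([neighbors.shape[0], 1])   # column of ones, shape (3, 1)
tiled = indicator.dot(data.reshape(-1, 1).T)   # (3, 1) x (1, 2) -> (3, 2)
z = neighbors - tiled                          # each landmark minus the point
local_variance = z.dot(z.T)                    # (3, 3) matrix of inner products
```

Every row of tiled is a copy of data, so the same z could also be obtained with plain broadcasting, neighbors - data.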

The code then uses the SciPy module, whose documentation is also available online. In particular, scipy.linalg.solve(a, b) solves the linear system a x = b for x.
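Putting these pieces together, one reading of the function is that it solves a small regularized linear system for weights that reconstruct the point from its nearest landmarks, then rescales them so they are non-negative and sum to 1. The standalone function below is only a sketch that mirrors the posted lines; the name locality_weights and the toy inputs are assumptions for illustration:

```python
import numpy as np
import scipy.linalg


def locality_weights(data, neighbors, lambda_val):
    """Sketch of the posted method as a plain function (hypothetical name)."""
    k = neighbors.shape[0]
    indicator = np.ones([k, 1])

    # Shift the landmarks so that the data point sits at the origin.
    z = neighbors - data.reshape(1, -1)
    # Matrix of inner products between shifted landmarks, plus lambda_val on
    # the diagonal; for lambda_val > 0 this keeps the system solvable.
    local_variance = z.dot(z.T) + lambda_val * np.eye(k)
    # Solve (local_variance) w = 1 for the weight vector w.
    weights = scipy.linalg.solve(local_variance, indicator)

    # Same effect as the three rescaling lines in the question:
    # the result is non-negative and sums to 1.
    weights = weights / np.sum(weights)
    weights = np.abs(weights) / np.sum(np.abs(weights))
    return weights.reshape(k)


# Made-up inputs: 3 landmarks in 2-D and one point between them.
neighbors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
weights = locality_weights(np.array([0.5, 0.5]), neighbors, lambda_val=0.01)
```

In fit, one such weight vector is computed per data point and stored as a row of the sparse matrix Z, with the columns given by the indices of that point's nearest landmarks.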

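A class usually starts with an initializer method, __init__, which Python runs automatically when an object is created. That is where the parameters chosen by the user are defined and saved as attributes. The excerpt above omits it, but attributes such as self.n_neighbors and self.lambda_val would be set there.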
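For illustration, an initializer for a class like this might look as follows. This is a guess at its shape, not the repository's code: the attribute names come from the excerpt in the question, while the class name ScllcSketch and all default values are assumptions.

```python
class ScllcSketch:
    """Hypothetical constructor; the defaults are illustrative only."""

    def __init__(self, n_clusters=2, n_landmarks=100, n_neighbors=5,
                 lambda_val=0.01, func_landmark='kmeans'):
        # Saved once here, then read by the other methods through self.
        self.n_clusters = n_clusters
        self.n_landmarks = n_landmarks
        self.n_neighbors = n_neighbors
        self.lambda_val = lambda_val
        self.func_landmark = func_landmark


# Python calls __init__ automatically when the object is created.
model = ScllcSketch(n_clusters=3, n_neighbors=4)
```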