
I am using PyTorch code to train with a custom loss function in an unsupervised setting. However, the loss does not go down and stays the same over many epochs during the training phase. Please see the training code snippet below:

    import numpy as np
    import torch
    from torch.autograd import Variable
    from sklearn.mixture import GaussianMixture

    X = np.load(<data path>) # Load dataset: a numpy array of N points with some dimension each.
    num_samples, num_features = X.shape

    gmm = GaussianMixture(n_components=num_classes, covariance_type='spherical')
    gmm.fit(X)
    z_gmm = gmm.predict(X)

    R_gmm = gmm.predict_proba(X)
    pre_R = Variable(torch.log(torch.from_numpy(R_gmm + 1e-8)).type(dtype), requires_grad=True)
    R = torch.nn.functional.softmax(pre_R)

    F = torch.stack(Variable(torch.from_numpy(X).type(dtype), requires_grad=True))
    U = Variable(torch.from_numpy(gmm.means_).type(dtype), requires_grad=False)

    z_pred = torch.max(R, 1)[1]

    distances = torch.sum(((F.unsqueeze(1) - U) ** 2), dim=2)
    custom_loss = torch.sum(R * distances) / num_samples

    learning_rate = 1e-3
    opt_train = torch.optim.Adam([train_var], lr=learning_rate)
    U = torch.div(torch.mm(torch.t(R), F), torch.sum(R, dim=0).unsqueeze(1)) # Assignment via a formula over variables, hence no gradient update is needed.

    for epoch in range(max_epochs+1):
        running_loss = 0.0
        for i in range(stepSize):
            # zero the parameter gradients
            opt_train.zero_grad()

            # forward + backward + optimize
            loss = custom_loss
            loss.backward(retain_graph=True)
            opt_train.step()
            running_loss += loss.data[0]

        if epoch % 25 == 0:
            print(epoch, loss.data[0]) # OR running_loss also gives the same values.
            running_loss = 0.0

O/P:

    0 5.8993988037109375
    25 5.8993988037109375
    50 5.8993988037109375
    75 5.8993988037109375
    100 5.8993988037109375

Am I missing something in the training? I followed this example/tutorial. Any help and pointers in this regard will be much appreciated.


1 Answer


Try this structure for the custom loss function and make the necessary changes. To use the loss, instantiate it in your code with the following statement:

    criterion = Custom_Loss()
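
Then call criterion inside the training loop, so that the loss is recomputed on every iteration. Computing the loss once outside the loop, as in your snippet, means backward() keeps reusing the same stale graph and the printed loss never changes. A minimal sketch of such a loop (optimizer, params, x, and y are placeholder names, not from your code):

    optimizer = torch.optim.Adam(params, lr=1e-3)  # params: the variables the loss should update (placeholder)

    for epoch in range(max_epochs):
        optimizer.zero_grad()
        loss = criterion(x, y)  # recompute the loss from the current variables each iteration
        loss.backward()
        optimizer.step()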

Here I show a custom loss called Custom_Loss which takes two inputs, x and y. It reshapes x to match the shape of y and then returns the loss as the L2 difference between the reshaped x and y. This is a standard pattern you'll run across very often when training networks.

Consider x to have shape (5, 10) and y to have shape (5, 5, 10). We need to add a dimension to x and then repeat it along the added dimension so that it matches the shape of y; (x - y) then has shape (5, 5, 10). Summing the squared difference over all three dimensions gives a scalar loss.

    class Custom_Loss(torch.nn.Module):

        def __init__(self):
            super(Custom_Loss, self).__init__()

        def forward(self, x, y):
            y_shape = y.size()[1]                          # size of the dimension to repeat along
            x_added_dim = x.unsqueeze(1)                   # (5, 10) -> (5, 1, 10)
            x_stacked = x_added_dim.repeat(1, y_shape, 1)  # (5, 1, 10) -> (5, 5, 10)
            diff = torch.sum((y - x_stacked) ** 2, 2)      # sum over the last dimension -> (5, 5)
            totloss = torch.sum(diff)                      # sum over the remaining dimensions -> scalar
            return totloss
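
For example, here is a quick shape check with random tensors matching the shapes above (x and y are just illustrative dummy data):

    import torch

    x = torch.randn(5, 10)
    y = torch.randn(5, 5, 10)

    criterion = Custom_Loss()
    loss = criterion(x, y)
    print(loss)  # a single scalar value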