I've been working for the past 4 months with a partner on a machine learning project using the CelebA dataset.

Our next step in the project is to create a Conditional DCGAN that receives 40 labels, one per facial attribute in the dataset, each either 0 (attribute absent) or 1 (attribute present), like this: [1, 0, 0, 1, ..., 1, 0] -> a list of length 40,

and generates a human face from them accordingly. So if the set of 40 labels matches a young male with sunglasses, we expect the generator to synthesize a young-male-with-sunglasses-ish persona (with variation from generation to generation, since we feed it fresh random noise each time).
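
For concreteness, here's a toy sketch of what a single condition vector would look like (the attribute indices are my reading of the standard list_attr_celeba.txt ordering, so treat them as assumptions):

import torch

# 40 binary CelebA attributes; all zero except the ones we want present
cond = torch.zeros(40)
cond[15] = 1.0  # Eyeglasses (the closest CelebA attribute to "sunglasses")
cond[20] = 1.0  # Male
cond[39] = 1.0  # Young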

I'm going to use Google Colab Pro's premium GPU option to train the model, if it matters.

I didn't want to reinvent the wheel and start from scratch, so I copied working code for a non-conditional, image-generating DCGAN on CelebA from this article and tried to tweak it into a Conditional GAN.

I got stuck in the training loop section due to a tensor size mismatch.

# Training Loop

# Lists to keep track of progress
img_list = []
G_losses = []
D_losses = []
iters = 0

print("Starting Training Loop...")
# For each epoch
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for i, (real_cpu, conditions) in enumerate(dataloader, 0):
        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ###########################
        ## Train with all-real batch
        netD.zero_grad()

        # Format batch
        b_size = real_cpu.size(0)

        label = torch.full((b_size, 1, 1, 1), real_label, device=device, dtype=torch.float32)
        
        # Repeat the conditions tensor along the spatial dimensions
        conditions = conditions.view(-1, 40, 1, 1)
        conditions = conditions.repeat(1, 1, image_size, image_size)

        # Forward pass real batch through D
        output = netD(real_cpu.to(device), conditions.to(device))

        # Calculate loss on all-real batch
        errD_real = criterion(output, label)

        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)

        # Generate fake image batch with G
        fake = netG(noise, conditions.to(device)) # <-- ERROR HERE
        label.fill_(fake_label)

        # Classify all fake batch with D
        output = netD(fake.detach(), conditions.to(device))
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch, accumulated (summed) with previous gradients
        errD_fake.backward()
        D_G_z1 = output.mean().item()
        # Compute error of D as sum over the fake and the real batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ###########################
        netG.zero_grad()
        label.fill_(real_label)  # fake labels are real for generator cost
        # Since we just updated D, perform another forward pass of all-fake batch through D
        output = netD(fake, conditions.to(device))
        # Calculate G's loss based on this output
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        D_G_z2 = output.mean().item()
        # Update G
        optimizerG.step()

        # Output training stats
        if i % 50 == 0:
          print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
              % (epoch, num_epochs, i, len(dataloader),
                 errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))

        # Save Losses for plotting later
        G_losses.append(errG.item())
        D_losses.append(errD.item())

        # Check how the generator is doing by saving G's output on fixed_noise
        if (iters % 500 == 0) or ((epoch == num_epochs-1) and (i == len(dataloader)-1)):
          with torch.no_grad():
            fake = netG(fixed_noise).detach().cpu()  # note: once the cGAN works, this call will also need a conditions argument
          img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

        iters += 1

The error is in this line:

fake = netG(noise, conditions.to(device))

When I try to concatenate the two tensors in the generator's forward function:

def forward(self, input, label):
    # Concatenate the input noise and the label along the channel dimension
    input = torch.cat((input, label), 1)
    return self.conv(input)

I get:

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 1 but got size 64 for tensor number 1 in the list.

since torch.cat along dim 1 requires every other dimension to match, and:

noise.shape = torch.Size([64, 100, 1, 1])

and

conditions.shape = torch.Size([64, 40, 64, 64])
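
Here is a stripped-down reproduction of the mismatch, independent of the rest of the training loop (shapes copied from above):

import torch

noise = torch.randn(64, 100, 1, 1)        # (b_size, nz, 1, 1)
conditions = torch.randn(64, 40, 64, 64)  # (b_size, 40, image_size, image_size)

# dims 2 and 3 differ (1 vs 64), so this raises the same RuntimeError
torch.cat((noise, conditions), 1)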

I tried to reshape conditions with these lines before passing it to the generator as its second argument:

conditions = conditions.permute(0, 2, 3, 1)
conditions = conditions.view(batch_size, 1, 1, 40)

which didn't work, presumably because view can't change the total number of elements (64 · 40 · 64 · 64 = 10,485,760, while [64, 1, 1, 40] would hold only 2,560):

RuntimeError: shape '[64, 1, 1, 40]' is invalid for input of size 10485760
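
From the shapes, my current guess is that the generator should receive the raw (b_size, 40, 1, 1) conditions and only the discriminator needs the spatially expanded copy, i.e. something like this untested sketch (assuming netG's first layer expects nz + 40 input channels and netD's first layer expects nc + 40):

# conditions arrives from the dataloader as (b_size, 40)
cond_g = conditions.view(-1, 40, 1, 1).float().to(device)  # matches the (b_size, nz, 1, 1) noise
cond_d = cond_g.repeat(1, 1, image_size, image_size)       # matches the (b_size, nc, 64, 64) images

fake = netG(noise, cond_g)            # cat -> (b_size, nz + 40, 1, 1)
output = netD(fake.detach(), cond_d)  # cat -> (b_size, nc + 40, 64, 64)

But I haven't verified that this is the right way to wire the conditions through both networks.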

Thanks in advance for any suggestions or ideas on how to change the code. I feel like I'm missing something fundamental here that's preventing the training loop from working properly.

  • Please trim your code to make it easier to find your problem. Follow these guidelines to create a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example). – Сергей Кох Feb 19 '23 at 10:41
  • @СергейКох Thanks, however should I try and repost? This question is probably dead already... – RaphDaPingu Feb 19 '23 at 16:27

0 Answers