I've been working for the past 4 months with a partner on a machine learning project using the CelebA dataset.
Our next step is to create a Conditional DCGAN that receives a vector of 40 binary labels, one per facial attribute in the dataset (0 if the attribute is absent, 1 if it is present), like this: [1, 0, 0, 1, ..., 1, 0] (length 40),
and generates a human face accordingly. So if the set of 40 labels matches a young male with sunglasses, we'd expect the generator to synthesize a young-male-with-sunglasses-ish persona (with variation from generation to generation, because we feed it random noise each time).
I'm going to use Google Colab Pro's premium GPU option to train the model, if it matters.
I didn't want to reinvent the wheel and start from scratch, so I copied working code for a non-conditional, image-generating DCGAN on CelebA from this article, and tried to tweak it into a Conditional GAN.
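For context, the (image, attribute-vector) batches come from a dataloader along these lines (a minimal sketch assuming torchvision's built-in CelebA dataset; the exact paths and transforms are illustrative):

```python
# Sketch of the data pipeline (illustrative paths/transforms); torchvision's
# CelebA dataset with target_type="attr" yields a length-40 tensor of 0/1
# attribute labels alongside each image.
import torch
import torchvision.datasets as dset
import torchvision.transforms as transforms

image_size = 64
batch_size = 64

dataset = dset.CelebA(
    root="data/celeba",
    split="train",
    target_type="attr",  # 40 binary (0/1) facial-attribute labels per image
    transform=transforms.Compose([
        transforms.Resize(image_size),
        transforms.CenterCrop(image_size),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ]),
    download=True,
)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)
```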
I got stuck in the training loop section due to a tensor size mismatch.
```python
# Training Loop

# Lists to keep track of progress
img_list = []
G_losses = []
D_losses = []
iters = 0

print("Starting Training Loop...")
# For each epoch
for epoch in range(num_epochs):
    # For each batch in the dataloader
    for i, (real_cpu, conditions) in enumerate(dataloader, 0):

        ############################
        # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
        ############################
        ## Train with all-real batch
        netD.zero_grad()
        # Format batch
        b_size = real_cpu.size(0)
        label = torch.full((b_size, 1, 1, 1), real_label, device=device, dtype=torch.float32)
        # Repeat the conditions tensor along the spatial dimensions
        conditions = conditions.view(-1, 40, 1, 1)
        conditions = conditions.repeat(1, 1, image_size, image_size)
        # Forward pass real batch through D
        output = netD(real_cpu.to(device), conditions.to(device))
        # Calculate loss on all-real batch
        errD_real = criterion(output, label)
        # Calculate gradients for D in backward pass
        errD_real.backward()
        D_x = output.mean().item()

        ## Train with all-fake batch
        # Generate batch of latent vectors
        noise = torch.randn(b_size, nz, 1, 1, device=device)
        # Generate fake image batch with G
        fake = netG(noise, conditions.to(device))  # <-- ERROR HERE
        label.fill_(fake_label)
        # Classify all fake batch with D
        output = netD(fake.detach(), conditions.to(device))
        # Calculate D's loss on the all-fake batch
        errD_fake = criterion(output, label)
        # Calculate the gradients for this batch, accumulated (summed) with previous gradients
        errD_fake.backward()
        D_G_z1 = output.mean().item()
        # Compute error of D as sum over the fake and the real batches
        errD = errD_real + errD_fake
        # Update D
        optimizerD.step()

        ############################
        # (2) Update G network: maximize log(D(G(z)))
        ############################
        netG.zero_grad()
        label.fill_(real_label)  # fake labels are real for generator cost
        # Since we just updated D, perform another forward pass of all-fake batch through D
        output = netD(fake, conditions.to(device))
        # Calculate G's loss based on this output
        errG = criterion(output, label)
        # Calculate gradients for G
        errG.backward()
        D_G_z2 = output.mean().item()
        # Update G
        optimizerG.step()

        # Output training stats
        if i % 50 == 0:
            print('[%d/%d][%d/%d]\tLoss_D: %.4f\tLoss_G: %.4f\tD(x): %.4f\tD(G(z)): %.4f / %.4f'
                  % (epoch, num_epochs, i, len(dataloader),
                     errD.item(), errG.item(), D_x, D_G_z1, D_G_z2))

        # Save Losses for plotting later
        G_losses.append(errG.item())
        D_losses.append(errD.item())

        # Check how the generator is doing by saving G's output on fixed_noise
        if (iters % 500 == 0) or ((epoch == num_epochs-1) and (i == len(dataloader)-1)):
            with torch.no_grad():
                fake = netG(fixed_noise).detach().cpu()
            img_list.append(vutils.make_grid(fake, padding=2, normalize=True))

        iters += 1
```
The error is in this line:
```python
fake = netG(noise, conditions.to(device))
```
When I try to concatenate the two tensors in the generator's forward function:

```python
def forward(self, input, label):
    # Concatenate the input noise and label along the channel dimension
    input = torch.cat((input, label), 1)
    return self.conv(input)
```

I get:

```
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 1 but got size 64 for tensor number 1 in the list.
```
since:

```
noise.shape      = torch.Size([64, 100, 1, 1])
conditions.shape = torch.Size([64, 40, 64, 64])
```
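The mismatch reproduces in isolation with just these two shapes:

```python
import torch

noise = torch.randn(64, 100, 1, 1)
conditions = torch.randn(64, 40, 64, 64)  # shape after the repeat above
torch.cat((noise, conditions), 1)
# RuntimeError: Sizes of tensors must match except in dimension 1.
# Expected size 1 but got size 64 for tensor number 1 in the list.
```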
I tried to reshape `conditions` with these lines before feeding it to the generator as its second argument:

```python
conditions = conditions.permute(0, 2, 3, 1)
conditions = conditions.view(batch_size, 1, 1, 40)
```

which didn't work either:

```
RuntimeError: shape '[64, 1, 1, 40]' is invalid for input of size 10485760
```

which makes sense in hindsight: after the repeat, `conditions` holds 64 * 40 * 64 * 64 = 10,485,760 elements, while the target shape [64, 1, 1, 40] only holds 2,560, so no view can map between them.
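That failure also reproduces standalone:

```python
import torch

conditions = torch.randn(64, 40, 64, 64)     # 64*40*64*64 = 10,485,760 elements
conditions = conditions.permute(0, 2, 3, 1)  # shape becomes [64, 64, 64, 40]
conditions.view(64, 1, 1, 40)                # target shape holds only 2,560 elements
# RuntimeError: shape '[64, 1, 1, 40]' is invalid for input of size 10485760
```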
Thanks in advance for any suggestions or ideas on how to change the code. I feel like I'm missing something fundamental here that's preventing the training loop from working properly.