I have a dataset of 64×64 original images and 32×32 low-resolution images that I want to use for a super-resolution model with the MONAI package. My problem is that when I run the diffusion model training I get this error:
"Sizes of tensors must match except in dimension 1. Expected size 16 but got size 32 for tensor number 1 in the list"
When I checked my code I found that my latent space has shape torch.Size([4, 3, 16, 16]), which differs from the low_res_image shape torch.Size([4, 1, 32, 32]).
So when I use:
latent_model_input = torch.cat([noisy_latent, noisy_low_res_image], dim=1)
I get the same error as above.
I use MONAI's AutoencoderKL, but I don't know how to change my latent shape from [4, 3, 16, 16] to [4, 1, 32, 32]. My AutoencoderKL setup:
from generative.networks.nets import AutoencoderKL, PatchDiscriminator

autoencoderkl = AutoencoderKL(
    spatial_dims=2,                         # number of spatial dimensions (1D, 2D, 3D)
    in_channels=1,                          # number of input channels
    out_channels=1,                         # number of output channels
    num_channels=(256, 512, 512),           # block output channels per level
    latent_channels=3,                      # latent embedding dimension
    num_res_blocks=2,                       # residual blocks per level
    norm_num_groups=32,                     # groups for GroupNorm; num_channels must be divisible by this
    attention_levels=(False, False, True),  # levels at which to add attention
)
autoencoderkl = autoencoderkl.to(device)

discriminator = PatchDiscriminator(spatial_dims=2, in_channels=1, num_layers_d=2, num_channels=32)
discriminator = discriminator.to(device)  # move the model to the target device
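If I understand MONAI's AutoencoderKL correctly, each extra entry in num_channels adds one downsampling stage, so my three-level config halves the spatial size twice (64 → 32 → 16) and latent_channels=3 gives the three latent channels. If that is right, a two-level config with latent_channels=1 should produce a [4, 1, 32, 32] latent instead, but this is untested and I am not sure it is the right approach:

# Hypothetical two-level config (untested): one downsampling stage (64 -> 32), single latent channel.
autoencoderkl_32 = AutoencoderKL(
    spatial_dims=2,
    in_channels=1,
    out_channels=1,
    num_channels=(256, 512),         # two levels -> spatial size halved once
    latent_channels=1,               # single-channel latent
    num_res_blocks=2,
    norm_num_groups=32,
    attention_levels=(False, True),  # one entry per level
)
autoencoderkl_32 = autoencoderkl_32.to(device)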
Do I need to change my latent spatial size from 16 to 32, or should I handle this some other way?