I have a V-Net to segment an organ from a bunch of CT images. However, the training data varies in depth; some example shapes ([Batch, Channels, D, H, W]):
[32, 1, 34, 512, 512]
[32, 1, 125, 512, 512]
[32, 1, 80, 512, 512]
The training dataloader requires every batch to have the same dimensions before being passed into the network. I have tried implementing a random spatial crop of [128, 128, 64] as a dataloader transform, partly due to memory issues, but the results look wrong (I suspect most of the crops don't contain the organ of interest, so the outputs are all zeros). Any suggestions for a workaround?
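For context, my crop logic is roughly the following (a minimal NumPy sketch of what I'm doing, assuming (D, H, W) arrays and volumes already padded to at least the crop size; `random_crop` is just my own helper, not from a library):

```python
import numpy as np

def random_crop(image, label, size=(64, 128, 128)):
    """Randomly crop a (D, H, W) volume and its label to `size`.

    Purely random: there is no guarantee the crop contains any
    foreground, which is probably why most of my patches end up
    all-background. Assumes the volume is at least `size` in every
    axis (pad first otherwise).
    """
    d, h, w = image.shape
    cd, ch, cw = size
    z = np.random.randint(0, d - cd + 1)
    y = np.random.randint(0, h - ch + 1)
    x = np.random.randint(0, w - cw + 1)
    return (image[z:z + cd, y:y + ch, x:x + cw],
            label[z:z + cd, y:y + ch, x:x + cw])
```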
I was thinking of preprocessing the data first and cropping depth-wise, using the labels to find the depth range that actually contains the organ.
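Something like this is what I had in mind for that preprocessing step (just a sketch; `crop_depth_to_label` and the margin value are placeholders I made up):

```python
import numpy as np

def crop_depth_to_label(image, label, margin=8):
    """Crop a (D, H, W) volume depth-wise to the axial slices that
    contain foreground in the label, plus a small margin.
    """
    # Indices of axial slices with at least one labelled voxel.
    fg_slices = np.where(label.reshape(label.shape[0], -1).any(axis=1))[0]
    if fg_slices.size == 0:
        return image, label  # no foreground: leave the volume alone
    lo = max(fg_slices[0] - margin, 0)
    hi = min(fg_slices[-1] + margin + 1, label.shape[0])
    return image[lo:hi], label[lo:hi]
```

Would this be a sensible approach, or is there a more standard way to bias the crops toward the foreground, like MONAI's RandCropByPosNegLabeld?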