
Since 3D convolution is computationally expensive, I prefer to use 2D convolution. My motivation is to use 2D convolutions on volumetric images to reduce this cost.

I want to apply 2D convolution along the three orthogonal planes to get 3 results, each belonging to one of these planes. More clearly, suppose I have a 3D volumetric image. Instead of applying a 3D conv, I want to use a 2D conv on each of the xy, xz, and yz planes. Then I expect 3 different volumetric results, each representing one of the three orthogonal planes.

Is there way to do that? Thanks for help.

vaveila

1 Answer


You can permute your images. (Some frameworks, such as NumPy, call it transpose.)

Assume we use a 3 x 3 convolutional kernel.

# A batch of 16 3-channel images (channels first)
a = tensor(shape=[16,3,1920,1080])

# 2D conv will slide over a `1920 x 1080` image, kernel size is `3 x 3 x 3`
a.shape is (16,3,1920,1080)

# 2D conv will slide over a `3 x 1080` image, kernel size is `1920 x 3 x 3`
# (note: `permute` returns a new tensor, it does not modify `a` in place)
b = a.permute(0,2,1,3)
b.shape is (16,1920,3,1080)

# 2D conv will slide over a `1920 x 3` image, kernel size is `1080 x 3 x 3`
c = a.permute(0,3,2,1)
c.shape is (16,1080,1920,3)
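The permuted views can then be fed to `nn.Conv2d` directly. Here is a minimal runnable PyTorch sketch with toy sizes; the `out_channels=8`, `padding=1`, and the small `2 x 4 x 16 x 16` batch are illustrative choices, not part of the answer above:

import torch
import torch.nn as nn

# Toy batch: 2 volumes of depth 4 and 16 x 16 slices (depth acts as channels)
a = torch.randn(2, 4, 16, 16)

# xy plane: slide over 16 x 16, kernel spans all 4 "depth" features
conv_xy = nn.Conv2d(in_channels=4, out_channels=8, kernel_size=3, padding=1)
out_xy = conv_xy(a)                    # (2, 8, 16, 16)

# xz plane: permute so the height axis becomes the channel axis
b = a.permute(0, 2, 1, 3)              # (2, 16, 4, 16)
conv_xz = nn.Conv2d(in_channels=16, out_channels=8, kernel_size=3, padding=1)
out_xz = conv_xz(b)                    # (2, 8, 4, 16)

# yz plane: permute so the width axis becomes the channel axis
c = a.permute(0, 3, 2, 1)              # (2, 16, 16, 4)
conv_yz = nn.Conv2d(in_channels=16, out_channels=8, kernel_size=3, padding=1)
out_yz = conv_yz(c)                    # (2, 8, 16, 4)

With `padding=1` and a 3 x 3 kernel, the two sliding dimensions keep their size, so each output is a volume oriented along its own plane.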
Naphat Amundsen
  • What if I have a tensor with 5 elements initially (batch, channel, depth, height, width)? Let's say my initial tensor shape is [1,1,5,192,192]; then what should my 2D conv parameters be? I will use the PyTorch framework: torch.nn.Conv2d(in_channels, out_channels, kernel_size,...) – vaveila May 29 '21 at 11:19
  • 1
    The general idea is the same, you just have to accommodate for another dimension, and permute four times. However, it is stated in PyTorch's documentation that `Conv2d` inputs tensors of shape `(N,C,H,W)`, so I don't think it can handle your extra dimension natively. – Naphat Amundsen May 29 '21 at 11:25
  • Your initial response is very nice, thank you. But I have a confusion about the shapes: your initial height-width was 32x32. Then how did it become 1920 and 1080? What causes this? – vaveila May 29 '21 at 11:53
  • Oh! that's a mistake, my bad. I initially tried with `32 x 32` but changed it to `1920 x 1080` midway to have differently sized axes. It is supposed to start with `a = tensor(shape=[16,3,1920,1080])`. I will edit my post! – Naphat Amundsen May 29 '21 at 12:11
  • 1
    Okey then, I will try and inform you with my results. Thanks for your time. – vaveila May 29 '21 at 12:18
  • I appreciate your answer and applied it in my network. However, I think the height and width (like 1080, 1920) may be too large for a kernel size. Is there any problem if I use (3x3x3) for each orthogonal plane? Thank you – vaveila May 30 '21 at 13:56
  • I just used `1920 x 1080` resolution images as an example, you can use whatever you want. The kernel features need to match the image features though. E.g. If you want to convolve over a `123 x 32 x 16` image, the convolution kernel must also have `123` features. In the case of a `3 x 3` kernel, the kernel shape would be `123 x 3 x 3`. – Naphat Amundsen May 30 '21 at 14:50
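For the asker's concrete `[1,1,5,192,192]` case from the comments, one way to apply the answer's permute trick is to fold the singleton channel dimension into the depth first, since `Conv2d` expects 4D `(N,C,H,W)` input. This is a sketch of one possible approach, not from the original answer; the `out_channels=8` and `padding=1` values are arbitrary illustrative choices:

import torch
import torch.nn as nn

# Asker's shape: (batch, channel, depth, height, width)
x = torch.randn(1, 1, 5, 192, 192)

# Conv2d expects (N, C, H, W), so drop the singleton channel:
v = x.squeeze(1)                       # (1, 5, 192, 192): depth acts as channels

# xy plane: in_channels must match the depth (5)
out_xy = nn.Conv2d(5, 8, kernel_size=3, padding=1)(v)                         # (1, 8, 192, 192)

# xz plane: height becomes the channel axis, so in_channels=192
out_xz = nn.Conv2d(192, 8, kernel_size=3, padding=1)(v.permute(0, 2, 1, 3))   # (1, 8, 5, 192)

# yz plane: width becomes the channel axis, so in_channels=192
out_yz = nn.Conv2d(192, 8, kernel_size=3, padding=1)(v.permute(0, 3, 2, 1))   # (1, 8, 192, 5)

As the last comment notes, the kernel's channel count must always match whichever axis was permuted into the channel position; the sliding window itself stays 3 x 3.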