I have grayscale and color images with labels. I want to combine each grayscale image with its color image into a single 4-channel image and run transfer learning on these 4-channel inputs. How can I do that?
-
Does this answer your question? [How to use torch.stack function](https://stackoverflow.com/questions/52288635/how-to-use-torch-stack-function) – Andreas K. Jan 07 '20 at 10:29
2 Answers
If I understand the question correctly, you want to combine 1-channel images with 3-channel images to get 4-channel images and use those as your input.
If that is what you want, you can just use torch.cat().
Here is some example code that loads two images and combines them along the channel dimension:
import numpy as np
import torch
from PIL import Image
image_rgb = Image.open(path_to_rgb_image)
image_rgb_tensor = torch.from_numpy(np.array(image_rgb))
image_rgb.close()
image_grayscale = Image.open(path_to_grayscale_image)
image_grayscale_tensor = torch.from_numpy(np.array(image_grayscale))
image_grayscale.close()
image_input = torch.cat([image_rgb_tensor, image_grayscale_tensor], dim=2)
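I assumed that the grayscale image you want to use translates to a tensor with the shape [..., ..., 1] and the RGB image to [..., ..., 3]. If the grayscale image loads as a 2-D array of shape [height, width] instead (which is what a single-channel PIL image normally gives), add the channel axis first, for example with image_grayscale_tensor.unsqueeze(-1).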
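To do the same thing inside a data loader, one option is to move the concatenation into a custom Dataset. The following is only a minimal sketch under stated assumptions: the names to_four_channels and RGBGrayDataset are hypothetical, each RGB image is assumed to have a grayscale partner of the same height and width, and the output is permuted to channel-first (4 x H x W) and scaled to [0, 1], since that is the layout PyTorch convolution layers expect.

```python
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset


def to_four_channels(rgb_image, gray_image):
    """Stack an RGB and a grayscale PIL image into one 4 x H x W float tensor.

    Assumes both images have the same width and height.
    """
    rgb = torch.from_numpy(np.array(rgb_image.convert("RGB")))  # H x W x 3, uint8
    gray = torch.from_numpy(np.array(gray_image.convert("L")))  # H x W, uint8
    gray = gray.unsqueeze(-1)                                    # H x W x 1
    combined = torch.cat([rgb, gray], dim=2)                     # H x W x 4
    # Channel-first layout and [0, 1] range for use with conv layers.
    return combined.permute(2, 0, 1).float() / 255.0


class RGBGrayDataset(Dataset):
    """Hypothetical dataset yielding (4-channel image, label) pairs."""

    def __init__(self, rgb_paths, gray_paths, labels):
        if not (len(rgb_paths) == len(gray_paths) == len(labels)):
            raise ValueError("rgb_paths, gray_paths and labels must have equal length")
        self.rgb_paths = rgb_paths
        self.gray_paths = gray_paths
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, index):
        # Open both files and combine them while they are still open.
        with Image.open(self.rgb_paths[index]) as rgb_image, \
                Image.open(self.gray_paths[index]) as gray_image:
            image = to_four_channels(rgb_image, gray_image)
        return image, self.labels[index]
```

An instance of this dataset can be passed to torch.utils.data.DataLoader as usual. If all images share one size, the default collation should produce batches of shape [batch, 4, H, W]; otherwise a resize step would be needed before stacking.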

Your current model expects an RGB input with only three channels, so its first conv layer has in_channels=3 and the shape of this first layer's weight is out_channels x 3 x kernel_height x kernel_width.
In order to accommodate a 4-channel input, you need to change the first layer to have in_channels=4 and a weight of shape out_channels x 4 x kernel_height x kernel_width. You also want to preserve the learned weights, so you should initialize the new weight to be the same as the old one, except for tiny noise in the added weights.
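A rough sketch of that weight surgery follows. It is an illustration rather than a definitive recipe: the helper name expand_conv_in_channels and the noise scale noise_std=1e-3 are assumptions, the layer is assumed to be an ordinary nn.Conv2d with groups=1, and a freshly created Conv2d stands in for the pretrained first layer.

```python
import torch
import torch.nn as nn


def expand_conv_in_channels(old_conv, new_in_channels=4, noise_std=1e-3):
    """Return a copy of old_conv that accepts new_in_channels input channels.

    The existing input-channel weights are copied unchanged; the added
    channels are filled with small Gaussian noise (noise_std is an assumed value).
    """
    old_in = old_conv.in_channels
    if old_conv.groups != 1:
        raise ValueError("this sketch only handles groups=1")
    if new_in_channels < old_in:
        raise ValueError("new_in_channels must be >= the old in_channels")

    new_conv = nn.Conv2d(
        new_in_channels,
        old_conv.out_channels,
        kernel_size=old_conv.kernel_size,
        stride=old_conv.stride,
        padding=old_conv.padding,
        dilation=old_conv.dilation,
        bias=old_conv.bias is not None,
        padding_mode=old_conv.padding_mode,
    )

    with torch.no_grad():
        # Preserve the learned weights for the original channels.
        new_conv.weight[:, :old_in] = old_conv.weight
        # Tiny noise for the added channel(s).
        new_conv.weight[:, old_in:].normal_(mean=0.0, std=noise_std)
        if old_conv.bias is not None:
            new_conv.bias.copy_(old_conv.bias)
    return new_conv


# Stand-in for a pretrained first layer with in_channels=3.
pretrained_first_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
first_conv_4ch = expand_conv_in_channels(pretrained_first_conv)
```

With a torchvision ResNet, for example, the first layer is exposed as model.conv1, so the replacement would look like model.conv1 = expand_conv_in_channels(model.conv1); other architectures name their first layer differently.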

-
Thanks for your answer. But my question is how to combine the gray and color images in the dataloader. – Sampa Misra Jan 07 '20 at 07:04