How can I use 2 images as a training sample in PyTorch?

Question

I have just begun learning deep learning, and my first homework assignment is to build a leaf-classification system based on convolutional neural networks. I built a ResNet-34 model from code on GitHub. However, my teacher told me that the basic training unit in his dataset is an image pair: I should use 2 images (photos of the same leaf under different lighting conditions) as the input, combining the two 3-channel images into one 6-channel image. I don't know how to load 2 images and combine them into 6 channels. How can I do that? Are there any functions for this? Should I modify the structure of the ResNet network?

In my dataset, every two images show the same leaf.

You can just concatenate the two images on the channel axis. Most images are in the format (height, width, channels) when converted to a numpy array, so you can concatenate the two arrays on the last axis. Or you can add two input layers to your model and pass them to a Concatenate layer to combine them along the channel axis. It works either way. – Seraph Wedd, Apr 13 '22 at 05:31
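As a minimal sketch of the concatenation described in this comment: the array names and the 224x224 size below are made up for illustration. Note that a single PyTorch image tensor is normally laid out as (channels, height, width), so the channel axis is the first one there, not the last.

```python
import numpy as np
import torch

# Two hypothetical RGB photos of the same leaf as (H, W, C) numpy arrays.
img_a = np.zeros((224, 224, 3), dtype=np.uint8)
img_b = np.ones((224, 224, 3), dtype=np.uint8)

# NumPy / PIL layout (H, W, C): the channels are the last axis.
pair_hwc = np.concatenate([img_a, img_b], axis=-1)  # shape (224, 224, 6)

# PyTorch layout (C, H, W): the channels are the first axis of one sample.
t_a = torch.from_numpy(img_a).permute(2, 0, 1).float() / 255.0
t_b = torch.from_numpy(img_b).permute(2, 0, 1).float() / 255.0
pair_chw = torch.cat([t_a, t_b], dim=0)  # shape (6, 224, 224)
```

Once samples are batched, the tensors have shape (N, C, H, W) and the channel axis becomes `dim=1`.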

score 1 · Answer 1 · answered Apr 13 '22 at 05:32


You have several issues to tackle:

1. You need a `Dataset` with a `__getitem__` method that returns 2 images (and a label) instead of the basic ones that return a single image and a label. You'll probably need to write your own custom dataset.
2. Make sure the augmentations you apply to your images are applied in the same manner to both images of each pair.
3. You need to modify the ResNet-34 network to take 2 images as input instead of one. See, e.g., this answer for how that can be done.
4. You need to change the first convolution layer to have 6 input channels instead of 3.
5. If you want to use pre-trained weights, you will not be able to load the existing `state_dict` of ResNet-34 directly because of changes #3 and #4. You'll have to do it manually the first time.
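The points above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the answerer's own code: `LeafPairDataset` and `expand_first_conv` are hypothetical names, the images are assumed to be already loaded as (3, H, W) tensors, and copying the RGB filters to both halves of the new layer (then halving them) is just one common heuristic for reusing pre-trained first-layer weights. Because the pair is concatenated inside the dataset, the model still receives a single 6-channel tensor.

```python
import random

import torch
import torch.nn as nn
from torch.utils.data import Dataset


class LeafPairDataset(Dataset):
    """Hypothetical dataset: each item is (img_a, img_b, label), images as (3, H, W) tensors."""

    def __init__(self, pairs, augment=False):
        self.pairs = pairs
        self.augment = augment

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        img_a, img_b, label = self.pairs[idx]
        # Draw the random decision once so both images get the same augmentation.
        if self.augment and random.random() < 0.5:
            img_a = torch.flip(img_a, dims=[2])  # horizontal flip
            img_b = torch.flip(img_b, dims=[2])
        # Stack the two 3-channel images into one 6-channel sample.
        return torch.cat([img_a, img_b], dim=0), label


def expand_first_conv(conv, in_channels=6):
    """Build a conv layer with more input channels, reusing the old filters (assumes groups=1)."""
    if in_channels % conv.in_channels != 0:
        raise ValueError("in_channels must be a multiple of the original channel count")
    repeats = in_channels // conv.in_channels
    new_conv = nn.Conv2d(
        in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        # Heuristic: tile the old filters across the new channels and rescale
        # so the output magnitude stays comparable.
        new_conv.weight.copy_(conv.weight.repeat(1, repeats, 1, 1) / repeats)
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Stand-in for a ResNet first layer (same hyperparameters as torchvision's conv1).
old_conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
new_conv = expand_first_conv(old_conv, in_channels=6)

dataset = LeafPairDataset(
    [(torch.rand(3, 64, 64), torch.rand(3, 64, 64), 0)], augment=True
)
x, y = dataset[0]
out = new_conv(x.unsqueeze(0))  # add a batch dimension before the forward pass
```

With a torchvision ResNet-34, the layer to swap would be `model.conv1`: load the pre-trained `state_dict` into the unmodified model first, then replace `model.conv1` with the expanded layer. For torchvision's VGG-16, the corresponding first convolution is `model.features[0]`.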

Shai


Thank you very much for your nice answer. Can you tell me more about #5? Can I load the existing pre-trained ResNet-34 .pth files from GitHub, or should I train the network entirely from scratch, without any pre-trained weights? Looking forward to your reply. – Coding Rookie Apr 13 '22 at 11:24
Thank you again for the answer. I solved my problem with your solution. By the way, what should I do to get the same effect with a VGG-16 model? – Coding Rookie Apr 19 '22 at 16:17
@CodingRookie you will need to adapt `VGG` in the same manner you adapted `ResNet`. – Shai Apr 20 '22 at 06:31
