
I am new to CNNs and am building a model using Keras to combine inputs from multiple sources. Two of my sources have different dimensions that are not related by an integer factor (e.g., one is not exactly 2x or 3x smaller than the other), so simple max pooling will not work. I am having trouble figuring out how to downsample the larger image. Here are the exact dimensions:

Image1: 7000 x 4000

Image2: 2607 x 1370

Is there a best practice for dealing with non-conventional downsampling?

I am applying a Conv2D layer and am thinking that combining an appropriately sized filter (1787x1261 with stride=1, so 7000 - 1787 + 1 = 5214 and 4000 - 1261 + 1 = 2740) with 2x2 max pooling at stride=2 (giving 2607x1370) would produce the correct dimensions. Is there any reason why that is a bad idea? It does seem like a very large filter compared to the total size of the image.
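
For concreteness, here is a minimal sketch of that idea in Keras. The single input channel, the single convolution filter, and the merge at the end are placeholder assumptions; the point is only that a 'valid' convolution plus 2x2 pooling reproduces the 2607x1370 grid:

```python
# Minimal sketch: downsample the 7000x4000 input to 2607x1370 with one
# 'valid' Conv2D followed by 2x2 max pooling. Channel counts, filter
# count, and the concatenation at the end are placeholder assumptions.
from keras.layers import Input, Conv2D, MaxPooling2D, Concatenate
from keras.models import Model

img1 = Input(shape=(7000, 4000, 1))   # larger source
img2 = Input(shape=(2607, 1370, 1))   # smaller source

# 'valid' convolution: 7000 - 1787 + 1 = 5214, 4000 - 1261 + 1 = 2740
x = Conv2D(filters=1, kernel_size=(1787, 1261), strides=1, padding='valid')(img1)

# 2x2 max pooling with stride 2: 5214 / 2 = 2607, 2740 / 2 = 1370
x = MaxPooling2D(pool_size=(2, 2), strides=2)(x)

# both branches are now 2607x1370 and can be merged
merged = Concatenate(axis=-1)([x, img2])
model = Model(inputs=[img1, img2], outputs=merged)
model.summary()
```

A single 1787x1261 kernel already has over 2 million weights, which is part of what makes me doubt this approach.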

Somewhat related: would it be better to run the model on smaller chunks of the full image? That way I could control the size of each chunk.

  • It depends on what you are trying to predict (classification vs. segmentation), but why not just resize the input to an appropriate size as a pre-processing step? I see that approach often, especially when using pre-trained networks. – warpri81 Aug 21 '18 at 21:47
  • Actually, I am trying to do regression. Resizing as a pre-processing step is a good idea. Do you have any suggestions? Like scipy's imresize or cv2's resize, or some other function? (A rough sketch of resizing and the chunking discussed below appears after these comments.) – Jeff Lapierre Aug 22 '18 at 01:57
  • Either will work just fine. I don't know how much detail is in these high resolution images, but I would suggest downsampling them significantly to reduce training time. I would start with 128x128 or 512x512 at the most unless that is going to lose significant information. – warpri81 Aug 22 '18 at 15:20
  • The input images are meteorological images of the US, so reducing them in size that much would lose significant resolution (trying to keep 2 km resolution). Should I split this up into chunks of data (e.g., 100x100 km chunks)? Doing that might affect the spatial continuity of the data, though. – Jeff Lapierre Aug 22 '18 at 15:52
  • I don't know what information you are trying to regress, but I would recommend something along those lines. I would divide each image up a few different ways to increase your training data and minimize the effect of something you are trying to detect being split between two images. – warpri81 Aug 23 '18 at 17:01
  • Basically, I am trying to use satellite images and lightning density maps to produce synthetic radar. Do you think dividing up the images so that they overlap (~30% overlap) would work, or do you think the redundant information would skew the results? – Jeff Lapierre Aug 24 '18 at 19:07
  • No, I think 30% overlap would actually enhance the training data set. To augment image data for more training examples, it is common to shift, zoom, flip, and shear training images. Sounds like an interesting project! – warpri81 Aug 25 '18 at 20:05
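
Following up on the resize and overlapping-chunk suggestions above, here is a rough sketch of the pre-processing I have in mind. The 256-pixel chunk size, the 30% overlap, and the helper names are illustrative assumptions, not recommendations:

```python
# Rough sketch: resize one source onto the other's grid with OpenCV, then
# cut both into overlapping chunks. Chunk size and overlap are placeholders.
import cv2
import numpy as np

def resize_to_match(img, target_shape):
    """Resize img to the (height, width) of target_shape with bilinear interpolation."""
    h, w = target_shape[:2]
    return cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)  # cv2 expects (width, height)

def extract_chunks(img, chunk=256, overlap=0.3):
    """Slide a chunk x chunk window across img with ~30% overlap between windows."""
    step = int(chunk * (1 - overlap))
    chunks = []
    for y in range(0, img.shape[0] - chunk + 1, step):
        for x in range(0, img.shape[1] - chunk + 1, step):
            chunks.append(img[y:y + chunk, x:x + chunk])
    return np.array(chunks)

# usage: bring image1 down to image2's grid, then tile both the same way
# image1_small = resize_to_match(image1, image2.shape)
# tiles1 = extract_chunks(image1_small)
# tiles2 = extract_chunks(image2)
```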

0 Answers