0

A network built with MatConvNet accepts images of different scales and evaluates them. For example:

% raw_img is an image of size 730x860x3
% net is a loaded DagNN object
scales = [-2 -1 0 0.5 1];
for s = 2.^scales
    img = imresize(raw_img, s, 'bilinear');
    img = bsxfun(@minus, img, averageImage);  % subtract the dataset mean image
    inputs = {'data', img};
    net.eval(inputs);
end

While debugging, I found that img is resized and evaluated on every iteration of the loop, even though I expected the network (net) to accept a fixed-size input. For reference:

K>> net

net = 

  DagNN with properties:

                 layers: [1x319 struct]
                   vars: [1x323 struct]
                 params: [1x381 struct]
                   meta: [1x1 struct]
                      m: []
                      v: []
                   mode: 'test'
                 holdOn: 0
    accumulateParamDers: 0
         conserveMemory: 1
        parameterServer: []
                 device: 'cpu'

After loading the trained network:

K>> net.vars(1, 1).value

ans =

     []

And inside the for loop (iteration 1):

K>> net.vars(1, 1).value

ans =

     [64 64 3]

(iteration 2)

K>> net.vars(1, 1).value

ans =

     [160 160 3]

and so on. So how does the DagNN handle such inputs and evaluate itself? (I am new to MatConvNet and couldn't find any help in the documentation, so please answer this question and suggest how to build the same thing in Keras.)

Ujjal Kumar Das

1 Answer

0

In general, a ConvNet does not care about the input size of an image. All the layers perform convolution-like operations (e.g., even the pooling layers behave like convolutions spatially). If you provide a larger input, you get a larger output. The only thing that cares about the input size is the loss layer; if you don't have a loss layer, the code won't break at all. There is no fully connected layer in this MatConvNet network, everything is convolutional.
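To see why the same weights work on any input size, here is a minimal NumPy sketch (not MatConvNet itself, just an illustration): one fixed 3x3 kernel applied as a 'valid' convolution to two images of different sizes. The output simply shrinks by kernel size minus one, so nothing about the weights depends on the input dimensions.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Plain 'valid' 2-D convolution: output is (H-kh+1) x (W-kw+1)."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.ones((3, 3))                        # one fixed set of weights
small = conv2d_valid(np.zeros((64, 64)), kernel)    # like iteration 1
large = conv2d_valid(np.zeros((160, 160)), kernel)  # like iteration 2
print(small.shape)   # (62, 62)
print(large.shape)   # (158, 158)
```

The same `kernel` slides over both inputs without modification, which is exactly why `net.eval` can accept a differently sized `img` on every loop iteration.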

BTW, that's why some people who worked on ConvNets early on think that FCN is a funny name, because there is really no difference between a fully connected layer and a convolutional layer.
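The equivalence can be checked numerically. The following hedged NumPy sketch (the shapes 4x4 and 10 are arbitrary choices for illustration) shows that a fully connected layer is just a convolution whose kernel covers the entire input, producing a 1x1 output per output unit:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4))        # a 4x4 feature map
W = rng.standard_normal((4 * 4, 10))   # fully connected: 16 inputs -> 10 outputs

# Fully connected view: flatten the input, then matrix-multiply.
fc_out = x.reshape(-1) @ W             # shape (10,)

# Convolutional view: ten 4x4 kernels, each spanning the whole input,
# each yielding a single scalar ('valid' convolution, kernel == input size).
kernels = W.T.reshape(10, 4, 4)
conv_out = np.array([np.sum(x * k) for k in kernels])

print(np.allclose(fc_out, conv_out))   # True
```

Because the two views compute identical numbers, a network written entirely with convolutions loses nothing relative to one with fully connected layers, and gains the ability to run on any input size.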

DataHungry
  • @UjjalKumarDas then it's a problem with Keras. In MatConvNet you don't need to stick with a fixed-size input – DataHungry Jun 12 '17 at 03:30
  • Thank you for answering. As far as I know, when I implemented a CNN in Keras (Python), I had to provide a fixed-size image. But here things are different. I request you to have a quick look at [this](http://ethereon.github.io/netscope/#/gist/8a0d5ef37da9dc4cd611d178404b3641). Here the network is defined to accept fixed-size data, but at run time it accepted images of different sizes, which confused me. _By the way, is there any way that I can implement the same in Python?_ Looking forward to your answer, and thanks once again. – Ujjal Kumar Das Jun 12 '17 at 03:40
  • Sorry, I forgot to attach the [link](https://github.com/peiyunh/tiny/blob/master/tiny_face_detector.m#L130) to the [network](http://ethereon.github.io/netscope/#/gist/8a0d5ef37da9dc4cd611d178404b3641) diagram. – Ujjal Kumar Das Jun 12 '17 at 03:46
  • @UjjalKumarDas I'm not sure that the network you provided cannot be run on different-sized inputs. I'm not an expert on Keras, but when you define a network in Keras, I don't think you need to define the input size, do you? – DataHungry Jun 12 '17 at 04:16
  • Yes, you are right. Before this, I never had a situation where I needed inputs of varying dimensions. After digging through the internet, I found that the input dimension can be made to change as needed. Now I am going to do this. Thanks a lot :) . – Ujjal Kumar Das Jun 12 '17 at 04:51
  • @UjjalKumarDas :) – DataHungry Jun 12 '17 at 06:11