
I am doing research on people detection using HOG and LBP, and I would like to detect people at multiple sizes in an image. I loop over the scale of the detection window size and then run a sliding-window detection to match features on the image. However, my code throws an error because of mismatched matrix dimensions. Here is my code:

win_size = [32, 32];  %the window size of detection
counter  = 1;         %index into the feature / bounding-box cell arrays

%loop on scale of window size
for s=0.8:0.2:1

    X=win_size(1)*s;
    Y=win_size(2)*s;

    %loop on column of image
    for y = 1:X/4:lastRightCol-Y

        %loop on row of image
        for x   = 1:Y/4:lastRightRow-X

            p1  = [x,y];
            p2  = [x+(X-1), y+(Y-1)];
            po  = [p1; p2] ;


            % CROPPED IMAGE
            crop_px    = [po(1,1) po(2,1)];
            crop_py    = [po(1,2) po(2,2)];

            topLeftRow = ceil(min(crop_px));
            topLeftCol = ceil(min(crop_py));

            bottomRightRow = ceil(max(crop_px));
            bottomRightCol = ceil(max(crop_py));

            cropedImage    = im(topLeftCol:bottomRightCol,topLeftRow:bottomRightRow,:);

            %Get the feature vector from croped image
            HOGfeatureVector{counter}= getHOG(double(cropedImage));
            LBPfeatureVector{counter}= getLBP(cropedImage);
            LBPfeatureVector{counter}= LBPfeatureVector{counter}';
            boxPoint{counter} = [x,y,X,Y];
            counter = counter+1;

        end
    end
end

I noticed the problem is with HOGfeatureVector{counter}: since I am using different window sizes, the features I get from HOG also have different dimensions. For example, with the original window size of 32x32 the HOG features have dimension <6256x324>, but when I scale the window size, for example with 0.8:0.2:1, the scale of 0.8 gives <6256x144> while the scale of 1 (32x32) gives <6256x324>. It is impossible to combine these two matrices of different dimensions by simple concatenation.
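For illustration, a simple concatenation of placeholder matrices with the sizes quoted above fails:

A = zeros(6256, 324);   % HOG features from the 32x32 window
B = zeros(6256, 144);   % HOG features from the 0.8-scaled window
C = [A; B];             % fails: the column counts (324 vs 144) do not match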

Does anyone have an idea how to solve my problem? Or at least, how to combine two matrices with different dimensions?

Thank you

Indrasyach
  • What are `Y` and `X`? Are those the window's dimensions? – ifryed Jul 03 '14 at 07:04
  • `s` is the scale, and yes, `X` and `Y` are the window's dimensions after multiplying the original window size by `s`. As stated above, `s = 0.8:0.2:1`. So when `s` is `0.8`, the window's dimension `Y` equals `0.8*32`, and the same goes for `X`. @ifryed – Indrasyach Jul 03 '14 at 09:24

1 Answer


You need to keep the detection window the same size; the HOG detector is trained to find the object at 32x32. If you want to find the object at multiple scales, you need to re-scale the image, not the detection window.

Change these lines:

X=win_size(1)*s;
Y=win_size(2)*s;

To this:

X=win_size(1);
Y=win_size(2);

And it should work.
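To make the multi-scale part concrete, here is a minimal, untested sketch of that idea: keep the window fixed at 32x32 and scan resized copies of the image (an image pyramid). It assumes im, getHOG and getLBP from the question, and imresize from the Image Processing Toolbox; the scale values are only examples:

win     = 32;                           % fixed detector size
counter = 1;
for s = [1, 32/24]                      % example scales: find 32x32 and 24x24 heads
    imS = imresize(im, s);              % re-scale the IMAGE, not the window
    [nRows, nCols, ~] = size(imS);
    for r = 1:win/4:nRows-win+1         % slide the fixed-size window
        for c = 1:win/4:nCols-win+1
            crop = imS(r:r+win-1, c:c+win-1, :);
            HOGfeatureVector{counter} = getHOG(double(crop));  % same length at every scale
            LBPfeatureVector{counter} = getLBP(crop)';
            boxPoint{counter} = [c, r, win, win] / s;          % map box back to the original image
            counter = counter + 1;
        end
    end
end

Because the window size never changes, every feature vector has the same length and they can be concatenated. At s = 32/24 ≈ 1.33 the image is enlarged, so a 32x32 window in imS covers a 24x24 region of the original image; dividing the box by s maps it back to original-image coordinates.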

ifryed
  • thank you @ifryed, it actually worked for the detection window. But my problem is that I want to create multiple detection windows. For example, people near the camera have a head size of `<32x32>` pixels, while people far from the camera have a head size of `<24x24>` pixels. I want to make a system that can detect these different sizes. Do you have any idea how to write code for it? thx @ifryed. – Indrasyach Jul 04 '14 at 07:19
  • I added the code. The idea is to scan for the head with the same detector (32x32) but on different scales of the image: if the face is far away (appears small in the picture), re-scaling the image and scanning it again should find it. That is how you get `multi-scale` detection. – ifryed Jul 04 '14 at 13:00
  • You mean that I can't do multi-scale detection in one image and one pass? Since you said I have to re-scale the image and then re-scan it, that means after scanning for the original size `32x32`, I have to re-scale the image for `24x24` and scan it again. Is that correct? Or maybe I don't get what you said. @ifryed thx, – Indrasyach Jul 05 '14 at 06:22
  • You have the detector at size 32x32. If you run it on the original image, you'll find all the heads that are of size 32x32; but if you re-size the image to 2n x 2m (where n is the height and m is the width of the original image) and scan it again, you'll find all the heads that were of size 16x16 in the original image. So if you want to find the heads that are of size 24x24, re-scale the image by 32/24 = 1.33. – ifryed Jul 05 '14 at 20:35
  • You mean that I need to resize the image and there is no need to re-scale the detector size, am I right? @ifryed – Indrasyach Jul 06 '14 at 23:33