
I've written a classifier and now want to apply it to detect objects in images. The classifier already has models for the HoG features of some objects I'm interested in. I've got sliding windows working, sliding across the image at multiple scales as well, and I can compute the HoG features for each window. My question is: what's the next step?

Is it really as simple as matching the model's HoG features against the features from the window? I understand that with integral images there's a threshold value for each class (such as face or not-face): if the value computed from the window-generated image is close enough to the class's values and stays within the threshold, then we say we've got a match.

But how does it work with HoG features?

user961627

1 Answer


Yes, it is as simple as that. Once you have your HOG model and your windows, you only need to feed each window's features to the model, and then select the best result (with or without a threshold, depending on your application).
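Since you mention working in Python, here is a minimal sketch of that selection step. Everything here is a placeholder assumption: `model` stands for any scikit-learn-style classifier exposing a `decision_function`, and `threshold` is optional.

```python
# Hypothetical sketch: score each window's HOG descriptor and keep the
# best-scoring window. `model` is assumed to expose a scikit-learn-style
# decision_function; swap in whatever scoring call your classifier has.
import numpy as np

def best_window(window_features, boxes, model, threshold=None):
    """window_features: (n_windows, n_features) array-like;
    boxes: list of (x, y) top-left corners, one per window."""
    scores = model.decision_function(np.asarray(window_features))
    idx = int(np.argmax(scores))
    if threshold is not None and scores[idx] < threshold:
        return None  # no window is confident enough
    return boxes[idx], float(scores[idx])
```

Whether you apply a threshold depends on the task: for "find the single best face in this image" you can just take the argmax; for "is there a face at all" you need the threshold to be able to answer "no".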

Here is some sample code that performs these steps. The key part is the following:

function detect(im,model,wSize)
   %{
   this function takes three parameters
    1.  im      --> test image
    2.  model   --> trained model
    3.  wSize   --> size of the window, e.g. [24,32]
   and draws a rectangle on the best estimated window
   %}

topLeftRow = 1;
topLeftCol = 1;
% size(im) returns [rows, cols, channels]
[bottomRightRow, bottomRightCol, d] = size(im);

fcount = 1;

% these nested loops scan the entire image and extract features for each sliding window
for y = topLeftCol:bottomRightCol-wSize(2)
    for x = topLeftRow:bottomRightRow-wSize(1)
        p1 = [x,y];
        p2 = [x+(wSize(1)-1), y+(wSize(2)-1)];
        po = [p1; p2];
        img = imcut(po,im);
        featureVector{fcount} = HOG(double(img));
        boxPoint{fcount} = [x,y];
        fcount = fcount+1;
    end
end

label = ones(length(featureVector),1);
P = cell2mat(featureVector);
% each row of P' corresponds to one window
[~, predictions] = svmclassify(P',label,model); % classify each window

[a, indx] = max(predictions);   % window with the highest score
bBox = cell2mat(boxPoint(indx));
rectangle('Position',[bBox(1),bBox(2),24,32],'LineWidth',1, 'EdgeColor','r');
end

For each window, the code extracts the HOG descriptor and stores it in featureVector. Then, using svmclassify, it scores every window and selects the one where the object is most likely present.
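The same scan can be sketched in Python. This is a rough, hedged equivalent of the loop above: `hog_fn` stands in for whatever HOG extractor you use (e.g. `skimage.feature.hog`), and `classify_fn` for your trained classifier's scoring function; both names are placeholders, not a specific library API.

```python
# Rough Python equivalent of the MATLAB sliding-window loop above.
# hog_fn and classify_fn are assumed callables you plug in yourself.
import numpy as np

def detect(im, hog_fn, classify_fn, w_size):
    """Slide a w_size=(height, width) window over im, score each window,
    and return ((row, col) of the best window's top-left corner, score)."""
    h, w = im.shape[:2]
    wh, ww = w_size
    best_score, best_box = -np.inf, None
    for row in range(h - wh + 1):
        for col in range(w - ww + 1):
            patch = im[row:row + wh, col:col + ww]
            score = classify_fn(hog_fn(patch))
            if score > best_score:
                best_score, best_box = score, (row, col)
    return best_box, best_score
```

In practice you would also step the loop by a stride larger than 1 and repeat the scan over an image pyramid for multiple scales, exactly as in your sliding-window setup.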

phyrox
  • Thanks, but what I'm concerned about is that sometimes the window-generated image is pretty big. Much bigger than the size of images used to train the classifier model. So taking the HoG features from the window-generated image generates a much larger vector than the model's feature vector, so it seems like they can't be directly compared. So what should be done? Should I scale down the window-generated patch to the size of the model's training images' size, and then do the classification? – user961627 May 22 '14 at 15:43
  • PS: I'm working in Python, not MATLAB, but I'm pretty sure the concept is the same. Also, my classifier isn't an SVM. – user961627 May 22 '14 at 20:15
  • @user961627 sorry about the confusion matlab-python :) Regarding your problem of size: are you specifying the size (in pixels) of the cells or the number of cells per image? If you specify that one image has 4x4 cells, then all the descriptors will have the same size (independently of the image size). Instead, if you specify each cell to be 8x8 pixels, then you will have a problem. – phyrox May 23 '14 at 08:09
  • Just added a new question here to clarify: http://stackoverflow.com/questions/23824147/choosing-normalizing-hog-parameters-for-object-detection I'm using 16x16 pixels per cell... My assumption was that the more pixels per cell I use, the more accurate/detailed my model and the better my classification. Is that true? Or is it better to use 4x4? – user961627 May 23 '14 at 08:14
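The point raised in the comments above can be illustrated with a toy sketch. This is not a real HOG implementation, just a pure-numpy demonstration of why fixing the number of cells per patch (rather than the pixels per cell) keeps the descriptor length constant for windows of different sizes:

```python
# Illustrative only (not HOG): splitting any patch into a fixed
# n_cells x n_cells grid gives a descriptor of length n_cells**2
# regardless of the patch's pixel dimensions, so window descriptors
# stay comparable to the model's training descriptors.
import numpy as np

def grid_descriptor(patch, n_cells=4):
    rows = np.array_split(patch, n_cells, axis=0)
    cells = [c for r in rows for c in np.array_split(r, n_cells, axis=1)]
    return np.array([c.mean() for c in cells])  # always length n_cells**2
```

The alternative, used by many pipelines, is to resize every window to the model's training patch size before extracting HOG, which achieves the same fixed-length property.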