I have a piece of MATLAB code that takes a 91x91 patch of pixels from an image and applies HOG to extract a feature vector. I would like to rewrite the function in Python. I have been struggling for a while to get the same HOG return values in Python as in MATLAB, but have failed to do so. I would really appreciate any help.
The MATLAB code uses the VLFeat library (http://www.vlfeat.org/overview/hog.html), and I am using scikit-image in Python (http://scikit-image.org/docs/dev/api/skimage.feature.html?highlight=peak_local_max#skimage.feature.hog).
In MATLAB, the input 'im2single(patch)' is a 91*91 array, and the value returned by HOG is a 4*4*16 single array. HOG is applied with a cell size of 23 and 4 orientations.
hog = vl_hog(im2single(patch),23, 'variant', 'dalaltriggs', 'numOrientations',4) ;
The returned data is 4*4*16 single, which can be displayed in the form of:
val(:,:,1) =
0 0 0 0
0 0 0 0
0 0.2000 0.2000 0.0083
0 0.2000 0.2000 0.0317
....
val(:,:,16) =
0 0 0 0
0 0 0 0
0 0 0.0526 0.0142
0 0 0.2000 0.2000
The result is then flattened manually into a 256*1 feature vector. To sum up, a 256*1 feature vector is extracted from each 91*91 patch of pixels. Now I want to get the same result in Python.
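To make the 256 explicit, here is a minimal sketch of the size arithmetic on the MATLAB side. It assumes, from my reading of the VLFeat documentation, that the 'dalaltriggs' variant stores 4 * numOrientations values per cell. The function name is made up for illustration and is not part of VLFeat.

```python
def vl_hog_dalaltriggs_length(cells_y, cells_x, num_orientations):
    """Length of the flattened HOG array for a cells_y x cells_x grid.

    Assumption (not verified against the VLFeat source): the
    'dalaltriggs' variant keeps 4 * num_orientations values per cell.
    """
    dims_per_cell = 4 * num_orientations
    return cells_y * cells_x * dims_per_cell


# 4 x 4 cells with 4 orientations, matching the 4*4*16 array above
print(vl_hog_dalaltriggs_length(4, 4, 4))  # 256
```

If that assumption holds, the third dimension (16) is not the number of orientations alone, which may be part of why the scikit-image output is shorter.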
In my Python code, I tried to apply HOG with the same cell size and number of orientations. The block size is set to (1,1):
tc = hog(repatch, orientations=4, pixels_per_cell=(23,23), cells_per_block= (1,1), visualise=False, normalise=False)
I padded the patch to 92*92 so that its size is an integer multiple of the cell size (4 * 23 = 92). The padded input array is called 'repatch'. However, the output 'tc' is a 64*1 array (the gradient histograms are flattened into the feature vector):
tc.shape
(64,)
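On the padding: as far as I can tell from the source, scikit-image counts only whole cells along each axis, so a 91-pixel side would give 3 cells rather than 4. A small sketch of that count (the helper name is hypothetical, not part of scikit-image):

```python
def whole_cells(size_in_pixels, pixels_per_cell):
    """Number of complete cells that fit along one image axis."""
    # Floor division: a partial cell at the border is dropped.
    return size_in_pixels // pixels_per_cell


print(whole_cells(91, 23))  # 3: the original patch is one pixel short of 4 cells
print(whole_cells(92, 23))  # 4: the padded patch
```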
Then I looked into the scikit-image source code:
orientation_histogram = np.zeros((n_cellsy, n_cellsx, orientations))
orientation_histogram.shape
(4, 4, 4)
Here n_cellsx is the number of cells in x and n_cellsy is the number of cells in y. It seems that the shape of the HOG output is closely tied to the dimensions of orientation_histogram.
The actual dimensions of the value returned by HOG are determined by:
normalised_blocks = np.zeros((n_blocksy, n_blocksx,by, bx, orientations))
where n_blocksx and n_blocksy are calculated by:
n_blocksx = (n_cellsx - bx) + 1
n_blocksy = (n_cellsy - by) + 1
n_cellsx is the number of cells in x, which is 4 here, and so is n_cellsy; (by, bx) is cells_per_block, which is (1,1); orientations is 4 in this case.
So the size of the returned value (normalised_blocks) appears to be n_blocksy * n_blocksx * by * bx * orientations = 4 * 4 * 1 * 1 * 4 = 64.
I have tried changing the block size but still cannot get what I expected (with a block size of (2,2), the returned value is a 144*1 array).
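The arithmetic above can be sketched in plain Python. This only mirrors the formulas quoted from the source and does not call scikit-image; the function name is my own.

```python
def skimage_hog_length(n_cellsy, n_cellsx, cells_per_block, orientations):
    """Length of the flattened normalised_blocks array, per the formulas above."""
    by, bx = cells_per_block
    n_blocksy = (n_cellsy - by) + 1
    n_blocksx = (n_cellsx - bx) + 1
    return n_blocksy * n_blocksx * by * bx * orientations


# Try every square block size that fits in a 4 x 4 grid of cells.
for b in range(1, 5):
    print((b, b), skimage_hog_length(4, 4, (b, b), 4))
# (1, 1) 64
# (2, 2) 144
# (3, 3) 144
# (4, 4) 64
```

With 4 x 4 cells and 4 orientations, none of the square block sizes produces 256 values, so changing cells_per_block alone does not seem to be enough to match the MATLAB output length.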
Can anyone please help? How can I get the same HOG output in Python as in MATLAB? Many thanks.