
I'm trying to calculate the distance between two histograms in torch7, and I was thinking of using the earth mover's distance for this. I know it's not hard to do in Python using something like https://github.com/garydoranjr/pyemd; however, my data is in torch and I need to run this computation many times, so moving all the data between torch7 and Python is not an option.

So my question is: what is the fastest earth mover's distance calculator in torch7? I have searched but could not find a library for it, and I was hoping there is a better way to implement this than a line-by-line translation of Python code, especially since torch is often better at handling things on the GPU.

Edit: I have found this but am not sure how to use it.

I currently have the following code:

    function ColourCompareHistEMD (images)
        sumdistance=0
        k={}
        for i=1,#images do 
            k[i]=torch.bhistc(images[i],20,-100,100)
        end

        for i=1,#images do 
           for j=1,#images do 
                -- what to do here? 
           end
        end
    end


My current best guess is something like this:

    function ColourCompareHistEMD (images)
        sumdistance=0
        r={}
        for i=1,#images do 
            print(images[i])

            r[i]=torch.histc(images[i][1]:view(images[i][1]:nElement()),20,-100,100)
        end

        for i=1,#images do 
            for j=1,#images do 
                criterion = nn.EMDCriterion()
                criterion:forward(r[i],r[j])
                sumdistance=sumdistance+criterion.loss

            end
        end

        return sumdistance
    end

but that doesn't seem to work: criterion.loss isn't working, and it gives me the following error:

    /home/thijser/torch/install/bin/luajit: bad argument #2 to '?' (out of range at /home/thijser/torch/pkg/torch/generic/Tensor.c:704)
    stack traceback:
        [C]: at 0x7f2048fdc530
        [C]: in function '__newindex'
        /home/thijser/torch/install/share/lua/5.1/EMDCriterion.lua:52: in function 'preprocess'
        /home/thijser/torch/install/share/lua/5.1/EMDCriterion.lua:255: in function 'forward'
        imageSelector.lua:343: in function 'evalHueImages'
        imageSelector.lua:66: in function 'evaluate'
        imageSelector.lua:81: in function 'SelectTop'
        imageSelector.lua:151: in function 'evolve'
        imageSelector.lua:158: in function <imageSelector.lua:156>
        [C]: in function 'dofile'
        ...jser/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
        [C]: at 0x5641c3f40470

I am not sure how to use it so that the earth mover's distance between images i and j is calculated at the point marked by the comment.

Thijser
  • I hope this is somewhat clear, I have been trying to find a good way of doing this for a while now. – Thijser Jul 02 '17 at 09:26
  • Is there any particular reason for why you're only finding the histogram for the first row of every image with `images[i][1]:view(images[i][1]:nElement())`? – woodstockhausen Jul 07 '17 at 16:24
  • @DCSmith no reason for that other than histc throwing an error when given a higher-dimensional input, and wanting a bit more speed. – Thijser Jul 08 '17 at 08:08

1 Answer


It appears that EMDCriterion expects the input and target to be at least 2-dimensional. It also expects the points in your comparison to be laid out horizontally. Since the result of torch.histc is 1-dimensional, you can reshape it into a 2-dimensional row tensor like so:

    for i=1,#images do 
        print(images[i])
        local hist = torch.histc(images[i][1]:view(images[i][1]:nElement()),20,-100,100)
        r[i] = hist:reshape(1,hist:nElement())
    end

Additionally, I tried running the code

    criterion:forward(r[i],r[j])
    print(criterion.loss)

and the result was nil. Try this instead for accumulating the losses:

    local loss = criterion:forward(r[i],r[j])
    sumdistance = sumdistance + loss

Also, it'll be a bit more efficient if you create the criterion once, with `criterion = nn.EMDCriterion()`, outside of the nested for-loop rather than on every iteration.
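
Putting these suggestions together, the function from the question might look roughly like the sketch below. It assumes that the third-party criterion is available on the Lua path as `EMDCriterion.lua` (as the stack trace in the question suggests) and that loading it registers `nn.EMDCriterion`; the `require 'EMDCriterion'` line is a guess at how it is loaded in your setup, so adjust it as needed.

    require 'torch'
    require 'nn'
    -- Assumption: EMDCriterion.lua is on the Lua path and defines nn.EMDCriterion.
    require 'EMDCriterion'

    function ColourCompareHistEMD(images)
        local sumdistance = 0
        local r = {}
        for i = 1, #images do
            -- Histogram of the first slice only, as in the question.
            local first = images[i][1]
            local hist = torch.histc(first:view(first:nElement()), 20, -100, 100)
            -- Reshape to a 1 x 20 row tensor, since the criterion wants 2-D input.
            r[i] = hist:reshape(1, hist:nElement())
        end

        -- Create the criterion once instead of once per pair.
        local criterion = nn.EMDCriterion()
        for i = 1, #images do
            for j = 1, #images do
                -- Use the value returned by forward rather than criterion.loss.
                sumdistance = sumdistance + criterion:forward(r[i], r[j])
            end
        end
        return sumdistance
    end

Note that, like the original loops, this compares every image with itself and visits each pair twice (once as i,j and once as j,i).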

woodstockhausen