
I'm trying to build a simple leaf recognition app with Android and OpenCV; my database consists of just 3 entries (3 pictures of 3 types of leaves), and I would like to be able to recognise whether one of the pictures in the database appears inside another picture captured by the smartphone. I'm using the SURF method to extract keypoints from the database images and then compare them with the keypoints extracted from the captured image, looking for a match. My problem is that the result looks more like "colour matching" than "feature matching": when I compare a picture from the database with the captured one, the number of matches is about the same for all 3 entries, and thus I get a wrong match.

This is one of the pictures from the database (note that it has no background):

picture from database

And this is the result that I get:

Screenshot with matches

The image on top is the one captured with the smartphone, and the image below is the result with the matches highlighted.

Here is the code that I implemented:

Mat orig = Highgui.imread(photoPathwithoutFile);
Mat origBW = new Mat();
Imgproc.cvtColor(orig, origBW, Imgproc.COLOR_RGB2GRAY);
MatOfKeyPoint kpOrigin = createSURFdetector(origBW);
Mat descOrig = extractDescription(kpOrigin, origBW);
Leaf result = findMatches(descOrig);
Mat imageOut = orig.clone();
Features2d.drawMatches(orig, kpOrigin, maple, keypointsMaple, resultMaple, imageOut);


public MatOfKeyPoint createSURFdetector (Mat origBW) {
    FeatureDetector surf = FeatureDetector.create(FeatureDetector.FAST);

    MatOfKeyPoint keypointsOrig = new MatOfKeyPoint();

    surf.detect(origBW, keypointsOrig);

    return keypointsOrig;
}

public Mat extractDescription (MatOfKeyPoint kpOrig, Mat origBW) {
    DescriptorExtractor surfExtractor = DescriptorExtractor.create(FeatureDetector.SURF);

    Mat origDesc = new Mat();

    surfExtractor.compute(origBW, kpOrig, origDesc);

    return origDesc;
}

public Leaf findMatches (Mat descriptors) {
    DescriptorMatcher m = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE);
    MatOfDMatch max = new MatOfDMatch();
    resultMaple = new MatOfDMatch();
    resultChestnut = new MatOfDMatch();
    resultSwedish = new MatOfDMatch();
    Leaf match = null;

    m.match(descriptors, mapleDescriptors, resultMaple);
    Log.d("Origin", resultMaple.toList().size()+" matches with Maples");
    if (resultMaple.toList().size() > max.toList().size()) { max = resultMaple; match = Leaf.MAPLE; }
    m.match(descriptors, chestnutDescriptors, resultChestnut);
    Log.d("Origin", resultChestnut.toList().size()+" matches with Chestnut");
    if (resultChestnut.toList().size() > max.toList().size()) { max = resultChestnut; match = Leaf.CHESTNUT; }
    m.match(descriptors, swedishDescriptors, resultSwedish);
    Log.d("Origin", resultSwedish.toList().size()+" matches with Swedish");
    if (resultSwedish.toList().size() > max.toList().size()) { max = resultSwedish; match = Leaf.SWEDISH; }

    //return the match object with more matches
    return match;
}

How can I get more accurate matching, based not on colours but on the actual distinctive features of the picture?

Nicholas Allio

1 Answer


Well, SURF is not the best candidate for this task. A SURF descriptor basically encodes some gradient statistics in a small neighborhood of a corner. This gives you invariance to a lot of transformations, but you lose the 'big picture' in the process. This descriptor is used to narrow down the range of correspondences between points to be matched, and then some geometric constraints come into play.

In your case it seems that the descriptors are not doing a great job of matching points, and since there are a LOT of them, each point eventually gets a match (although it is strange that geometric testing didn't prevent that).

I can advise you to try a different approach to matching, maybe HOG with descriptors trained to detect leaf types, or even something contour-based, since shape is what really differs between your images. For example, you can detect the leaf's outline, normalize its length, find its center and then, at equal intervals, calculate the distance from each point to the center - that will be your descriptor. Then find the largest distance, circularly shift the descriptor to start at that extremum and divide by this value - that will give you some basic invariance to the choice of contour starting point, rotation and scale. But it will most likely fail under perspective and affine transformations.
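As an untested sketch of that contour signature in Python (the Otsu thresholding and the name contour_signature are just placeholders; it assumes the leaf segments cleanly from the background):

import cv2
import numpy as np

def contour_signature(gray, num_samples=64):
    # assumes the leaf is darker than the background; adjust the thresholding to your images
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    # OpenCV 2.4/4.x signature; OpenCV 3.x returns an extra value here
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    outline = max(contours, key=cv2.contourArea).reshape(-1, 2).astype(np.float32)

    # sample the outline at (roughly) equal intervals so every leaf yields a descriptor of the same length
    idx = np.linspace(0, len(outline) - 1, num_samples).astype(int)
    outline = outline[idx]

    # distance from each sampled point to the contour center
    center = outline.mean(axis=0)
    dist = np.sqrt(((outline - center) ** 2).sum(axis=1))

    # circularly shift so the descriptor starts at the largest distance and divide by it:
    # basic invariance to starting point, rotation and scale
    start = int(np.argmax(dist))
    dist = np.roll(dist, -start)
    return dist / dist[0]

Two such signatures could then be compared with something like np.linalg.norm(sig1 - sig2), smallest distance wins.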

If you would like to experiment further with feature points - try to detect fewer of them, but more representative ones (filter by gradient strength, corner score or something like that). Maybe use SIFT instead of SURF - it should be a bit more precise. Check the number of inliers after matching - the best match should have a higher ratio.
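If you go that route, a minimal sketch of keeping only the strongest detections could look like this (untested; cv2.SIFT() is the old OpenCV 2.4 API, and n_best is just a value to tune):

import cv2

def strongest_keypoints(gray, n_best=200):
    # detect with SIFT and keep only the keypoints with the highest response
    sift = cv2.SIFT()  # cv2.xfeatures2d.SIFT_create() in newer OpenCV builds
    kp = sift.detect(gray, None)
    kp = sorted(kp, key=lambda k: k.response, reverse=True)[:n_best]
    kp, desc = sift.compute(gray, kp)
    return kp, desc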

But honestly, this seems more like a machine learning task than computer vision.

Edit: I have checked your code and found out that you are not performing geometric checks on the matches, which is why you are getting an incorrect match. Try performing findHomography after matching and then consider only the points that have been set to one in the mask output argument. This will make you consider only points that can be warped to each other by a homography, and it may improve matching a lot.

Edit2: added a code snippet (sorry, but I can't test Java at the moment, so it's in Python)

import cv2
import numpy as np

# read input
a = cv2.imread(r'C:\Temp\leaf1.jpg')
b = cv2.imread(r'C:\Temp\leaf2.jpg')

# convert to gray
agray = cv2.cvtColor(a, cv2.COLOR_BGR2GRAY)
bgray = cv2.cvtColor(b, cv2.COLOR_BGR2GRAY)

# detect features and compute descriptors
surf = cv2.SURF() # better use SIFT instead
kp1, d1 = surf.detectAndCompute(agray,None)
kp2, d2 = surf.detectAndCompute(bgray,None)
print 'numFeatures1 =', len(kp1)
print 'numFeatures2 =', len(kp2)

# use KNN matcher
bf = cv2.BFMatcher()
matches = bf.knnMatch(d1,d2, k=2)

# Apply Lowe ratio test
good = []
for m,n in matches:
    if m.distance < 0.75*n.distance:
        good.append(m)

print 'numMatches =', len(matches)
print 'numGoodMatches =', len(good)

# if we have enough matches - try to calculate a homography to discard matches
# that don't fit perspective transformation model
if len(good)>10:
    # convert matches into correct format (python-specific)
    src_pts = np.float32([ kp1[m.queryIdx].pt for m in good ]).reshape(-1,1,2)
    dst_pts = np.float32([ kp2[m.trainIdx].pt for m in good ]).reshape(-1,1,2)

    M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC,5.0)
    print 'numMatches =', sum(mask.ravel().tolist()) # calc number of 1s in mask

else:
    print "not enough good matches are found"

It gives me the following output for different leaves using SURF:

numFeatures1 = 685
numFeatures2 = 1566
numMatches = 685
numGoodMatches = 52
numMatches = 11

You can see that the number of 'real' matches is very small. But unfortunately numMatches is similar when we match different images of the same leaf type. Maybe you can improve the result by tweaking parameters, but I think using keypoints here is just not a very good approach. Maybe it is due to the leaf variation even within the same class.

alexisrozhkov
  • Interesting! Could you please attach a snippet of code for doing that? Because I didn't get exactly what you mean: I tried to perform the `findHomography` after matching but I think I'm doing something wrong... Thanks – Nicholas Allio Dec 15 '15 at 09:18
  • Edited answer to include code snippet. I have used your leaf images as input, scaled to be 512 pixels in height. – alexisrozhkov Dec 15 '15 at 10:12