
I was hoping I could get some guidance from the Stack Overflow community regarding a dilemma I have run into in my senior project. First off, I want to state that I am a novice programmer, and I'm sure some of you will quickly tell me this project was over my head. I've become well aware that this is probably true.

Now that that's out of the way, let me give some definitions:

Project Goal: The goal of the project, like that of many others described in various SO questions (many of which have been very helpful to me in the course of this effort), is to detect whether a parking space is full or available, and eventually to report this back to the user (ideally via an iPhone, Android, or other mobile app for ease of use -- this aspect was quickly deemed outside the scope of my efforts due to time constraints).

Tools in Use: I have made heavy use of the AForge.NET library, which has provided all of the building blocks for the project: capturing video from an IP camera, applying filters to images, and ultimately performing the detection itself. As you can infer, I have therefore chosen to program in C#, mainly for its ease of use for a beginner. Other options included MATLAB/C++, C++ with OpenCV, and other alternatives.

The Problem

Here is where I have run into issues. Linked below is an image that has been pre-processed in the AForge Image Processing Lab. The sequence of filters and processes used was: grayscale, histogram equalization, Sobel edge detection, and finally Otsu thresholding (though I'm not convinced the final step is needed).

https://i.stack.imgur.com/u6eqk.jpg
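For reference, here is that filter chain as I run it in code -- a minimal sketch assuming AForge.NET 2.x; the 24bpp source format and the BT709 grayscale coefficients are my own choices, not requirements:

```csharp
using System.Drawing;
using AForge.Imaging.Filters;

// 'frame' is a 24bpp RGB Bitmap grabbed from the IP camera.
Bitmap ProcessFrame(Bitmap frame)
{
    // 1. Grayscale (BT709 coefficients; any of the stock algorithms would do)
    Bitmap gray = Grayscale.CommonAlgorithms.BT709.Apply(frame);

    // 2. Histogram equalization to normalize lighting
    new HistogramEqualization().ApplyInPlace(gray);

    // 3. Sobel edge detection
    new SobelEdgeDetector().ApplyInPlace(gray);

    // 4. Otsu thresholding -> binary edge map (the step I'm unsure about)
    new OtsuThreshold().ApplyInPlace(gray);

    return gray;
}
```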

As you can tell from the image with the naked eye, there are sequences of detected edges that clearly correspond to parked cars in the spaces I am monitoring with the camera. These cars are clearly defined by the pattern of brightened wheels, the "double railroad track" pattern that essentially represents the outer edging of the side windows, and even, in this instance, the outline of the license plate. In a continuation of the project, the camera chosen would be a PTZ so as to cover as much of the block as possible, so I'd like to focus on the side features of the car (eliminating factors such as the license plate). A feature such as a rectangle for a sunroof might also be considered, but obviously that is not a universal feature of cars, whereas the general window outline is.

We can all see that there are differences in these patterns, varying of course with car make and model. Generally, though, this sequence not only retrieves the desired features but also eliminates the road from view. That matters because I intend to use road color as a "first litmus test" for detecting an empty space: if I detect a gray level consistent with my data for the road, especially when no edges are detected in a region, I feel I can safely assume the space is empty. A rough sketch of that check is below.
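Here is how I picture that road-gray check, a minimal sketch using AForge's ImageStatistics; the road gray and tolerance values are made-up placeholders that would come from calibration data, not measured numbers:

```csharp
using System;
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// 'grayFrame' is the grayscale frame; 'space' is one parking-space rectangle.
bool LooksLikeEmptyRoad(Bitmap grayFrame, Rectangle space)
{
    const double roadMeanGray = 115.0;  // hypothetical calibration value
    const double roadTolerance = 12.0;  // hypothetical tolerance

    Bitmap region = new Crop(space).Apply(grayFrame);
    ImageStatistics stats = new ImageStatistics(region);

    // Tentatively call the space empty if its average intensity is close
    // to the calibrated road gray (pending the edge check).
    return Math.Abs(stats.Gray.Mean - roadMeanGray) <= roadTolerance;
}
```

My question is this, and hopefully it is generic enough to be practically beneficial to others out there on the site: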

Focused Question:
Is there a way to take an image segment (via cropping) and then compare the detected edge sequence with future new frames from the camera? More specifically, is there a way to do this while allowing some leeway -- essentially a tolerance threshold for minor differences in the edges?

Personal Thoughts/Brainstorming on The Question:
-- I'm sure there's a way to compare literally pixel-by-pixel: crop to just the rectangle around your edges and then slide the cropped image through the new processed frame, comparing pixel-by-pixel at each position. But that wouldn't help much unless you had an exact match to your detected edges.
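For what it's worth, AForge does ship a brute-force version of exactly this sliding comparison, ExhaustiveTemplateMatching, and its similarity threshold provides some of the leeway I'm asking about. A minimal sketch (the 0.85 tolerance is just a guess on my part, not a tested value):

```csharp
using System.Drawing;
using AForge.Imaging;

// 'template' is the cropped edge image of a known car;
// 'frame' is a newly processed frame (both 8bpp grayscale).
bool ContainsSimilarEdges(Bitmap frame, Bitmap template)
{
    // Accept matches that are at least 85% similar (placeholder tolerance).
    var matcher = new ExhaustiveTemplateMatching(0.85f);
    TemplateMatch[] matches = matcher.ProcessImage(frame, template);
    return matches.Length > 0;
}
```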

All help is appreciated, and I'm more than happy to clarify as needed.

Mike D.
  • I'm sorry, but I can't really understand the question due to far too much information (and I can't tell which is relevant and which is not). Maybe I'm slow, but I think you should focus your question. – Neowizard Apr 25 '11 at 08:58
  • Tried to provide a little bit of clarification. I tend towards the wordy side as you can see. – Mike D. Apr 25 '11 at 10:56

2 Answers


Let me give it a shot.

You have two images. Let's call them BeforePic and AfterPic. For each of these two pictures you have an ROI (region of interest) -- AKA a cropped segment.

You want to see if AfterPic.ROI is very different from BeforePic.ROI. By "very different" I mean that the difference is greater than some threshold.

If this is indeed your problem, then it should be split into three parts:

  1. Get BeforePic and AfterPic (and the ROI for each).
  2. Translate the abstract concept of picture/edge difference into a numerical one.
  3. Compare the difference to some threshold.

The first part isn't really a part of your question, so I'll ignore it. The last part basically comes down to finding the right threshold -- again, out of the scope of the question. The second part is what I think is the heart of the question (I hope I'm not completely off here). For this I would use the Shape Context algorithm (in the PDF, it'll be best for you to implement it up to section 3.3, as it gets more elaborate than you need from section 3.4 on).

Shape Context is an image matching algorithm that works on image edges, with great success rates. Implementing it was my finals project, and it seems like a perfect match (no pun intended) for you. If your edges are good and your ROI is accurate, it won't fail you.

It may take some time to implement, but if done correctly it will work perfectly for you. Bear in mind that a poor implementation might run slowly; I've seen a worst case of 5 seconds per image. A good (yet not perfect) implementation, on the other hand, will take less than 0.1 seconds per image.
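If you want a quick numeric baseline for part 2 in the meantime, you could simply difference the two binary edge maps and count the disagreeing pixels. This is only a crude placeholder, not Shape Context itself; it assumes both ROIs are the same size and uses AForge's Difference filter, since that's the library you're already on:

```csharp
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// Both inputs are same-size binary edge maps (8bpp, pixels either 0 or 255).
int EdgeDifferenceScore(Bitmap beforeRoi, Bitmap afterRoi)
{
    // Absolute per-pixel difference: white wherever the two edge maps disagree.
    Bitmap diff = new Difference(beforeRoi).Apply(afterRoi);

    // Count the disagreeing (white) pixels via the gray histogram.
    ImageStatistics stats = new ImageStatistics(diff);
    return stats.Gray.Values[255];
}
```

The returned count is then what you'd compare against the threshold from part 3.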

Hope this helps, and good luck!

Edit: I found an implementation of ShapeContext in C# @ CodeProject, if it's of any interest

Neowizard
  • @Neowizard since we're on the topic of image similarity, would an eigenvalue-based approach be faster, since OpenCV provides straightforward implementations? A first thought would be to get the eigenvalues of both images and then find the correlation between them. Identical images would mean a correlation coefficient close to unity. An experimental threshold could be defined whereby dissimilar images would be rejected. – AruniRC Apr 27 '11 at 04:23
  • [link to related article](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.5822&rep=rep1&type=pdf) – AruniRC Apr 27 '11 at 04:37
  • This is the first time I've heard about this eigenvalue-based approach to object recognition, and after skimming the linked article I'm still not sure it would give good results for the original problem; defect detection is a different problem in CV. On the other hand, this is implemented (and probably optimized) in OpenCV, and that gives it a whole lot of credit. To me it seems like a solid approach that deserves experimenting with. If it produces good results, there's no reason to try another. – Neowizard Apr 27 '11 at 09:05
  • 1
    whichever works. actually now i think this method may not be that suitable since it's not specifically designed for use on edgemaps. Was working with eigenvalues so it popped up at first. – AruniRC Apr 27 '11 at 17:36
  • Thanks for the response and ideas! Both you and Dan Bryant came up with some great points and ideas, but perhaps more importantly some thought processes, which is really what I was looking for. A big challenge I've had with the project as a novice programmer is getting to that "thinking like a computer scientist" mentality, if you will, in terms of problem-solving strategies. Indeed, comparison of AfterPic.ROI vs. BeforePic.ROI is exactly the approach I ended up using, and I was able to achieve quite a decent success rate with minimal false positives. – Mike D. Apr 28 '11 at 08:05
  • @AruniRC Eigenvalues were definitely a topic/strategy I came across in my research, but it seemed a bit more high-level than I had the time or knowledge to implement effectively. Most of my effort was built around my knowledge of the AForge C# library rather than OpenCV/EmguCV, and such a feature is not implemented there. However, as OpenCV tends to be the standard for CV programming, it is, as Neowizard said, definitely worth further experimenting and research for its potential to assist in more robust solutions. – Mike D. Apr 28 '11 at 08:09

I take on a fair number of machine vision problems in my work and the most important thing I can tell you is that simpler is better. The more complex the approach, the more likely it is for unanticipated boundary cases to create failures. In industry, we usually address this by simplifying conditions as much as possible, imposing rigid constraints that limit the number of things we need to consider. Granted, a student project is different than an industry project, as you need to demonstrate an understanding of particular techniques, which may well be more important than whether it is a robust solution to the problem you've chosen to take on.

A few things to consider:

  1. Are there pre-defined parking spaces on the street? Do you have the option to manually pre-define the parking regions that will be observed by the camera? This can greatly simplify the problem.

  2. Are you allowed to provide incorrect results when cars are parked illegally (taking up more than one spot, for instance)?

  3. Are you allowed to provide incorrect results when there are unexpected environmental conditions, such as trash, pot holes, pooled water or snow in the space?

  4. Do you need to support all categories of vehicles (cars, flat-bed trucks, vans, delivery trucks, motorcycles, mini electric cars, tripod vehicles, ...)?

  5. Are you allowed to take a baseline snapshot of the street with no cars present?

As to comparing two sets of edges, probably the most robust approach is what's known as geometric model finding (describing the edges of interest mathematically as a series of 'edgels', combining them into chains, and comparing the geometry), but this is overkill for your application. I would look more toward thresholding the count of 'edge pixels' present in a parking region, or differencing against a baseline image (be careful of image shift, however: material expansion from outdoor temperature changes can move the camera mechanically and shift the field of view slightly). A sketch of the edge-count idea follows.
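To make the edge-count idea concrete, here is a minimal sketch using AForge (since that's the library already in use); the minEdgePixels threshold is purely illustrative and would need tuning against real footage:

```csharp
using System.Drawing;
using AForge.Imaging;
using AForge.Imaging.Filters;

// 'edgeFrame' is the binarized edge image; 'space' is a pre-defined
// parking region; 'minEdgePixels' is an illustrative threshold to tune.
bool SpaceLooksOccupied(Bitmap edgeFrame, Rectangle space, int minEdgePixels)
{
    Bitmap region = new Crop(space).Apply(edgeFrame);
    ImageStatistics stats = new ImageStatistics(region);

    // A parked car produces far more edge pixels than bare road surface.
    return stats.Gray.Values[255] >= minEdgePixels;
}
```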

Dan Bryant
  • Dan, thanks for your ideas and assistance. A lot of your points to consider were things I had already begun to think about or research, but you've named a number of useful points and strategies, some of which definitely went into my ultimate design and code. I wish I could give both you and Neowizard the accepted-answer check, as you both deserve it for the beneficial information you've provided. – Mike D. Apr 28 '11 at 08:11