
I'm curious to know if there are any image processing/computer vision frameworks out there that allow you to create a filter pipeline by dynamically creating chains of filters/filter blocks (similar to Simulink blocks in MATLAB).

The idea is largely inspired by RoboRealm, but I'd like to implement this mostly in C/C++ with the ability to graphically build image processing pipelines. I'm familiar with one such framework, CamUnits, which I plan to use as a foundation for this graphical filter framework, but please do let me know if you are aware of any others. CamUnits integrates well with LCM (Lightweight Communications and Marshalling), which handles most of the marshalling and networking needs that I'd like to avoid dealing with for now. Furthermore, CamUnits also integrates well with the logging framework within LCM, and has a bunch of tools for image acquisition (FireWire cameras, automatic gain/exposure correction, fast de-Bayering, etc.).

In short, I'd like to build a graphical interface that lets you dynamically create image processing pipelines (threaded if needed), which would in turn help in rapid prototyping of image processing/computer vision algorithms. I'm also curious to know whether there'd be any interest in this type of framework (modular, and quickly/highly reconfigurable).
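
To make the idea concrete, here is a rough sketch of the kind of runtime-composable filter-block interface I have in mind. This is not taken from CamUnits, RoboRealm, or any other existing framework; the Image, Filter, ThresholdFilter, InvertFilter and Pipeline names are purely illustrative.

    // Minimal sketch of a dynamically reconfigurable filter chain.
    // All names here are placeholders, not from any existing framework.
    #include <cstdint>
    #include <memory>
    #include <vector>

    struct Image {
        int width = 0, height = 0;
        std::vector<uint8_t> pixels;   // single-channel for simplicity
    };

    // Every processing block implements the same interface, so blocks can be
    // added, removed or reordered at runtime (e.g. from a GUI).
    class Filter {
    public:
        virtual ~Filter() = default;
        virtual Image process(const Image& in) = 0;
    };

    class ThresholdFilter : public Filter {
    public:
        explicit ThresholdFilter(uint8_t level) : level_(level) {}
        Image process(const Image& in) override {
            Image out = in;
            for (auto& p : out.pixels) p = (p >= level_) ? 255 : 0;
            return out;
        }
    private:
        uint8_t level_;
    };

    class InvertFilter : public Filter {
    public:
        Image process(const Image& in) override {
            Image out = in;
            for (auto& p : out.pixels) p = 255 - p;
            return out;
        }
    };

    // The pipeline itself is just an ordered list of filter blocks.
    class Pipeline {
    public:
        void add(std::unique_ptr<Filter> f) { stages_.push_back(std::move(f)); }
        Image run(Image img) const {
            for (const auto& f : stages_) img = f->process(img);
            return img;
        }
    private:
        std::vector<std::unique_ptr<Filter>> stages_;
    };

    int main() {
        Image img{4, 1, {10, 100, 150, 240}};

        Pipeline p;
        p.add(std::make_unique<ThresholdFilter>(128));
        p.add(std::make_unique<InvertFilter>());

        Image result = p.run(img);   // {255, 255, 0, 0} after both stages
        return result.pixels.size() == 4 ? 0 : 1;
    }

The graphical front end would then just be a way of building and reordering that list of stages interactively (and possibly running each stage on its own thread).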

  • Microsoft's DirectShow, and in particular the FilterGraph and its associated plugins, did exactly this for video and audio decoding, muxing and demuxing, and the same principle could be used for image processing pipelines. From past experience, though, this is not a sensible thing to do in C++: you get very little benefit and have to suffer far too many typing constraints. Pick a toolset/language/library with appropriate support for building untyped filter architectures and enough syntactic sugar to let you get on with addressing the real problems. – Andrew Walker Jul 26 '12 at 12:57
  • As others have said here, there are many frameworks that do this (in fact, most of the image processing ones I've seen can be rigged up in this fashion). Apple's Core Image framework on Mac and iOS is built around this structure, and Apple's Quartz Composer tool even lets you do the graphical drag-and-drop connection of filters, inputs, and outputs. I wrote my own open source iOS framework along these lines, with modular filters or processing operations that you chain together and can swap out as needed. I even know someone who has built a GUI for rapid prototyping of filter chains from this. – Brad Larson Jul 26 '12 at 19:35

3 Answers

3

This is (almost) the oldest idea in the zoo of image processing applications: the "kitchen sink" GUI app where filters are boxes, images are input to the left, data flow through boxes, images come out to the right.

The oldest one I remember using firsthand was Khoros (and that may tell you how old I am), but I am almost positive that the people at Xerox had something similar far earlier than that. More recently, a host of image compositing apps have used a similar UI approach, most notably Shake.

In my experience, they are quite useful for algorithm exploration, but I have never seen one where the GUI didn't get in the way of getting things done once the problems started getting complicated. "Visual computing" is appealing for getting the rough outline of a solution, but there is a reason why harder problems are best reasoned about and communicated using equations - it's a more concise notation that dispenses with hundreds of useless bubbles and lines drawn on a screen.

In production practice, the usefulness of these apps ends up being tied to their output scripting capabilities: mouse-dragging quickly gets tiresome once you do find a solution to your problem and want to apply it to a truckload of images. At that point the app had better have a way to output code implementing the image transformation in a form that is easy to interface with the rest of your codebase.
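
As a hedged illustration of what that exported code might look like (the chain, parameters and directory names below are invented, with plain OpenCV standing in for whatever a given tool actually emits), the dump is typically nothing more than the filter chain written out as straight library calls that can be looped over a whole directory:

    // Illustrative only: an interactively-found two-stage chain (blur -> Canny)
    // exported as plain OpenCV calls and batch-applied to every file in ./input.
    // Assumes the ./input and ./output directories exist.
    #include <filesystem>
    #include <opencv2/imgcodecs.hpp>
    #include <opencv2/imgproc.hpp>

    namespace fs = std::filesystem;

    int main() {
        for (const auto& entry : fs::directory_iterator("input")) {
            cv::Mat img = cv::imread(entry.path().string(), cv::IMREAD_GRAYSCALE);
            if (img.empty()) continue;   // skip non-image files

            cv::Mat blurred, edges;
            cv::GaussianBlur(img, blurred, cv::Size(5, 5), 1.5);   // stage 1
            cv::Canny(blurred, edges, 50, 150);                    // stage 2

            cv::imwrite((fs::path("output") / entry.path().filename()).string(), edges);
        }
        return 0;
    }

That way the result is easy to interface with the rest of your codebase instead of being locked inside the GUI.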

Francesco Callari
-1

Cassandra is a visual programming environment for modelling algorithms that lets you develop image processing and signal processing algorithms quickly. Its extensive integration of libraries such as OpenCV and IPP makes Cassandra a highly effective development platform and lets you reach a solution faster than with conventional programming languages alone.

You can use Cassandra for a range of applications, including signal processing and image and video processing, e.g. camera-based driver assistance systems. Numerous scientific institutions use Cassandra as a visual programming environment for image processing with C++.

-3

We have a product that's almost ready for release ("PrecisionImage.NET" at www.CoreOptical.com) that falls along these lines. It's not C++ and it doesn't have a graphical UI for dragging/dropping a filter chain into place, but it is flexible, powerful and easy to use. It is a "pure .NET" assembly and interfaces with the WIC imaging subsystem in WPF.

At the moment it's pervasively threaded to utilize all the CPU power in the host computer automatically, and in the next month or two we'll be adding a GPU-processing subsystem for CUDA-enabled devices. This will still be a "pure .NET" solution with no unmanaged components even with the GPU (the GPU code is JIT'ed into PTX code that interacts with the GPU driver directly), so you can use any .NET language that is CLS-compliant, including C#/VB/F#. At the moment, however, we only have examples in C#.

Essentially it's a class library that allows for the assembly of processing chains without accruing discretization errors. We have several examples online that show how this is done.