A friend and I have a little pet project: a program for real-time visualisations, mainly using audio/video material, controlled by MIDI. The MIDI part is not a problem, since there seem to be decent solutions for almost any language, but I'm terribly unsure which way to look for handling the video in a smart way. I'm looking for both fast seeking and additional visual effects (multiple superimposed pictures, for example).
I have already experimented with a couple of options that were extremely easy to play with and seemed to offer at least something for the task, but with each of them I felt I might run into dead ends or low performance later on when adding features. So far I have tried Pure Data, Max and Processing.
What I'm mostly asking for is a pointer towards an optimal, or at least decent, path for dealing with the videos. My biggest problem is that I spend all my time just trying to work out which programming language or library to use. With that much guidance I could finally start really working on it and make progress.
I suppose I'm most comfortable with Python, but any suggestions are welcome. I have read a little about GStreamer and I think there might be something there, but it is a relatively low-level library that would take at least some time to produce any results with, as opposed to Processing or Pure Data/Max, for instance.
In addition to the language/library, I'm curious about how much the video format matters. I'm a little out of my depth when it comes to codecs, I-, P- and B-frames and whatnot. Who knows, there could even be a solution where we pick an optimal video format, cram it onto a RAM disk or something, and get satisfactory seek speed with only that.
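To put rough numbers on the "keep it all in RAM" idea, here is a back-of-the-envelope sketch in Python. It assumes the clip is held as uncompressed 24-bit RGB frames. The function names and the 1280x720, 25 fps, 60-second clip are hypothetical examples, not something we have settled on.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Bytes needed to hold a clip as uncompressed frames.

    Assumption: every frame is stored in full (no codec), 3 bytes per
    pixel for 24-bit RGB unless stated otherwise.
    """
    frame_bytes = width * height * bytes_per_pixel
    frame_count = int(round(fps * seconds))
    return frame_bytes * frame_count


def frame_index(seek_seconds, fps):
    """Index of the frame shown at a given time.

    With every frame already decoded in memory, a seek is only this
    lookup; no I-, P- or B-frame dependencies have to be resolved.
    """
    return int(seek_seconds * fps)


if __name__ == "__main__":
    # Hypothetical clip: 1280x720, 25 fps, one minute long.
    size = raw_video_bytes(1280, 720, 25, 60)
    print(f"raw size: {size / 2**30:.2f} GiB")
    print(f"frame at 2.5 s: {frame_index(2.5, 25)}")
```

If that estimate holds, a single minute of uncompressed 720p already takes several gigabytes, so I suspect the choice of format and the memory budget are tied together.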