I know that, in general, it's no problem to overlay HTML on native HTML5 video (and even to do advanced compositing operations). I've seen cool tricks like keying out green screens in real time, in the browser.
What I haven't seen yet, though, is something that tracks in-video content, perhaps at the pixel level, and modifies the composited overlay accordingly. Motion tracking, basically. A good example would be an augmented-reality sort of app (though for simplicity's sake, let's say one that augments on-demand video with an overlay rather than live video).
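To make the idea concrete, here is a rough sketch of the simplest version of what I have in mind: find the centroid of all pixels in a frame that are close to a target color, then move the overlay there. The function name `trackColor`, the `Point` type, and the default tolerance are made up for illustration. It assumes the frame arrives as RGBA bytes in row-major order, which is the layout `ImageData.data` uses. It is color-blob tracking only, not a real motion tracker.

```typescript
// Hypothetical helper: centroid of all pixels within `tolerance` of a target color.
// `data` holds RGBA bytes in row-major order (the layout of ImageData.data).
interface Point {
  x: number;
  y: number;
}

function trackColor(
  data: Uint8ClampedArray,
  width: number,
  height: number,
  target: [number, number, number],
  tolerance: number = 40, // assumed per-channel threshold, tune per video
): Point | null {
  let sumX = 0;
  let sumY = 0;
  let count = 0;

  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4; // 4 bytes per pixel: R, G, B, A
      const matches =
        Math.abs(data[i] - target[0]) <= tolerance &&
        Math.abs(data[i + 1] - target[1]) <= tolerance &&
        Math.abs(data[i + 2] - target[2]) <= tolerance;
      if (matches) {
        sumX += x;
        sumY += y;
        count++;
      }
    }
  }

  // No matching pixels means there is nothing to track in this frame.
  return count === 0 ? null : { x: sumX / count, y: sumY / count };
}
```

In the browser, I imagine each frame would come from drawing the video onto a canvas with `drawImage` and reading it back with `getImageData`, inside a `requestAnimationFrame` loop, with the result used to position an absolutely positioned element over the video. As far as I know, `getImageData` only works when the video is same-origin or served with CORS headers.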
Has anyone seen any projects like this or, even better, any frameworks for overlaying content on HTML5 video (beyond transport controls)?