A popular image/video analysis library I would recommend is OpenCV. I have used this GitHub fork of the ruby-opencv gem with success. If you scroll down in the README, you'll see a face detection example. The unit tests demonstrate other things, such as drawing shapes. At a glance, I don't see any tests for extracting pixel data, but it is definitely possible.
If you need something simpler, you can try devil. It's more user-friendly and focused on image manipulation, but you can probably extract pixel data with it too.
It sounds like you'll be leaning towards OpenCV. It might be useful to look at this previous question, specifically the mention of the Hough transform.
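
To give a rough idea of what the Hough transform does, here is a minimal pure-Ruby sketch. It is not the ruby-opencv API (OpenCV has its own, much faster implementation). The method name `hough_strongest_line` and its `theta_steps` parameter are made up for illustration, and it assumes you already have a list of edge pixel coordinates. Each point votes for every line `rho = x*cos(theta) + y*sin(theta)` that could pass through it, and the bin with the most votes is taken as the strongest line.

```ruby
# Illustrative Hough line transform (hypothetical helper, not part of any gem).
# points: array of [x, y] edge pixel coordinates.
# Returns the (theta, rho) bin that received the most votes.
def hough_strongest_line(points, theta_steps: 180)
  votes = Hash.new(0)

  points.each do |x, y|
    # Try every angle in [0, 180) degrees, in equal steps.
    theta_steps.times do |t|
      theta = Math::PI * t / theta_steps
      # Distance from the origin to the line through (x, y) with normal angle theta,
      # rounded to the nearest integer so nearby lines share a bin.
      rho = (x * Math.cos(theta) + y * Math.sin(theta)).round
      votes[[t, rho]] += 1
    end
  end

  # Pick the bin with the highest vote count.
  (t, rho), count = votes.max_by { |_, v| v }
  { theta_deg: t * 180.0 / theta_steps, rho: rho, votes: count }
end

# Fifty points on the vertical line x = 3.
points = (0...50).map { |y| [3, y] }
result = hough_strongest_line(points)
puts result.inspect
```

For the vertical line in the example, the winning bin has `theta_deg` 0 and `rho` 3, which is the line x = 3. On a real image you would first run edge detection and feed the resulting edge pixels in as `points`.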