I am working on a project in which I am trying to extract key features of a bicycle from an image of the whole bicycle. I am currently investigating Haar cascades as a way to find certain regions of interest on the bicycle, e.g. the pedal sprocket, seat, and handlebars. I will then extract local features from these sub-regions. The purpose is to build an overall descriptor of a particular bicycle so that I can try to match it against a sample set of images of other bicycles.
My questions are as follows: Can I train a Haar classifier to look for a sub-component of an overall object? For example, say I want to find the handlebars on a bicycle. How should I design the training? Should I detect the bicycle first and then detect the handlebars within the bicycle region (similar to detecting the eyes within a detected face)? Since I know beforehand that every image will contain a bicycle, I'm not sure there is any point in detecting the bicycle first and then looking for sub-components.
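To make the two options concrete, here is a minimal sketch of the two-stage idea as I currently picture it. It is only an illustration under assumptions: the cascade files `bicycle.xml` and `handlebars.xml` are hypothetical (I have not trained them yet), and the helper names are my own. Each detector is a plain callable that returns `(x, y, w, h)` boxes, so the bicycle stage can be swapped for "use the whole image" when every image is known to contain a bicycle.

```python
def detect_parts(gray, bike_detector, part_detector):
    """Run part_detector inside each bicycle box; return part boxes
    in full-image coordinates as (x, y, w, h) tuples."""
    results = []
    for (bx, by, bw, bh) in bike_detector(gray):
        roi = gray[by:by + bh, bx:bx + bw]  # crop to the bicycle region
        for (px, py, pw, ph) in part_detector(roi):
            # Part boxes are relative to the crop, so shift them back.
            results.append((bx + px, by + py, pw, ph))
    return results


def whole_image(gray):
    """Stand-in for the bicycle stage: treat the full image as the bicycle."""
    height, width = gray.shape[:2]
    return [(0, 0, width, height)]


def cascade_detector(xml_path, scale_factor=1.1, min_neighbors=5):
    """Wrap a trained Haar cascade as a callable (needs OpenCV installed)."""
    import cv2  # imported here so the helpers above work without OpenCV
    cascade = cv2.CascadeClassifier(xml_path)
    return lambda img: cascade.detectMultiScale(
        img, scaleFactor=scale_factor, minNeighbors=min_neighbors)


# Hypothetical usage, assuming both cascades had already been trained:
#   gray = cv2.cvtColor(cv2.imread("bike.jpg"), cv2.COLOR_BGR2GRAY)
#   two_stage = detect_parts(gray, cascade_detector("bicycle.xml"),
#                            cascade_detector("handlebars.xml"))
#   one_stage = detect_parts(gray, whole_image,
#                            cascade_detector("handlebars.xml"))
```

The only difference between the two variants is the first argument after the image, which is really what my question boils down to: whether the extra bicycle stage is worth it.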
In terms of training a Haar cascade and producing an XML file that I can use (in OpenCV 3.1 and Python 3.6), could I just set up the positive and negative sets with images that do and do not contain bicycles, respectively, with the difference that I crop each positive image to the particular area of interest (e.g. where the handlebars are)?
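As an alternative to physically cropping every positive, my understanding is that `opencv_createsamples` can read an annotation file (passed with `-info`) in which each line gives an image path, the number of objects, and one `x y w h` box per object. Below is a small sketch of how I would generate that file for the handlebar boxes. The paths and coordinates are made-up placeholders, and the helper names are my own, not part of OpenCV.

```python
def info_line(image_path, boxes):
    """Format one annotation line: '<path> <count> x y w h [x y w h ...]'."""
    fields = [image_path, str(len(boxes))]
    for (x, y, w, h) in boxes:
        fields += [str(x), str(y), str(w), str(h)]
    return " ".join(fields)


def write_info_file(out_path, annotations):
    """Write one line per image; annotations maps path -> list of boxes."""
    with open(out_path, "w") as f:
        for image_path, boxes in annotations.items():
            f.write(info_line(image_path, boxes) + "\n")


# Placeholder data: handlebar boxes I would mark by hand on each image.
handlebar_boxes = {
    "img/bike1.jpg": [(140, 100, 45, 45)],
    "img/bike2.jpg": [(30, 25, 60, 40)],
}
# write_info_file("handlebars.info", handlebar_boxes)
```

The negatives would then be a plain text file with one image path per line, which is what I believe `opencv_traincascade` expects for its `-bg` argument.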
I am also open to any recommendations on how others would solve the general problem of extracting key features for object matching. This is just one approach I am currently investigating. Thanks!