Solutions for image processing needs like this vary wildly depending on whether you need a script to use just a few times, a software tool you'll use for a few weeks, or what could become lab automation software.
This looks more like an image matching problem than an image stitching problem. By image matching I mean finding how a subimage, such as the bone section at (row 2, column 1), maps onto the region labeled "4" (the center-left section) in the reference bone image.
The basic process:
- Load your reference image as a 2D array (first converted to grayscale)
- Load your first sample image of a subsection of bone.
- Use an algorithm such as SIFT to determine the location, orientation, and scale to fit the bone subsection image onto the reference image.
- Apply the fit parameters (x, y, rotation, scale) to the bone subsection image, transform it, and paste it into a black image the same size as the reference image.
- Continue the process above to fit all subsections.
- (Optional) With all bone subsections fitted in place, perform additional image processing operations to improve the fit, fill in gaps, etc.
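The transform-and-paste step above can be sketched roughly as follows, assuming the fit parameters (x, y, rotation, scale) for a subsection are already known. This is only a minimal NumPy illustration, not a finished tool: the function name `paste_with_fit`, the nearest-neighbour sampling, the choice to rotate about the piece's top-left pixel, and the brightest-pixel merge are all my own assumptions.

```python
import numpy as np


def paste_with_fit(canvas, piece, x, y, rotation_deg, scale):
    """Return a copy of `canvas` with `piece` rotated, scaled, shifted, and pasted in.

    Hypothetical helper. The piece is rotated and scaled about its own
    top-left pixel, then shifted so that pixel lands at column x, row y.
    Positive angles look clockwise on screen because rows increase downward.
    """
    theta = np.deg2rad(rotation_deg)
    c, s = np.cos(theta), np.sin(theta)

    # Forward map (piece -> canvas): [col', row'] = scale * R @ [col, row] + [x, y]
    forward = scale * np.array([[c, -s], [s, c]])
    inverse = np.linalg.inv(forward)

    # Inverse mapping: for every canvas pixel, find the piece pixel it came from.
    rows, cols = np.indices(canvas.shape)
    dx = cols - x
    dy = rows - y
    src_col = np.rint(inverse[0, 0] * dx + inverse[0, 1] * dy).astype(int)
    src_row = np.rint(inverse[1, 0] * dx + inverse[1, 1] * dy).astype(int)

    # Keep only canvas pixels whose source falls inside the piece.
    inside = (
        (src_row >= 0) & (src_row < piece.shape[0])
        & (src_col >= 0) & (src_col < piece.shape[1])
    )

    out = canvas.copy()
    # Assumption: backgrounds are black, so keeping the brighter pixel stops a
    # piece's background from erasing bone that was pasted earlier.
    out[inside] = np.maximum(out[inside], piece[src_row[inside], src_col[inside]])
    return out
```

Start with a black canvas the size of the reference image (for example `np.zeros_like(reference)`) and call the function once per subsection, feeding each result back in as the next canvas. For real work you would probably want interpolation better than nearest-neighbour.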
From your sample images it appears that the reference and the bone section images are taken under different lighting, sometimes with the flat portion of the bone slightly tilted relative to the camera's optical axis, and so on. All of this makes the image match more difficult.
SIFT is an algorithm that could help here. Note that "scale invariant" is part of the algorithm name.
https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
Given all that, your reference image and bone subsection images appear to be taken under very different circumstances, and that makes solving the problem harder than it needs to be. You'll have an easier time overall if you can control the conditions under which images are captured.
- Capture all images with the same camera, with the same lighting, at roughly the same distance
- For lighting, use something like a high-frequency diffuse fluorescent light
- Use the same background for every image (e.g. matte black)
Making this image match a robust process means paying attention to the physical setup as well as creating your image processing algorithm.
If you need a good reference for traditional image processing techniques, find a copy of Digital Image Processing by Gonzalez and Woods. Some time spent with that book will give you better answers faster than learning image processing piecemeal online.
For practical image processing that addresses real-world concerns for implementing even simple image processing algorithms, look for Machine Vision by Davies.
I would strongly urge you NOT to look into machine learning, or to go looking for an answer in a more advanced image processing textbook, until you run into a roadblock with more traditional methods.