
I am trying to stitch images taken by a camera that moves on a known helical trajectory, with a known step size (pitch) and a straight axis. The camera takes pictures of the internal surface of a cylinder within which it moves; the helix and the cylinder share the same axis.

My goal: to use the pictures captured by the camera to reconstruct the internal surface of the cylinder in the form of a cylindrical surface, and then flatten this surface to visualize the cylinder unwrapped (like a sheet of paper that you wrap to be a cylinder and then unfold to go back to a rectangle).

Example of the application: a camera is mounted on a device with a threaded center, and this device travels along a threaded shaft. The system is coaxial with a pipe, and from the images we want to obtain a cylindrical map of the pipe and inspect it for defects.

I have studied standard panorama image stitching algorithms, but I have not been able to define the pipeline and tools necessary for this kind of application.

I believe these could be the steps:

  1. Calibrate the camera (in my case, I have a camera with horizontal FoV = 132°, vertical FoV = 65°, so I am using OpenCV's fisheye camera calibration and have obtained good undistortion results).
  2. Select initial keyframe from the video stream.
  3. Select a new keyframe from the video stream so that its overlap with the previous keyframe is >50%.
  4. Extract features in current keyframe and previous keyframe.
  5. Match features across the adjacent keyframes.
  6. Stitch the keyframes together.
  7. Project the stitched image on a cylindrical map of known radius.
  8. Iterate steps 3 to 7 to update the map (the map needs to "grow" along the cylinder axis).
  9. Stop the map generation, flatten the cylinder into a rectangle.

I don't know if my pipeline is correct and, most importantly, how to implement steps 7 and 8 (make the map "grow"). I would like to use Python and OpenCV.
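
For reference, this is roughly how I picture steps 4 to 6 with OpenCV (a minimal sketch; ORB is just one possible detector, and the keyframe file names are placeholders):

```python
import cv2
import numpy as np

# Placeholder keyframe images (grayscale)
img_prev = cv2.imread("keyframe_prev.png", cv2.IMREAD_GRAYSCALE)
img_curr = cv2.imread("keyframe_curr.png", cv2.IMREAD_GRAYSCALE)

# Step 4: extract features in both keyframes
orb = cv2.ORB_create(nfeatures=2000)
kp_prev, des_prev = orb.detectAndCompute(img_prev, None)
kp_curr, des_curr = orb.detectAndCompute(img_curr, None)

# Step 5: match features across the adjacent keyframes
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_prev, des_curr), key=lambda m: m.distance)

# Step 6: estimate the inter-keyframe transform (RANSAC rejects bad matches)
src = np.float32([kp_prev[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_curr[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
```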

Moreover, I think that if a rotation encoder were available, image stitching could be performed even without feature matching, since the relative transformation between the two frames would be known. Indeed, moving on a helical path means there is only one degree of freedom.
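
For instance, the encoder angle alone would fix where each frame lands on the unwrapped map (a sketch with illustrative names; pitch is the axial advance per full turn):

```python
import numpy as np

def frame_offset_px(theta_rad, radius_mm, pitch_mm, mm_per_px):
    """Shift of a frame on the unwrapped map from the encoder angle alone.

    Convention assumed here: x runs along the unrolled circumference,
    y along the cylinder axis; all names are illustrative.
    """
    x_mm = radius_mm * theta_rad                # arc length travelled
    y_mm = pitch_mm * theta_rad / (2 * np.pi)   # axial advance of the helix
    return x_mm / mm_per_px, y_mm / mm_per_px
```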

Best regards

nndei
  • Indeed, since you know where each image was taken, you should be able to just shift each image to its known coordinates. The only thing you need to take from the standard panorama code is the blending of the images where they overlap. If you image the inside of a cylinder, you probably don’t even need to apply the undistortion logic. – Cris Luengo Jul 11 '23 at 18:07
  • Please add some initial code and sample images that you have started with. Try to focus on one specific problem at a time. – Markus Jul 12 '23 at 15:31
  • You can try the [stitching](https://github.com/OpenStitching/stitching) package, which is a wrapper around the OpenCV stitching module and lets you understand the workflow behind the stitching in depth. You can open a discussion there with your specific use case. – Lukas Weber Jul 12 '23 at 19:10

2 Answers


Interesting project!

For context on the directions used, I assume your cylinder stands upright and you scan from bottom to top and left to right. So you would stitch the images together horizontally and grow your image mostly in width (and a little bit in height). I see no big difference in growing the image vertically once you have already managed to grow it horizontally, but perhaps I'm missing something.

For the cylinder-growing part (steps 7 and 8) I would suggest also growing it as a rectangle: select some features from the top end of the beginning of your stitched image and constantly look for those features at the bottom end of your current stitching position. If you match those features, you have found your period 2*pi*R. Then cut your image there vertically, stitch the first part to the bottom of the second part with feature matching, and go on stitching horizontally. Repeat.
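
A minimal sketch of that wrap-around check, assuming you pass in crops of the beginning and of the current end of the stitched strip (detector choice, names, and thresholds are just starting points):

```python
import cv2
import numpy as np

def wraparound_offset(strip_begin, strip_end, min_matches=20):
    """Match features from a crop of the beginning of the stitched strip
    against a crop of its current end. Once the camera has completed a
    full turn, enough matches appear; their median horizontal displacement
    (plus the crops' offsets within the full strip) approximates the
    period 2*pi*R in pixels, and the vertical one the helix advance."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(strip_begin, None)
    kp2, des2 = orb.detectAndCompute(strip_end, None)
    if des1 is None or des2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    if len(matches) < min_matches:
        return None  # not wrapped around yet
    d = np.float32([[kp2[m.trainIdx].pt[0] - kp1[m.queryIdx].pt[0],
                     kp2[m.trainIdx].pt[1] - kp1[m.queryIdx].pt[1]]
                    for m in matches])
    return np.median(d, axis=0)  # (dx, dy) in pixels
```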

Schiffer7

Point 7 is the only nontrivial step.

It boils down to an index-and-projection operation: you map a location on the cylinder to the camera (or cameras) that took an image of it, and thence to images of that location. The idea is to tessellate the cylinder into planar quad(rilateral)s that approximate its surface. The lower limit of the tessellation is a quad as small as 1 pixel, but you'll normally want to choose something larger for efficiency. Then you iterate over the quads of the tessellation and map (index) each of them to the set of camera images that observed the quad. To limit distortion, you'll normally want to select cameras whose focal axis is closely aligned with the quad's normal vector - in the limit, select the camera whose axis is most parallel to it. You can then project the corners of the quad into each selected camera, obtaining the texture coordinates of the quad.
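
A sketch of that indexing step, under the assumption of a calibrated pinhole model (K, dist) and a known pose (rvec, tvec) and optical-axis direction for every shot; all names here are illustrative:

```python
import cv2
import numpy as np

def quad_corners(radius, theta0, theta1, z0, z1):
    """3D corners of one planar quad approximating the cylinder patch
    [theta0, theta1] x [z0, z1], with the cylinder axis along z."""
    return np.float32([[radius * np.cos(t), radius * np.sin(t), z]
                       for t, z in ((theta0, z0), (theta1, z0),
                                    (theta1, z1), (theta0, z1))])

def best_camera(quad_normal, cam_axes):
    """Pick the shot whose optical axis is best aligned with the quad's
    normal (an inside-out camera looks at the wall, i.e. against the
    surface's inward-pointing normal, hence the sign flip)."""
    return int(np.argmax([-np.dot(quad_normal, a) for a in cam_axes]))

def texture_coords(corners_3d, rvec, tvec, K, dist):
    """Project the quad corners into the chosen camera image; the result
    gives the texture coordinates of the quad in that image."""
    pts, _ = cv2.projectPoints(corners_3d, rvec, tvec, K, dist)
    return pts.reshape(-1, 2)
```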

If you go for just one camera, these coordinates can be passed directly to your renderer as texture coordinates for the quad, and you are done: render all the quads (say, in OpenGL), each textured with its own image. If you want to do some blending, it's best done in two steps: first rectify the quad's projections in all the selected cameras into a common square (the quad itself, represented as a square of pixels at a chosen resolution), then apply your favorite blending algorithm (e.g. multiband) to them, obtaining the final texture for the quad.
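
And a minimal sketch of the two-step blending, with plain averaging standing in for a real blender such as multiband, and an arbitrary 64-pixel quad resolution:

```python
import cv2
import numpy as np

def rectify_quad(image, tex_coords, size=64):
    """Warp one camera's view of the quad into a common size x size square
    (tex_coords must list the 4 corners in the same order as the square)."""
    square = np.float32([[0, 0], [size - 1, 0],
                         [size - 1, size - 1], [0, size - 1]])
    H = cv2.getPerspectiveTransform(np.float32(tex_coords), square)
    return cv2.warpPerspective(image, H, (size, size))

def blend(patches):
    """Plain averaging as a stand-in for a real blending algorithm."""
    return np.mean(np.stack(patches).astype(np.float32), axis=0).astype(np.uint8)
```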

Francesco Callari