Intersection between 2d image point and 3d mesh

Question

Given: Mesh, Source Camera - I have intrinsic and extrinsic parameters, Image coordinate 2d

Output: 3D point, which is the intersection of a ray from camera center, through the 2d point on the image plane and the mesh. (I'm trying to find the 3d point on the mesh)

This is the process:

From Multiple View Geometry in Computer Vision book:

I have constructed the equation (6.14).

I'm not sure how to continue and get the 3d point that lies on the mesh (I also need the point that is closest to the camera).

I thought that it can be done in the following way:

Iterate over all the vertices and compute the distance between each vertex and the line. The vertices whose distance is zero (or close to zero) lie on the line. To find the closest one, I guess I would then compare the distances between the camera center and those vertices; the smallest distance would mean that point is the closest?

Quick update: This repo does seem to work with the rays: github.com/szabolcsdombi/python-mesh-raycast

I guess the bug now lies in getting the point D right..

Isn't this "just" the problem of [intersecting an image ray with a 3D mesh](https://www.scratchapixel.com/lessons/3d-basic-rendering/ray-tracing-polygon-mesh/Ray-Tracing%20a%20Polygon%20Mesh-part-1)? I think googling for raytracing provides some nice solutions if that is what you are looking for. — Grillteller, Apr 07 '21 at 08:04
@Grillteller Thanks for your answer, I even found a repo in python that does exactly that, but when running it, it doesn't work unfortunately. This is the repo: https://github.com/szabolcsdombi/python-mesh-raycast After getting the triangles from my mesh, the camera center vector and the second point D (as seen in the image in the original question), it doesn't output anything from the ray it cast between the points. — Ilan Aizelman WS, Apr 08 '21 at 13:17
@IlanAizelmanWS Have you solved this problem? I am facing a similar problem. I tried with the trimesh repository but it gives wrong 2d-3d correspondences. — SDD123, Jul 13 '22 at 15:25

score 4 · Answer 1 · answered Apr 09 '21 at 06:34

As Grillteller pointed out in the comment, this is a ray intersection problem with the 3d mesh. As far as I know, there is no known quick way to determine the intersection for an arbitrary mesh. In your problem context you should use ray tracing, as Grillteller also pointed out; it has serious performance issues, although it gives a lot of shading possibilities. To find the intersection of a ray and a mesh, ray tracing algorithms typically use acceleration structures. Often such structures partition space with trees:

KD-tree for Ray Tracing https://graphics.stanford.edu/papers/gpu_kdtree/kdtree.pdf
BSP-tree for Ray Tracing https://www.sci.utah.edu/publications/ize08/BSP_RT08.pdf
Octree for Ray Tracing https://www.researchgate.net/publication/3410767_Octree-R_An_Adaptive_Octree_for_Efficient_Ray_Tracing

This presentation explains some of these and other approaches very well.
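For reference, the unaccelerated baseline that these structures speed up is to test the ray against every triangle and keep the nearest hit. Below is a minimal NumPy sketch of that baseline. It assumes the mesh is given as a `vertices` array and a `faces` index array, and it uses the Möller-Trumbore ray-triangle test, which is my choice for illustration and not something the papers above prescribe. `closest_hit` is a hypothetical helper name.

```python
import numpy as np


def closest_hit(origin, direction, vertices, faces, eps=1e-9):
    """Brute-force ray/mesh query: test every triangle, keep the nearest hit.

    Returns (point, face_index), or None if the ray misses the mesh.
    Cost is O(number of triangles) per ray; no acceleration structure is used.
    """
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    tri = np.asarray(vertices, dtype=float)[np.asarray(faces)]  # (n, 3, 3)

    v0 = tri[:, 0]
    e1 = tri[:, 1] - v0
    e2 = tri[:, 2] - v0

    # Moller-Trumbore: solve origin + t*direction = v0 + u*e1 + v*e2
    p = np.cross(direction, e2)
    det = np.einsum('ij,ij->i', e1, p)
    ok = np.abs(det) > eps              # skip triangles parallel to the ray
    inv = np.zeros_like(det)
    inv[ok] = 1.0 / det[ok]

    s = origin - v0
    u = np.einsum('ij,ij->i', s, p) * inv
    q = np.cross(s, e1)
    v = (q @ direction) * inv
    t = np.einsum('ij,ij->i', e2, q) * inv

    # inside the triangle and in front of the ray origin
    hit = ok & (u >= 0) & (v >= 0) & (u + v <= 1) & (t > eps)
    if not hit.any():
        return None

    i = np.flatnonzero(hit)[np.argmin(t[hit])]
    return origin + t[i] * direction, int(i)


if __name__ == '__main__':
    # two parallel triangles at z = 1 and z = 3 (made-up test data)
    verts = np.array([[0, 0, 1], [1, 0, 1], [0, 1, 1],
                      [0, 0, 3], [1, 0, 3], [0, 1, 3]], dtype=float)
    faces = np.array([[3, 4, 5], [0, 1, 2]])
    print(closest_hit([0.25, 0.25, 0.0], [0.0, 0.0, 1.0], verts, faces))
```

This is fine for a handful of rays, which is the case in the question (one pixel). For one ray per pixel on a large mesh, the tree structures listed above are what make the query practical.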

P.S.: If you only need a simple visualization, then it would be better to reverse the problem: for each mesh element, perform rasterisation.

Grillteller · Answer 2 · 2021-04-12T08:25:39.243

I found another implementation called trimesh using python.

You need to read the installation guide, and then you are able to load your meshes via:

import numpy as np 
import trimesh

# attach to logger so trimesh messages will be printed to console 
trimesh.util.attach_to_log()

mesh = trimesh.load('models/CesiumMilkTruck.glb', force='mesh')

I found the relevant lines to import a camera in scene as trimesh.scene.Camera. Then you can use the function cameras_to_rays(camera) (line 417) to "return one ray per pixel, as set in camera.resolution".

So now you have the rays for every pixel and the mesh, and can create a RayMeshIntersector as shown in ray_triangle.py. Then, you can use intersects_location (line 75) to calculate the Cartesian coordinates of the points where each ray hits the mesh.

I found an example for your purpose here:

"""
raytrace.py
----------------
A very simple example of using scene cameras to generate
rays for image reasons.
Install `pyembree` for a speedup (600k+ rays per second)
"""
from __future__ import division

import PIL.Image

import trimesh
import numpy as np

if __name__ == '__main__':

    # test on a simple mesh
    mesh = trimesh.load('../models/featuretype.STL')

    # scene will have automatically generated camera and lights
    scene = mesh.scene()

    # any of the automatically generated values can be overridden
    # set resolution, in pixels
    scene.camera.resolution = [640, 480]
    # set field of view, in degrees
    # make it relative to resolution so pixels per degree is same
    scene.camera.fov = 60 * (scene.camera.resolution /
                             scene.camera.resolution.max())

    # convert the camera to rays with one ray per pixel
    origins, vectors, pixels = scene.camera_rays()

    # do the actual ray-mesh queries
    points, index_ray, index_tri = mesh.ray.intersects_location(
        origins, vectors, multiple_hits=False)

    # for each hit, find the distance along its vector
    depth = trimesh.util.diagonal_dot(points - origins[0],
                                      vectors[index_ray])
    # find pixel locations of actual hits
    pixel_ray = pixels[index_ray]

    # create a numpy array we can turn into an image
    # doing it with uint8 creates an `L` mode greyscale image
    a = np.zeros(scene.camera.resolution, dtype=np.uint8)

    # scale depth against range (0.0 - 1.0)
    # np.ptp is used because the ndarray.ptp method was removed in NumPy 2.0
    depth_float = ((depth - depth.min()) / np.ptp(depth))

    # convert depth into 0 - 255 uint8
    depth_int = (depth_float * 255).round().astype(np.uint8)
    # assign depth to correct pixel locations
    a[pixel_ray[:, 0], pixel_ray[:, 1]] = depth_int
    # create a PIL image from the depth queries
    img = PIL.Image.fromarray(a)

    # show the resulting image
    img.show()

    # create a raster render of the same scene using OpenGL
    # rendered = PIL.Image.open(trimesh.util.wrap_as_stream(scene.save_image()))

Fedor · Answer 3 · 2022-10-28T17:58:25.350

The problem in the question is to find the closest point on a 3D mesh visible at a specific 2D point of the screen, which is a part of the ray tracing technique. The ray in question is uniquely defined by the camera location (the ray's origin) and the pixel location that the ray passes through. So knowing both of them allows one to specify the ray and find its intersection (if any) with the triangular surface.
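As a rough sketch of that first step, the ray can be built from the intrinsic and extrinsic parameters mentioned in the question. The code assumes an undistorted pinhole camera with the convention `x_cam = R @ X_world + t`, so the camera center is `C = -R.T @ t` and the ray direction is `R.T @ inv(K) @ [u, v, 1]`. If your extrinsics are stored the other way round (camera-to-world), they have to be inverted first. `pixel_to_ray` and the numbers in the usage part are made up for illustration.

```python
import numpy as np


def pixel_to_ray(K, R, t, u, v):
    """Back-project pixel (u, v) to a ray in world coordinates.

    Assumes a pinhole camera without lens distortion and the
    world-to-camera convention x_cam = R @ X_world + t.
    Returns (origin, unit_direction).
    """
    K = np.asarray(K, dtype=float)
    R = np.asarray(R, dtype=float)
    t = np.asarray(t, dtype=float).reshape(3)

    # camera center in world coordinates
    origin = -R.T @ t

    # direction in the camera frame, then rotated into the world frame
    d_cam = np.linalg.solve(K, np.array([u, v, 1.0]))
    d_world = R.T @ d_cam
    return origin, d_world / np.linalg.norm(d_world)


if __name__ == '__main__':
    # hypothetical camera: 800 px focal length, principal point (320, 240)
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R = np.eye(3)
    t = np.array([0.0, 0.0, 5.0])
    origin, direction = pixel_to_ray(K, R, t, 400.0, 260.0)
    print(origin, direction)
```

The resulting `origin` and `direction` are what ray-mesh intersection routines expect as input. A frequent cause of "no hits" is passing a second point on the ray where a direction vector is expected, or mixing up the world-to-camera and camera-to-world conventions.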

Finding these intersections is a rather computationally expensive task, especially for high resolution screens (millions of pixels) and detailed meshes (millions of triangles), so a number of highly optimized software libraries were developed for it, for example:

Nvidia OptiX uses GPU for fast finding of ray-surface intersections. One can find a wrapper library for python.
Intel Embree does the same on x86 processors. Python wrappers: python-embree and pyembree. The latter is a dependency of trimesh for fast queries.
There are also libraries with a Python interface that do not come from hardware vendors and can quickly find ray-mesh collisions, e.g. MeshLib.
