
I am using WHENet to estimate the head pose of an individual filmed by two cameras. The model gives me roll, pitch and yaw relative to each camera; all three angles are 0 when the individual faces the camera directly. My cameras are calibrated (both intrinsically and extrinsically) and I know their positions in world coordinates. I want to convert the Euler angles (roll, pitch, yaw) given by the model from the local coordinates of each camera to world coordinates using OpenCV in Python. This would allow me, for example, to get the 3D position of a point 20 cm behind the person being filmed.
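
To make the goal concrete, here is a minimal sketch of the quantity I am after (the names `R_world_cam`, `R_cam_head` and `head_pos_world` are mine, and the placeholder values stand in for my calibration, the WHENet output and the triangulated head position):

    import numpy as np

    # Placeholder values -- in my code these come from calibration / WHENet / triangulation
    R_world_cam = np.eye(3)                     # camera-to-world rotation (extrinsics)
    R_cam_head = np.eye(3)                      # head rotation in camera frame (from roll/pitch/yaw)
    head_pos_world = np.array([0.0, 0.0, 1.5])  # head position in world coordinates (metres)

    # Head orientation in the world frame
    R_world_head = R_world_cam @ R_cam_head

    # Example: a point 20 cm behind the head, assuming the head frame's +z axis
    # points out of the face (antero-posterior axis)
    p_behind_world = head_pos_world - R_world_head @ np.array([0.0, 0.0, 0.20])
    print(p_behind_world)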

I tried the following function, `eulerAnglesToExtRef`, to convert the angles, without success. When I display the resulting angles with `draw_euler_axes`, I get results far from the ones produced by the model (in particular, I would expect the z-axis to be drawn along the antero-posterior axis). How can I convert the local rotations to global rotations and use them to display the axes properly?

    import cv2
    import numpy as np

    def eulerAnglesToExtRef(eulerAngles, T, initMat=np.array([[1, 0, 0], [0, -1, 0], [0, 0, -1]])):
        # T is the 4x4 camera-to-world transform; initMat is meant to map the
        # model's axis convention to the camera's (it flips the y and z axes)
        R_camera_external = T[:3, :3]
        R_object_local = EpipolarUtils.euler_to_rotation_matrix(eulerAngles)
        R_object_camera = initMat @ R_object_local
        R_object_external = R_camera_external @ R_object_camera
        external_euler_angles = EpipolarUtils.rotation_matrix_to_euler(R_object_external)
        return external_euler_angles
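
For reference, the `EpipolarUtils` helpers used here are standard Euler/rotation-matrix conversions; a minimal sketch is shown below (I assume the intrinsic ZYX, i.e. yaw-pitch-roll, convention, which may not match WHENet's and could itself be part of the problem):

    import numpy as np

    class EpipolarUtils:
        @staticmethod
        def euler_to_rotation_matrix(euler_angles):
            # euler_angles = [roll, pitch, yaw] in radians; intrinsic ZYX (yaw-pitch-roll)
            roll, pitch, yaw = euler_angles
            cr, sr = np.cos(roll), np.sin(roll)
            cp, sp = np.cos(pitch), np.sin(pitch)
            cy, sy = np.cos(yaw), np.sin(yaw)
            Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
            Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
            Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
            return Rz @ Ry @ Rx

        @staticmethod
        def rotation_matrix_to_euler(R):
            # Inverse of the above; returns [roll, pitch, yaw] in radians
            sy = np.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
            if sy > 1e-6:
                roll = np.arctan2(R[2, 1], R[2, 2])
                pitch = np.arctan2(-R[2, 0], sy)
                yaw = np.arctan2(R[1, 0], R[0, 0])
            else:  # gimbal lock
                roll = np.arctan2(-R[1, 2], R[1, 1])
                pitch = np.arctan2(-R[2, 0], sy)
                yaw = 0.0
            return np.array([roll, pitch, yaw])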

    def rotate_points(points, center, euler_angles):
        # Rotate a set of 3D points around `center` by the given Euler angles
        roll, pitch, yaw = euler_angles
        R = EpipolarUtils.euler_to_rotation_matrix([roll, pitch, yaw])
        center = np.array(center).reshape(3, 1)
        rotated_points = []
        for point in points:
            point = np.array(point).reshape(3, 1)
            rotated_point = R @ (point - center) + center
            rotated_points.append(rotated_point.flatten())
        return np.array(rotated_points)
    
    def draw_euler_axes(img, origin, euler_angles, camera_matrix, distortion_coeffs, rvec, tvec, axis_length=0.2):
        # Build the three axis endpoints around `origin`, rotate them by the Euler
        # angles, then project them into the image with the camera parameters
        axes_points = np.array([
            origin,
            origin + np.array([axis_length, 0, 0]),
            origin + np.array([0, axis_length, 0]),
            origin + np.array([0, 0, axis_length])
        ], dtype=np.float32)
        rotated_axes_points = rotate_points(axes_points, origin, euler_angles)
        points_2D, _ = cv2.projectPoints(rotated_axes_points, rvec, tvec, camera_matrix, distortion_coeffs)
        points_2D = np.round(points_2D).astype(int).reshape(-1, 2)
        axes_edges = [(0, 1), (0, 2), (0, 3)]
        axis_colors = [(0, 0, 255), (0, 255, 0), (255, 0, 0)]  # x: red, y: green, z: blue (BGR)
        for i, edge in enumerate(axes_edges):
            pt1, pt2 = points_2D[edge[0]], points_2D[edge[1]]
            cv2.line(img, tuple(pt1), tuple(pt2), axis_colors[i], 2)
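
For completeness, this is roughly how I call these functions for one camera (the calibration values below are placeholders; in my actual code `camera_matrix`, `dist_coeffs`, `rvec` and `tvec` come from OpenCV calibration and `T` is the corresponding 4x4 camera-to-world matrix):

    import cv2
    import numpy as np

    # Placeholder calibration for one camera
    camera_matrix = np.array([[800.0, 0.0, 640.0],
                              [0.0, 800.0, 360.0],
                              [0.0, 0.0, 1.0]])
    dist_coeffs = np.zeros(5)
    rvec = np.zeros(3)                    # world -> camera rotation (Rodrigues vector)
    tvec = np.array([0.0, 0.0, 2.0])      # world -> camera translation (metres)

    # 4x4 camera-to-world transform built from the extrinsics above
    R_cam, _ = cv2.Rodrigues(rvec)
    T = np.eye(4)
    T[:3, :3] = R_cam.T
    T[:3, 3] = (-R_cam.T @ tvec.reshape(3, 1)).ravel()

    img = np.zeros((720, 1280, 3), dtype=np.uint8)    # stand-in for the camera frame
    head_pos_world = np.array([0.0, 0.0, 0.5])        # triangulated head position (placeholder)
    euler_cam = np.radians([5.0, -10.0, 20.0])        # roll, pitch, yaw from WHENet (placeholder)

    euler_world = eulerAnglesToExtRef(euler_cam, T)
    draw_euler_axes(img, head_pos_world, euler_world, camera_matrix, dist_coeffs, rvec, tvec)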
  • Hopefully I'm not stating the obvious, but you would probably need some image registration technique that aligns the two cameras together. One approach to consider is looking at how matching features are oriented with respect to your cameras and deriving a transformation matrix from that. Another way which I've personally used is `cv2.findChessboardCorners()`, where you provide a chessboard-type grid in one set of images for calibration and the actual content in another. – youngson May 08 '23 at 09:18
  • @syoung you are right, this is precisely what I mean when I say that I have calibrated my cameras – dcoccjcz May 08 '23 at 10:38
