
Unable to get object pose and draw axes with 4 markers

I am trying to get the object pose by following This tutorial for pose estimation. In the video the author uses a chessboard pattern (24×17) and mentions in the comments that any object with detectable markers can be used to estimate the pose. I am using this Object with only 4 circular markers. I am able to detect the markers and get the (x, y) points (image points) and object points with an arbitrary reference, and I have my camera calibrated (camera matrix and distortion coefficients). Following the tutorial, I am unable to draw the object frame. This is what I have been able to do so far.

# (x,y) points of detected markers; another function processes the image and returns the 4 points
Points = [(x1,y1),(x2,y2),(x3,y3),(x4,y4)]

image_points  = np.array([
        Points[0],
        Points[1],
        Points[2],
        Points[3]
    ],dtype=np.float32)

image_points = np.ascontiguousarray(image_points).reshape((4,1,2))
    
object_points  = np.array([
        (80,56,0),
        (80,72,0),
        (57,72,0),
        (57,88,0)],
        dtype=np.float32).reshape((4,1,3)) #Set Z as 0

axis = np.float32([[5,0,0], [0,5,0], [0,0,-5]]).reshape(-1,3)

imgpts, jac = cv2.projectPoints(axis, rotation_vector, translation_vector, mtx, dist)

What am I missing?

This is what I am trying to achieve: Goal

This is the current result: Current

The camera distance from the object is fixed. I need to track translation and rotation in x and y.

EDIT:

Sample Image markings

Updated Object

Updated Result

Pixel values of the markers, from top-left to bottom-right:

Point_A = (1081, 544) 
Point_B = (1090, 782) 
Point_C = (824, 785)  #Preferred Origin Point
Point_D = (826, 1050) 

Camera Parameters

mtx: [[2.34613584e+03 0.00000000e+00 1.24680613e+03]
      [0.00000000e+00 2.34637787e+03 1.11379469e+03]
      [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
dist: 
     [[-0.05595266  0.07570472  0.00200983  0.00073797 -0.30768105]]

Python Code

AqibF
Your object points don't make much sense. If the markers are what I expect them to be, your object points should be a multiple of (0,-1,0),(0,0,0),(1,0,0),(1,1,0). Can you explain why you have chosen the values as they are, and can you show the image points as well? – Micka Feb 06 '23 at 16:50
@Micka Here are the detected [ImagePoints](https://i.imgur.com/fTpGynq.png). For object points I kept a world frame with Z as 0 for the points [Like this](https://i.imgur.com/Vyfn6g1.jpg) and printed it on A4 paper. I assume this is not how you get the object points – AqibF Feb 06 '23 at 21:13
Your drawing is OK, but your ObjectPoint values differ from the values in your drawing. If you want one of the points to be (0,0,0) you have to subtract its previous values from all the points. – Micka Feb 06 '23 at 21:35
In addition, your markers all look the same, so it might be possible that the ordering of object points and image points differs. Try to use markers that can be uniquely identified in the image/detection. – Micka Feb 06 '23 at 21:36
@Micka So I made some changes to the markers; the shape is the same but with different diameters ([Updated](https://i.imgur.com/sa81m2F.png)), so now they can be individually identified by their diameter. I kept the smallest one as the origin. Can you explain the subtraction part? – AqibF Feb 07 '23 at 09:50
Let's call the markers A,B,C,D from top-right to bottom-left. If you want marker C to be (0,0,0), your new object coordinates are A'=A-C; B'=B-C; C'=C-C=(0,0,0); D'=D-C. So in your case A'=(23,-16,0); B'=(23,0,0); C'=(0,0,0); D'=(0,16,0). This is a shift of the local coordinate system, and then its origin will be at marker C. Also make sure that the ordering of image points and object points is identical, so that imagePoints[0] is the image position of objectPoints[0] etc. Then solvePnP can and should work. – Micka Feb 07 '23 at 09:59
  • btw, in your code, the object points are: `(50,53,0), (49,59,0), (0,0,0), (41,66,0)` can you explain why you've chosen them like that? The values completely differ from your drawing... – Micka Feb 07 '23 at 10:02
Before I printed on the paper, I just used some arbitrary values. I have updated the code since then. – AqibF Feb 07 '23 at 10:38
The result image is from your old version? If not, the origin of the axis doesn't make sense. Can you print and add the detected markers' image pixel positions to your posting, in the ordering of the list? – Micka Feb 07 '23 at 11:15
@Micka I have updated my post and added the updated image and markers with pixel values. I apologise for the confusion; this is the first time I'm posting on Stack Overflow. :) – AqibF Feb 07 '23 at 12:21
In the `image_points` array, are the detected markers in the same ordering? So `image_points[0] = Point_A`, `image_points[1] = Point_B`, etc.? That's very important for solvePnP, because image points and object points must match on their indices! – Micka Feb 07 '23 at 12:58
Yes, I double-checked. The points match with the indices – AqibF Feb 07 '23 at 13:16
Can you share your camera matrix and distortion coefficients? Is the "Sample image markings" the actual image? The image points' pixel positions don't match that image. I tested a bit, and inaccuracies affect solvePnP results quite strongly. – Micka Feb 07 '23 at 13:32
Thanks for providing a C++ solution. I have put the camera matrix in the post. Going through the code, I think my problem is with the camera parameters. – AqibF Feb 07 '23 at 15:39
  • @Micka Yes "Sample Images Markings" is the actual image taken from the camera. – AqibF Feb 07 '23 at 16:20
  • how did you compute (and verify/test) the camera parameters? – Micka Feb 07 '23 at 17:26
I used [this](https://gist.github.com/AqibFarooq/368d462570e1e142ee8d7c4866a059d9) to compute the camera parameters and validated afterwards by calculating the reprojection error (total error after calibration is 0.11154298464651628) – AqibF Feb 07 '23 at 20:00
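The coordinate shift described in the comments above can be sketched in Python (object points from the question; marker C becomes the origin):

```python
import numpy as np

# Object points as drawn (mm), Z = 0
obj = np.array([(80, 56, 0),    # A
                (80, 72, 0),    # B
                (57, 72, 0),    # C  <- preferred origin
                (57, 88, 0)],   # D
               dtype=np.float32)

# Subtract C from every point so the local origin sits at marker C
obj_shifted = obj - obj[2]
# A' = (23, -16, 0), B' = (23, 0, 0), C' = (0, 0, 0), D' = (0, 16, 0)
```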

1 Answer


Here's an example in C++ with your image and object, but I extracted the image points again (because they didn't fit the provided values) and I used a pinhole camera (no distortion). Results should be similar or better if you use the actual camera parameters.

int main()
{
    cv::Mat img = cv::imread("C:/data/StackOverflow/solvePnp.png");

    // assuming a pinhole camera, because the actual intrinsics/distortion is unknown.
    cv::Mat intrinsics = cv::Mat::eye(3,3,CV_64FC1);
    intrinsics.at<double>(0, 2) = img.cols / 2.0;
    intrinsics.at<double>(1, 2) = img.rows / 2.0;
    intrinsics.at<double>(0, 0) = 1000;
    intrinsics.at<double>(1, 1) = 1000;
    
    // assumed: no distortion.
    std::vector<double> distCoeffs;
    distCoeffs.resize(4, 0);

    std::vector<cv::Point3f> objPoints;
    // object points as provided in the question
    objPoints.push_back({80.0f,56.0f,0.0f});
    objPoints.push_back({ 80.0f,72.0f, 0.0f });
    objPoints.push_back({ 57.0f,72.0f, 0.0f });
    objPoints.push_back({ 57.0f,88.0f, 0.0f });
    
    // we want the third point to be the origin of the object, so we have to shift the coordinate system:
    cv::Point3f origin = objPoints[2];

    for (int i = 0; i < objPoints.size(); ++i)
    {
        objPoints[i] = objPoints[i] - origin;
    }

    std::vector<cv::Point2f> imgPoints;

    /*
    // WRONG PROVIDED VALUES!
    imgPoints.push_back({ 1081, 544 });
    imgPoints.push_back({ 1090, 782 });
    imgPoints.push_back({ 824, 785 });
    imgPoints.push_back({ 826, 1050 });
    */

    // image points read from the image
    imgPoints.push_back({ 1123, 558 });
    imgPoints.push_back({ 1132, 814 });
    imgPoints.push_back({ 851, 818 });
    imgPoints.push_back({ 852, 1097 });


    cv::Vec3d rot;
    cv::Vec3d trans;
    cv::solvePnP(objPoints, imgPoints, intrinsics, distCoeffs, rot, trans);

    std::vector<cv::Point3f> axis;
    axis.push_back({ 0,0,0 });
    axis.push_back({ 10,0,0 });
    axis.push_back({ 0,10,0 });
    axis.push_back({ 0,0,10 });

    std::vector<cv::Point2f> axisImg;

    cv::projectPoints(axis, rot, trans, intrinsics, distCoeffs, axisImg);

    cv::line(img, axisImg[0], axisImg[1], cv::Scalar(0, 0, 255),5);
    cv::line(img, axisImg[0], axisImg[2], cv::Scalar(0, 255, 0),5);
    cv::line(img, axisImg[0], axisImg[3], cv::Scalar(255, 0, 0),5);

    std::cout << axisImg[0] << std::endl;
    std::cout << axisImg[1] << std::endl;
    std::cout << axisImg[2] << std::endl;
    std::cout << axisImg[3] << std::endl;

    for (int i = 0; i < imgPoints.size(); ++i)
    {
        cv::circle(img, imgPoints[i], 5, cv::Scalar(i * 255, (i == 0) ? 255 : 0, i * 50));
    }

    cv::imwrite("C:/data/StackOverflow/solvePnp_RESULT.png", img);
    cv::resize(img, img, cv::Size(), 0.25, 0.25);
    cv::imshow("img", img);

    cv::waitKey(0);
}

red = X, green = Y, blue = Z

(result image)

Micka
  • I have marked it as the best answer. I fixed some mistakes, and after using the pinhole camera parameters I get these results. [Point A](https://i.imgur.com/4DW2GRM.png) [Point B](https://i.imgur.com/uIvECNj.png) [Point C](https://i.imgur.com/T4ainID.png) [Point D](https://i.imgur.com/h2pu7OZ.png) – AqibF Feb 07 '23 at 16:06