
I am using Python with OpenCV 3.4.

I have a system composed of 2 cameras that I want to use to track an object and get its trajectory, then its speed.

I am currently able to calibrate each of my cameras both intrinsically and extrinsically. I can track my object through the video and get its 2D coordinates in the image plane.

My problem now is that I would like to project the points from both 2D image planes into 3D points. I've tried functions such as `triangulatePoints`, but it doesn't seem to work properly. Here is my current function to get 3D coordinates. It returns coordinates that seem a little off compared to the actual ones:

import cv2
import numpy as np

def get_3d_coord(left_two_d_coords, right_two_d_coords):

    # reshape the tracked 2D points to (N, 1, 2) as OpenCV expects
    pt1 = left_two_d_coords.reshape((len(left_two_d_coords), 1, 2))
    pt2 = right_two_d_coords.reshape((len(right_two_d_coords), 1, 2))

    # load the calibration results of each camera
    extrinsic_left_camera_matrix, left_distortion_coeffs, extrinsic_left_rotation_vector, \
        extrinsic_left_translation_vector = trajectory_utils.get_extrinsic_parameters(1)

    extrinsic_right_camera_matrix, right_distortion_coeffs, extrinsic_right_rotation_vector, \
        extrinsic_right_translation_vector = trajectory_utils.get_extrinsic_parameters(2)

    # returns arrays of the same size (keeps only matching point pairs)
    (pt1, pt2) = correspondingPoints(pt1, pt2)

    # build the 3x4 projection matrix of each camera
    projection1 = computeProjMat(extrinsic_left_camera_matrix,
                                 extrinsic_left_rotation_vector, extrinsic_left_translation_vector)
    projection2 = computeProjMat(extrinsic_right_camera_matrix,
                                 extrinsic_right_rotation_vector, extrinsic_right_translation_vector)

    # triangulate: out is a 4xN array of homogeneous coordinates
    out = cv2.triangulatePoints(projection1, projection2, pt1, pt2)

    # convert from homogeneous to Euclidean coordinates by dividing by W
    point3D = []
    for idx in range(out.shape[1]):
        W = out[3][idx]
        pt3d = [out[0][idx] / W, out[1][idx] / W, out[2][idx] / W]
        point3D.append(pt3d)

    return point3D
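
`correspondingPoints` and `computeProjMat` are small helpers that are not shown here. For reference, a minimal sketch of what `computeProjMat` is expected to compute (the 3x4 projection matrix P = K [R|t] from the rotation and translation vectors; the actual implementation may differ):

import cv2
import numpy as np

def computeProjMat(camera_matrix, rotation_vector, translation_vector):
    # turn the Rodrigues rotation vector into a 3x3 rotation matrix
    rotation_matrix, _ = cv2.Rodrigues(np.asarray(rotation_vector, dtype=np.float64).reshape(3, 1))
    # stack [R | t] into a 3x4 matrix and pre-multiply by the intrinsics
    extrinsic = np.hstack((rotation_matrix, np.asarray(translation_vector, dtype=np.float64).reshape(3, 1)))
    return np.asarray(camera_matrix, dtype=np.float64) @ extrinsic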

Here are some screenshots of the 2D trajectory that I get for both my cameras: 2d trajectory for the 1st camera, 2d trajectory for the 2nd camera

Here are some screenshots of the 3D trajectory that we get for the same cameras: 3d trajectory for the 1st camera, 3d trajectory for the 2nd camera

As you can see, the 2D trajectory doesn't look like the 3D one, and I am not able to get an accurate distance between two points. I would just like to get real-world coordinates, i.e. to know the (almost) exact distance walked by a person, even along a curved path.
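
For context, once the triangulated 3D points are correct, this is roughly how the walked distance and speed would be derived from them (a minimal sketch; it assumes one 3D point per synchronized frame, coordinates in metres, and the frame rate value is only a placeholder):

import numpy as np

def path_length_and_speed(points_3d, fps=30.0):
    # points_3d: list of [x, y, z] points, one per synchronized frame
    pts = np.asarray(points_3d, dtype=np.float64)
    # sum of Euclidean distances between consecutive points
    segment_lengths = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    total_distance = segment_lengths.sum()
    elapsed_time = (len(pts) - 1) / fps
    average_speed = total_distance / elapsed_time if elapsed_time > 0 else 0.0
    return total_distance, average_speed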

EDIT to add reference data and examples

Here are some examples and input data to reproduce the problem. First, some data. 2D points for camera1

546,357 
646,351 
767,357 
879,353 
986,360 
1079,365
1152,364

corresponding 2D points for camera2

236,305
313,302
414,308
532,308
647,314
752,320
851,323

3D points that we get from triangulatePoints

"[0.15245444, 0.30141047, 0.5444277]"
"[0.33479974, 0.6477136, 0.25396818]"
"[0.6559921, 1.0416716, -0.2717265]"
"[1.1381898, 1.5703914, -0.87318224]"
"[1.7568599, 1.9649554, -1.5008119]"
"[2.406788, 2.302272, -2.0778883]"
"[3.078426, 2.6655817, -2.6113863]"

In the following images, we can see the 2D trajectory (top line) and the 3D points reprojected into 2D (bottom line). Colors alternate to show which 3D points correspond to which 2D points.

camera1 camera2
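
For the reprojection shown on the bottom lines, the 3D points are multiplied by the camera's projection matrix and the resulting x and y are divided by z. A minimal sketch of the equivalent operation using `cv2.projectPoints` (the helper name and variables here are only illustrative):

import cv2
import numpy as np

def reproject_to_image(points_3d, rvec, tvec, camera_matrix, dist_coeffs):
    # project the triangulated 3D points back into one camera's image plane
    pts = np.asarray(points_3d, dtype=np.float64).reshape(-1, 1, 3)
    image_points, _ = cv2.projectPoints(pts, rvec, tvec, camera_matrix, dist_coeffs)
    return image_points.reshape(-1, 2)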

And finally, here are some data to reproduce the problem.

camera 1 : camera matrix

5.462001610064596662e+02 0.000000000000000000e+00 6.382260289544193483e+02
0.000000000000000000e+00 5.195528638702176067e+02 3.722480290221320161e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00

camera 2 : camera matrix

4.302353276501239066e+02 0.000000000000000000e+00 6.442674231451971991e+02
0.000000000000000000e+00 4.064124751062329324e+02 3.730721752718034736e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00

camera 1 : distortion vector

-1.039009381799949928e-02 -6.875769941694849507e-02 5.573643708806085006e-02 -7.298826373638074051e-04 2.195279856716004369e-02

camera 2 : distortion vector

-8.089289768586239993e-02 6.376634681503455396e-04 2.803641672679824115e-02 7.852965318823987989e-03 1.390248981867302919e-03

camera 1 : rotation vector

1.643658457134109296e+00
-9.626823326237364531e-02
1.019865700311696488e-01

camera 2 : rotation vector

1.698451227150894471e+00
-4.734769748661146055e-02
5.868343803315514279e-02

camera 1 : translation vector

-5.004031689969588026e-01
9.358682517577661120e-01
2.317689087311113116e+00

camera 2 : translation vector

-4.225788801112133619e+00
9.519952012307866251e-01
2.419197507326224184e+00
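
With the camera matrices, rotation vectors and translation vectors above, the triangulation step can be reproduced on the seven 2D point pairs listed earlier. Here is a minimal sketch (distortion is ignored and the projection matrices are simply assembled as K [R|t], so the numbers may differ slightly from my pipeline):

import cv2
import numpy as np

K1 = np.array([[546.2001610064597, 0.0, 638.2260289544193],
               [0.0, 519.5528638702176, 372.2480290221320],
               [0.0, 0.0, 1.0]])
K2 = np.array([[430.2353276501239, 0.0, 644.2674231451972],
               [0.0, 406.4124751062329, 373.0721752718035],
               [0.0, 0.0, 1.0]])
rvec1 = np.array([1.6436584571341093, -0.0962682332623736, 0.1019865700311696])
tvec1 = np.array([-0.5004031689969588, 0.9358682517577661, 2.3176890873111131])
rvec2 = np.array([1.6984512271508945, -0.0473476974866115, 0.0586834380331551])
tvec2 = np.array([-4.2257888011121336, 0.9519952012307866, 2.4191975073262242])

def proj_mat(K, rvec, tvec):
    R, _ = cv2.Rodrigues(rvec.reshape(3, 1))
    return K @ np.hstack((R, tvec.reshape(3, 1)))

P1 = proj_mat(K1, rvec1, tvec1)
P2 = proj_mat(K2, rvec2, tvec2)

# the seven tracked 2D point pairs from above, as 2xN float arrays
pts1 = np.array([[546, 357], [646, 351], [767, 357], [879, 353],
                 [986, 360], [1079, 365], [1152, 364]], dtype=np.float64).T
pts2 = np.array([[236, 305], [313, 302], [414, 308], [532, 308],
                 [647, 314], [752, 320], [851, 323]], dtype=np.float64).T

out = cv2.triangulatePoints(P1, P2, pts1, pts2)
points_3d = (out[:3] / out[3]).T
print(points_3d)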

camera 1 : object points

0 0 0   
0 3 0   
0.5 0 0 
0.5 3 0 
1 0 0   
1 3 0   
1.5 0 0 
1.5 3 0 
2 0 0   
2 3 0  

camera 2 : object points

4 0 0   
4 3 0   
4.5 0 0 
4.5 3 0 
5 0 0   
5 3 0   
5.5 0 0 
5.5 3 0 
6 0 0   
6 3 0  

camera 1 : image points

5.180000000000000000e+02 5.920000000000000000e+02
5.480000000000000000e+02 4.410000000000000000e+02
6.360000000000000000e+02 5.910000000000000000e+02
6.020000000000000000e+02 4.420000000000000000e+02
7.520000000000000000e+02 5.860000000000000000e+02
6.500000000000000000e+02 4.430000000000000000e+02
8.620000000000000000e+02 5.770000000000000000e+02
7.000000000000000000e+02 4.430000000000000000e+02
9.600000000000000000e+02 5.670000000000000000e+02
7.460000000000000000e+02 4.430000000000000000e+02

camera 2 : image points

6.080000000000000000e+02 5.210000000000000000e+02
6.080000000000000000e+02 4.130000000000000000e+02
7.020000000000000000e+02 5.250000000000000000e+02
6.560000000000000000e+02 4.140000000000000000e+02
7.650000000000000000e+02 5.210000000000000000e+02
6.840000000000000000e+02 4.150000000000000000e+02
8.400000000000000000e+02 5.190000000000000000e+02
7.260000000000000000e+02 4.160000000000000000e+02
9.120000000000000000e+02 5.140000000000000000e+02
7.600000000000000000e+02 4.170000000000000000e+02
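
The extrinsic parameters above can be recomputed from these object/image points and the intrinsic calibration. Here is a minimal sketch for camera 1 (I use `calibrateCamera` with a chessboard for the intrinsics and `solvePnPRansac` for the extrinsics; the values obtained this way may differ slightly from those posted above):

import cv2
import numpy as np

# camera 1 object points (metres) and image points (pixels) from above
obj_points = np.array([[0, 0, 0], [0, 3, 0], [0.5, 0, 0], [0.5, 3, 0], [1, 0, 0],
                       [1, 3, 0], [1.5, 0, 0], [1.5, 3, 0], [2, 0, 0], [2, 3, 0]],
                      dtype=np.float64)
img_points = np.array([[518, 592], [548, 441], [636, 591], [602, 442], [752, 586],
                       [650, 443], [862, 577], [700, 443], [960, 567], [746, 443]],
                      dtype=np.float64)

camera_matrix = np.array([[546.2001610064597, 0.0, 638.2260289544193],
                          [0.0, 519.5528638702176, 372.2480290221320],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.array([-0.0103900938179995, -0.0687576994169485,
                        0.0557364370880609, -0.0007298826373638, 0.0219527985671600])

retval, rvec, tvec, inliers = cv2.solvePnPRansac(obj_points, img_points,
                                                 camera_matrix, dist_coeffs)
print(rvec.ravel())
print(tvec.ravel())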
Q. Eude
  • Please provide your test samples (images) and expected / returned results if possible. – Oliort UA May 02 '19 at 13:20
  • @Oliort I added some images to illustrate my problem, tell me if you need more. – Q. Eude May 02 '19 at 14:56
  • Please describe how you draw the 3D trajectory on the 2D images you added. Do you project the received 3D points, using the same projection parameters, right back onto the 2D images you calculated them from? Are the point set sizes equal for both images? It seems that on the second image the trajectory is longer, so does some part of the second image's trajectory consist of points that have no corresponding positions on the first image? – Oliort UA May 02 '19 at 16:34
  • It seems to me that the left part of the projected 3D trajectory (if that is how it is done) is quite nice for both 3D results, isn't it? – Oliort UA May 02 '19 at 16:40
  • @Oliort The first two images are simply the 2D points we collect, drawn on the picture. The last 2 images are the projections of the 3D points. To reproject our 3D points, we multiply them by the camera's projection matrix, and then divide the resulting x and y by z. You are right, our first set of points looks correct, however it is not the case all the time. I have doubts about the reliability of our solution. What is your opinion on it? How would you go about solving this problem? – Q. Eude May 03 '19 at 08:05
  • It seems to me that some points on the second image overflow the first image's view area. I think that may be the reason for incorrect results. Check the point pairs for conformity on both images and use only points which are inside the overlapping view area of both images. If this is not the case, could you please attach the incorrect results you get for a single pair of points (instead of a whole trajectory)? – Oliort UA May 05 '19 at 12:21

1 Answer


Assuming both of your resolutions are 1280x720, I calculated the left camera's rotation and translation.

import cv2
import numpy as np

left_obj = np.array([[
        [0, 0, 0],   
        [0, 3, 0],   
        [0.5, 0, 0], 
        [0.5, 3, 0], 
        [1, 0, 0],  
        [1 ,3, 0], 
        [1.5, 0, 0], 
        [1.5, 3, 0], 
        [2, 0, 0],   
        [2, 3, 0] 
    ]], dtype=np.float32)

left_img = np.array([[
        [5.180000000000000000e+02, 5.920000000000000000e+02],
        [5.480000000000000000e+02, 4.410000000000000000e+02],
        [6.360000000000000000e+02, 5.910000000000000000e+02],
        [6.020000000000000000e+02, 4.420000000000000000e+02],
        [7.520000000000000000e+02, 5.860000000000000000e+02],
        [6.500000000000000000e+02, 4.430000000000000000e+02],
        [8.620000000000000000e+02, 5.770000000000000000e+02],
        [7.000000000000000000e+02, 4.430000000000000000e+02],
        [9.600000000000000000e+02, 5.670000000000000000e+02],
        [7.460000000000000000e+02, 4.430000000000000000e+02]
    ]], dtype=np.float32)
    
left_camera_matrix = np.array([
        [4.777926320579549042e+02, 0.000000000000000000e+00, 5.609694925007885331e+02],
        [0.000000000000000000e+00, 2.687583555325996372e+02, 5.712247987054799978e+02],
        [0.000000000000000000e+00, 0.000000000000000000e+00, 1.000000000000000000e+00]
    ])

    
left_distortion_coeffs = np.array([
        -8.332059138465927606e-02,
        -1.402986394998156472e+00,
        2.843132503678651168e-02, 
        7.633417606366312003e-02, 
        1.191317644548635979e+00
    ])

ret, left_camera_matrix, left_distortion_coeffs, rot, trans = cv2.calibrateCamera(left_obj, left_img, (1280, 720),
            left_camera_matrix, left_distortion_coeffs, None, None, cv2.CALIB_USE_INTRINSIC_GUESS)
print(rot[0])
print(trans[0])

I got different results:

[[ 2.7262137 ] [-0.19060341] [-0.30345874]]

[[-0.48068581] [ 0.75257108] [ 1.80413094]]

The same for the right camera:

[[ 2.1952522 ] [ 0.20281459] [-0.46649734]]

[[-2.96484428] [-0.0906817 ] [ 3.84203022]]

You can check the rotations approximately this way: compute the relative rotation between the two computed results and compare it against the relative rotation between the real camera positions. For the translations: compute the normalized relative translation vector between the computed results and compare it against the normalized relative translation between the real camera positions. The coordinate system OpenCV uses is depicted here.
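
A minimal sketch of such a check (assuming `rvec1`/`tvec1` and `rvec2`/`tvec2` hold the computed extrinsics of the left and right cameras; the function name and variables are only illustrative):

import cv2
import numpy as np

def relative_pose(rvec1, tvec1, rvec2, tvec2):
    R1, _ = cv2.Rodrigues(np.asarray(rvec1, dtype=np.float64).reshape(3, 1))
    R2, _ = cv2.Rodrigues(np.asarray(rvec2, dtype=np.float64).reshape(3, 1))

    # relative rotation from camera 1 to camera 2; its angle (in radians)
    # can be compared with the real relative orientation of the cameras
    R_rel = R2 @ R1.T
    relative_angle = np.linalg.norm(cv2.Rodrigues(R_rel)[0])

    # translation of the camera-1-to-camera-2 transform, normalized;
    # its direction can be compared with the real (normalized) baseline
    t_rel = np.asarray(tvec2, dtype=np.float64).reshape(3, 1) \
        - R_rel @ np.asarray(tvec1, dtype=np.float64).reshape(3, 1)
    t_rel_normalized = t_rel / np.linalg.norm(t_rel)

    return relative_angle, t_rel_normalized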

Oliort UA
  • We are currently using that process: every point we take for the 3D projection has a corresponding point in both cameras. We basically synchronize both videos, then pair the 2D points that are at the same frame number. Right after that we apply `triangulatePoints` to both our 2D trajectories. – Q. Eude May 09 '19 at 07:48
  • @Q.Eude Quoting from my other comment: "Could you please attach incorrect results you get for a single pair of points (instead of a whole trajectory), if this is not the case". Please make the trajectory sparser (lower point density, fewer points) and paint the points with different colors (same point in time, same color on both images; different time, different point color). I am not able to reproduce your results, so I need more detailed data to be able to help. – Oliort UA May 09 '19 at 12:18
  • I've made some changes; you can find some input data and a bit more explanation. – Q. Eude May 10 '19 at 09:36
  • I did calculate my intrinsic matrix with a chessboard for each camera. My image points and object points are only used for the extrinsic calibration. – Q. Eude May 13 '19 at 09:55
  • How did you calculate these? I've tried using both `calibrateCamera` and `solvePnPRansac`. And is there a way to check whether these matrices are correct? – Q. Eude May 13 '19 at 12:12
  • @Q.Eude look at the updated answer please. I used `calibrateCamera` (as in my code). – Oliort UA May 13 '19 at 12:28
  • Look at the update, I've made some changes to the camera matrix and rotation matrix; it may have been a mistake. I use `calibrateCamera` with a chessboard to get the intrinsic parameters, then `solvePnPRansac` to get the extrinsic parameters. – Q. Eude May 13 '19 at 14:53