
Hello, I would like to find the 3D position of an object given two different views of it.

Things that I can provide here are:

  • I can calculate the intrinsic matrix of each camera.
  • I know the 2D coordinates of the object's center in each image (see below).
  • I can provide bounding boxes of the object.

Things that I cannot provide here are:

  • 3D position or relative position of the 2 cameras.
  • 3D position of the object.
  • Measurements of the object.

These are the methods I may be able to use to obtain the center coordinates in each camera image and the intrinsic parameters.

# This function uses a custom-trained Faster R-CNN model to detect the object;
# the center of the object is computed from its bounding box.
# For simplicity the centers are hardcoded, since the object won't move.
def calculateCenterAndBoundingBox(image):
    ...
    boundingBox1 = [(715.329, 383.64413), (746.09143, 402.87524)]
    boundingBox2 = [(303.78778, 391.57953), (339.4821, 412.69092)]
    if image == 1:
        return (730.7102, 393.2597), boundingBox1
    else:
        return (321.63495, 402.13522), boundingBox2

# For simplicity, both cameras share the same intrinsic matrix
def calculateIntrinsic():
    ...
    return [[512, 0.0,         512],
            [0.0, 483.0443151, 364],
            [0.0, 0.0,         1.0]]
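
With this intrinsic matrix I can at least back-project a pixel to a viewing ray in camera coordinates (a small numpy sketch; the ray fixes the direction but not the depth, which is why a single view alone cannot give me the 3D position):

```python
import numpy as np

# Same intrinsic matrix as calculateIntrinsic() above
K = np.array([[512.0, 0.0,         512.0],
              [0.0,   483.0443151, 364.0],
              [0.0,   0.0,         1.0]])

def pixelToRay(K, u, v):
    """Back-project pixel (u, v) to a unit viewing ray in camera coordinates."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return ray / np.linalg.norm(ray)

# Ray through the detected center in image 1
ray1 = pixelToRay(K, 730.7102, 393.2597)
```

Every 3D point of the form `depth * ray1` projects to the same pixel, so the object could sit anywhere along that ray.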

I tried to determine the position of my object with the 8-point algorithm, so I decided to create feature keypoints with SIFT using this implementation.

%matplotlib inline
import matplotlib.pylab as plt
import numpy as np
import pysift
import math
import cv2

def myPlot(img):
    plt.figure(figsize=(15,20)) # display the output image
    plt.imshow(img)
    plt.xticks([])
    plt.yticks([])
    plt.show()

pathToImage1 = "testImage1.png"
c1, bb1 = calculateCenterAndBoundingBox(1)
originalImage1 = cv2.imread(pathToImage1) 
img1 = cv2.imread(pathToImage1, 0)

originalImage1 = originalImage1[math.floor(bb1[0][1]):math.floor(bb1[1][1]), math.floor(bb1[0][0]):math.floor(bb1[1][0])]
img1 = img1[math.floor(bb1[0][1]):math.floor(bb1[1][1]), math.floor(bb1[0][0]):math.floor(bb1[1][0])]

keypoints1, descriptors1 = pysift.computeKeypointsAndDescriptors(img1)
img1 = cv2.drawKeypoints(img1, keypoints1, originalImage1)
myPlot(img1)


pathToImage2 = "testImage2.png"
c2, bb2 = calculateCenterAndBoundingBox(2)
originalImage2 = cv2.imread(pathToImage2) 
img2 = cv2.imread(pathToImage2 , 0) 

originalImage2 = originalImage2[math.floor(bb2[0][1]):math.floor(bb2[1][1]), math.floor(bb2[0][0]):math.floor(bb2[1][0])]
img2 = img2[math.floor(bb2[0][1]):math.floor(bb2[1][1]), math.floor(bb2[0][0]):math.floor(bb2[1][0])]

keypoints2, descriptors2 = pysift.computeKeypointsAndDescriptors(img2)
img2 = cv2.drawKeypoints(img2, keypoints2, originalImage2)
myPlot(img2)

However, I only got 1 feature keypoint instead of the 8 or more I would need.

So apparently I can't use the 8-point algorithm in this case, and I have no other ideas how to solve this problem given the constraints above.

Is it even possible to calculate the 3D position given only the 2D points and the intrinsic matrix of each camera?
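
My understanding is that, if the relative pose (R, t) between the cameras could somehow be recovered (e.g. from an essential matrix, which is what the matched keypoints were for), triangulation would then give the position, but only up to an unknown global scale. A numpy sketch of just the triangulation step, using a hypothetical relative pose that I made up for illustration:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two pixel observations."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]           # dehomogenize

K = np.array([[512.0, 0.0,         512.0],
              [0.0,   483.0443151, 364.0],
              [0.0,   0.0,         1.0]])

# Hypothetical relative pose: camera 2 shifted sideways by 1 unit.
# The "1" is arbitrary, which is exactly the scale ambiguity.
R = np.eye(3)
t = np.array([[-1.0], [0.0], [0.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])   # camera 1 at the origin
P2 = K @ np.hstack([R, t])

# Sanity check: project a known point into both cameras, then recover it
X_true = np.array([0.2, 0.1, 3.0])
h1 = P1 @ np.append(X_true, 1.0)
h2 = P2 @ np.append(X_true, 1.0)
x1 = h1[:2] / h1[2]
x2 = h2[:2] / h2[2]
X_est = triangulate(P1, P2, x1, x2)
```

Without some estimate of (R, t), or a known measurement of the object or the baseline to fix the scale, the two center points alone are not enough.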

Komanechi
  • Welcome. Please share the relevant code so people can help you with that. See also https://stackoverflow.com/help/minimal-reproducible-example – Ronald Jul 03 '20 at 13:34
  • I think try to simplify your question. "Why does SIFT only return one keypoint?". This may be more tractable for generating help. – gnodab Jul 03 '20 at 15:45
  • I think SIFT might not work because of the smooth surface of the object, but nevertheless my main problem is to locate the 3D position of my object; I only described my attempt to do so with SIFT. In other words, as long as someone knows a solution or has an idea how to locate the 3D position given the constraints, I will gratefully try it out. Of course, if someone can identify the problem with SIFT and provide a good solution, that also works. But since it isn't my implementation of SIFT, I don't really know what I could change. – Komanechi Jul 03 '20 at 16:33

0 Answers