
I am using a standard 640x480 webcam and have performed camera calibration with OpenCV in Python 3. This is the code I am using. It works and successfully gives me the camera matrix and distortion coefficients. Now, how can I find how many millimeters there are in 640 pixels of my scene image? I have mounted the webcam horizontally above a table on which a robotic arm is placed. Using the camera I find the centroid of an object. My goal is to use the camera matrix to convert that object's location (e.g. 300x200 pixels) to millimeters, so that I can pass the millimeter coordinates to the robotic arm to pick up the object. I have searched but have not found any relevant information. Is there an equation or method for this? Thanks a lot!

import numpy as np
import cv2
import yaml
import os

# Parameters
#TODO : Read from file
n_row=4  #Checkerboard Rows
n_col=6  #Checkerboard Columns
n_min_img = 10 # number of images needed for calibration
square_size = 40  # size of each individual box on Checkerboard in mm  
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001) # termination criteria
corner_accuracy = (11,11)
result_file = "./calibration.yaml"  # Output file having camera matrix

# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(n_row-1,n_col-1,0)
objp = np.zeros((n_row*n_col,3), np.float32)
objp[:,:2] = np.mgrid[0:n_row,0:n_col].T.reshape(-1,2) * square_size

# Initialize camera and window
camera = cv2.VideoCapture(0) #Supposed to be the only camera
if not camera.isOpened():
    print("Camera not found!")
    quit()
width = int(camera.get(cv2.CAP_PROP_FRAME_WIDTH))  
height = int(camera.get(cv2.CAP_PROP_FRAME_HEIGHT))
cv2.namedWindow("Calibration")


# Usage
def usage():
    print("Press on displayed window : \n")
    print("[space]     : take picture")
    print("[c]         : compute calibration")
    print("[r]         : reset program")
    print("[ESC]    : quit")

usage()
Initialization = True

while True:    
    if Initialization:
        print("Initialize data structures ..")
        objpoints = [] # 3d point in real world space
        imgpoints = [] # 2d points in image plane.
        n_img = 0
        Initialization = False
        tot_error=0
    
    # Read from camera and display on window
    ret, img = camera.read()
    if not ret:
        print("Cannot read camera frame, exit from program!")
        camera.release()
        cv2.destroyAllWindows()
        break
    cv2.imshow("Calibration", img)
    
    # Wait for instruction 
    k = cv2.waitKey(50) 
   
    # SPACE pressed to take picture
    if k%256 == 32:   
        print("Adding image for calibration...")
        imgGray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

        # Find the chess board corners
        ret, corners = cv2.findChessboardCorners(imgGray, (n_row,n_col),None)

        # If found, add object points, image points (after refining them)
        if not ret:
            print("Cannot found Chessboard corners!")
            
        else:
            print("Chessboard corners successfully found.")
            objpoints.append(objp)
            n_img +=1
            corners2 = cv2.cornerSubPix(imgGray,corners,corner_accuracy,(-1,-1),criteria)
            imgpoints.append(corners2)

            # Draw and display the corners
            imgAugmnt = cv2.drawChessboardCorners(img, (n_row,n_col), corners2,ret)
            cv2.imshow('Calibration',imgAugmnt) 
            cv2.waitKey(500)        
                
    # "c" pressed to compute calibration        
    elif k%256 == 99:        
        if n_img < n_min_img:
            print("Only", n_img, "images captured; at least", n_min_img, "are needed")
        
        else:
            print("Computing calibration ...")
            ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, (width,height),None,None)
            
            if not ret:
                print("Cannot compute calibration!")
            
            else:
                print("Camera calibration successfully computed")
                # Compute reprojection errors
                tot_error = 0
                for i in range(len(objpoints)):
                    imgpoints2, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
                    error = cv2.norm(imgpoints[i], imgpoints2, cv2.NORM_L2)/len(imgpoints2)
                    tot_error += error
                print("Camera matrix: ", mtx)
                print("Distortion coeffs: ", dist)
                print("Total error: ", tot_error)
                print("Mean error: ", tot_error/len(objpoints))
                
                # Saving calibration matrix
                try:
                    os.remove(result_file)  #Delete old file first
                except Exception as e:
                    #print(e)
                    pass
                print("Saving camera matrix .. in ",result_file)
                data={"camera_matrix": mtx.tolist(), "dist_coeff": dist.tolist()}
                with open(result_file, "w") as f:
                    yaml.dump(data, f, default_flow_style=False)
                
    # ESC pressed to quit
    elif k%256 == 27:
        print("Escape hit, closing...")
        camera.release()
        cv2.destroyAllWindows()
        break
    # "r" pressed to reset
    elif k%256 == 114:
        print("Reset program...")
        Initialization = True

This is the camera matrix:

818.6    0.0   324.4
  0.0  819.1   237.9
  0.0    0.0     1.0

Distortion coeffs:

0.34  -5.7  0  0  33.45
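
For later use, here is a minimal sketch of loading the saved calibration back and undistorting a frame; it assumes the calibration.yaml produced by the script above:

import numpy as np
import cv2
import yaml

# Load the camera matrix and distortion coefficients saved by the script
with open("./calibration.yaml") as f:
    data = yaml.safe_load(f)
mtx = np.array(data["camera_matrix"])
dist = np.array(data["dist_coeff"])

# Grab one frame and undistort it
camera = cv2.VideoCapture(0)
ret, img = camera.read()
camera.release()
if ret:
    h, w = img.shape[:2]
    # Refine the matrix for this frame size (alpha=1 keeps all source pixels)
    new_mtx, roi = cv2.getOptimalNewCameraMatrix(mtx, dist, (w, h), 1, (w, h))
    undistorted = cv2.undistort(img, mtx, dist, None, new_mtx)
    cv2.imwrite("undistorted.png", undistorted)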
  • You will need to know the dimensions of an object in your camera space, say a ruler or something – DrBwts Oct 23 '20 at 14:04
  • Yes, currently I am doing this by measuring, with a ruler or measuring tape, how many mm the image spans from one side to the other, then computing mm per pixel. But that is not an accurate method; I want to do it mathematically, without error. – Tehseen Oct 23 '20 at 15:15

1 Answer


I was actually thinking that you should be able to solve your problem in a simple way:

mm_per_pixel = real_mm_width / 640px

Assuming the camera stays parallel to the plane holding the object to pick [i.e. at a fixed distance], real_mm_width can be found by measuring the physical distance corresponding to those 640 pixels of your picture. For the sake of example, say you find real_mm_width = 32cm = 320mm, which gives mm_per_pixel = 0.5mm/px. With a fixed distance this ratio doesn't change.

This also seems to be the suggestion in the official documentation:

This consideration helps us to find only X,Y values. Now for X,Y values, we can simply pass the points as (0,0), (1,0), (2,0), ... which denotes the location of points. In this case, the results we get will be in the scale of size of chess board square. But if we know the square size, (say 30 mm), we can pass the values as (0,0), (30,0), (60,0), ... . Thus, we get the results in mm

Then you just need to convert the centroid coordinates in pixels [e.g. (pixel_x_centroid, pixel_y_centroid) = (300px, 200px)] to mm using:

mm_x_centroid = pixel_x_centroid * mm_per_pixel
mm_y_centroid = pixel_y_centroid * mm_per_pixel

which would give you the final answer:

(mm_x_centroid, mm_y_centroid) = (150mm, 100mm)
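
A minimal Python sketch of this conversion, using the assumed example values real_mm_width = 320mm and a centroid at (300, 200) pixels:

# Known/measured quantities (example values from above, assumed for illustration)
real_mm_width = 320.0    # physical width covered by the full image, measured once
image_width_px = 640.0   # horizontal resolution of the webcam

mm_per_pixel = real_mm_width / image_width_px   # 0.5 mm/px at this fixed distance

# Centroid found by the object-detection step, in image (u, v) pixels
pixel_x_centroid, pixel_y_centroid = 300, 200

# Convert to millimeters; note these are still measured from the image's
# top-left corner and may need a further offset into the robot's frame
mm_x_centroid = pixel_x_centroid * mm_per_pixel  # 150.0 mm
mm_y_centroid = pixel_y_centroid * mm_per_pixel  # 100.0 mm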

Another way to see the same thing is this proportion, where the first ratio is the measurable/known one:

real_mm_width : 640px = mm_x_centroid : pixel_x_centroid = mm_y_centroid : pixel_y_centroid
  • Thanks for your well-defined answer, I will try it. I found another solution here: https://stackoverflow.com/questions/12007775/to-calculate-world-coordinates-from-screen-coordinates-with-opencv Do you think the method in that link is good? – Tehseen Oct 24 '20 at 08:17
  • @Tehseen no worries! Even the solution you linked won't actually help you retrieve the mm values without physically measuring. All the calculations there are based on the theory brilliantly exposed in its first link (http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/OWENS/LECT9/node2.html). In my answer I assumed that the centroid coordinates `(300px, 200px)` represent values on the image plane in the `(u, v)` system [i.e. origin in the top-left corner of the image]. Is this true, or are your centroid coordinates values in the `(uc, vc)` = `(cx, cy)` [projection center] system? – Antonino Oct 24 '20 at 13:00
  • I am measuring the coordinates from the top-left corner of the image. – Tehseen Oct 24 '20 at 13:53
  • I came to know that using just intrinsic parameters I cannot find world coordinates from image coordinates unless I have depth information. But if I have both intrinsic and extrinsic parameters, then I have everything and can reproject from image coordinates to world coordinates without an external depth measurement. Is that true? If so, how can I do it? Thanks! Reference link: https://stackoverflow.com/a/10750648/3661547 – Tehseen Oct 24 '20 at 13:57
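
Regarding that last comment: yes, with both intrinsics and extrinsics a pixel can be back-projected onto a known plane. Here is a minimal sketch under the assumption that the table is the world plane Z = 0 and that rvec/tvec map table coordinates to camera coordinates (e.g. obtained from cv2.solvePnP with the chessboard lying flat on the table); the function name pixel_to_table_mm is illustrative:

import numpy as np
import cv2

def pixel_to_table_mm(u, v, mtx, dist, rvec, tvec):
    # Undistort and normalize the pixel: result is in ideal camera coordinates
    pt = cv2.undistortPoints(np.array([[[u, v]]], dtype=np.float64), mtx, dist)
    ray_cam = np.array([pt[0, 0, 0], pt[0, 0, 1], 1.0])  # viewing ray, camera frame

    R, _ = cv2.Rodrigues(rvec)
    cam_center_w = -R.T @ tvec.reshape(3)   # camera center in world (table) coords
    ray_w = R.T @ ray_cam                   # viewing ray in world coords

    # Intersect the viewing ray with the table plane Z = 0
    s = -cam_center_w[2] / ray_w[2]
    world = cam_center_w + s * ray_w
    return world[0], world[1]   # X, Y in the units used for the board (mm here)

With the board flat on the table, rvec and tvec could come from
_, rvec, tvec = cv2.solvePnP(objp, corners2, mtx, dist), and the returned X, Y
would then be in the same millimeter units as square_size.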