I am trying to compute Object Keypoint Similarity (OKS) to evaluate an algorithm's keypoint detection. Below is the code I've written, based on what I've found and understood from here:

oks formula

def oks(gt, preds, threshold, v, gt_area):
    ious = np.zeros((len(preds), len(gt)))
    sigmas = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72, .62,.62, 1.07, 1.07, .87, .87, .89, .89])/10.0
    vars = (sigmas*2)**2
    k = len(sigmas)

    xg = gt[:, 0]; yg = gt[:, 1]
    xp = preds[:, 0]; yp = preds[:, 1]
    vg = v + 1 # add one to visibility tags
    k1 = np.count_nonzero(vg > 0)
    dx = np.subtract(xg, xp)
    dy = np.subtract(yg, yp)

    e = (dx**2+dy**2)/vars/(gt_area+np.spacing(1))/2
    if threshold is not None:
        ind = list(vg > threshold)
        e = e[ind]
    ious = np.sum(np.exp(-e))/(1.5*e.shape[0]) if len(e) != 0 else 0
    return ious

where,

gt, preds are 17x2 NumPy arrays holding the 17 (x, y) human-pose keypoint coordinates for the ground truth and for the machine learning model's prediction, respectively,

threshold = 0.5 (the COCO dataset uses 0.5 as a soft threshold),

v = visibility of the ground-truth keypoints (17x1 NumPy array) with values 0 = visible and 1 = occluded (thus we do vg = v + 1 to comply with the OKS formula),

gt_area = area of the bounding box of the ground-truth person.

I was under the impression that OKS is supposed to yield a value for each keypoint, but the above code produces a single value for all keypoints combined. Am I doing something wrong here?

Christoph Rackwitz
krishna

1 Answer

As defined on the COCO Dataset Website - Evaluate section:

For each object, ground truth keypoints have the form [x1,y1,v1,...,xk,yk,vk], where x,y are the keypoint locations and v is a visibility flag defined as v=0: not labeled, v=1: labeled but not visible, and v=2: labeled and visible.
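To make the quoted layout concrete, here is a minimal sketch that splits a flat `[x1,y1,v1,...,xk,yk,vk]` list into coordinates and visibility flags. The helper name `split_coco_keypoints` and the sample values are made up for illustration; they are not part of any COCO tooling.

```python
import numpy as np


def split_coco_keypoints(flat):
    """Split [x1, y1, v1, ..., xk, yk, vk] into (k, 2) coordinates and (k,) flags."""
    # Hypothetical helper: each consecutive triple is one keypoint
    arr = np.asarray(flat, dtype=float).reshape(-1, 3)
    xy = arr[:, :2]
    v = arr[:, 2].astype(int)
    return xy, v


# Three made-up keypoints: visible (v=2), labeled but not visible (v=1), not labeled (v=0)
xy, v = split_coco_keypoints([10, 20, 2, 30, 40, 1, 0, 0, 0])
labeled = v > 0  # both v=1 and v=2 count as labeled
print(xy.shape, v.tolist(), labeled.tolist())
```

Keeping the flags as a flat `(k,)` array avoids broadcasting surprises when they are later combined with per-keypoint distances.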

The OKS metric is not computed per keypoint; it is a relative metric computed for each sample (a body, in the case of human pose estimation).

Object Keypoint Similarity metric

In OKS, the sum runs over all of a sample's keypoints, and the visibility flag acts as an indicator (a Dirac-style delta): \delta(v_i > 0) is 1 if the keypoint is labeled and 0 otherwise, regardless of whether a labeled keypoint is occluded.
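For reference, the definition can be written out as follows. Here d_i is the Euclidean distance between the predicted and ground-truth location of keypoint i, s is the object scale, k_i is a per-keypoint constant, and v_i is the ground-truth visibility flag; the symbol names follow the usual COCO notation.

```latex
\mathrm{OKS} =
  \frac{\sum_i \exp\!\left(-\dfrac{d_i^2}{2 s^2 k_i^2}\right)\,\delta(v_i > 0)}
       {\sum_i \delta(v_i > 0)}
```

Each exponential term lies between 0 and 1, so OKS is the average of those terms over the labeled keypoints.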

import numpy as np


def oks(y_true, y_pred, visibility):
    # You might want to define these constants
    # outside the function scope
    # Per-keypoint constants (placeholder: all ones)
    KAPPA = np.ones(len(y_true))
    # The object scale
    # You might need a dynamic value for the object scale
    SCALE = 1.0

    # Indicator delta(v_i > 0) as a flat 0/1 vector, so that a
    # (k, 1) visibility array yields a scalar result as well
    labeled = (np.asarray(visibility).reshape(-1) > 0).astype(int)
    # Compute the L2/Euclidean distance per keypoint
    distances = np.linalg.norm(y_pred - y_true, axis=-1)
    # Compute the exponential part of the equation
    exp_vector = np.exp(-(distances**2) / (2 * (SCALE**2) * (KAPPA**2)))
    # The numerator expression
    numerator = np.dot(exp_vector, labeled)
    # The denominator expression
    denominator = np.sum(labeled)
    # No labeled keypoints: return 0 instead of dividing by zero
    return numerator / denominator if denominator > 0 else 0.0


if __name__ == "__main__":
    IMAGE_SIZE_IN_PIXEL = 50
    gt = (np.random.random((17, 2)) * IMAGE_SIZE_IN_PIXEL).astype(int)
    pred = (np.random.random((17, 2)) * IMAGE_SIZE_IN_PIXEL).astype(int)
    visibility = (np.random.random((17, 1)) * 3).astype(int)

    # In this example the value is not meaningful,
    # since KAPPA and SCALE still need to be calibrated
    print("OKS", oks(gt, pred, visibility))
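If per-keypoint values are also wanted, the exponential terms can be kept before averaging. The sketch below assumes k_i = 2 * sigma_i and s^2 = object area, mirroring the `sigmas` array and the `vars` computation in the question's code; the function names `keypoint_similarities` and `oks_from_similarities` and the sample data are made up for illustration.

```python
import numpy as np

# Per-keypoint sigmas as listed in the question (17 body keypoints)
SIGMAS = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72,
                   .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0


def keypoint_similarities(gt_xy, pred_xy, area):
    """Per-keypoint terms exp(-d_i^2 / (2 * s^2 * k_i^2)).

    Assumption: k_i = 2 * sigma_i and s^2 = area, as in the question's code.
    """
    diff = np.asarray(pred_xy, dtype=float) - np.asarray(gt_xy, dtype=float)
    d2 = np.sum(diff ** 2, axis=-1)   # squared distance per keypoint
    k2 = (2.0 * SIGMAS) ** 2          # squared per-keypoint constants
    return np.exp(-d2 / (2.0 * (area + np.spacing(1)) * k2))


def oks_from_similarities(sims, v):
    """Average the per-keypoint terms over labeled keypoints (v > 0)."""
    labeled = np.asarray(v).reshape(-1) > 0
    return float(sims[labeled].mean()) if labeled.any() else 0.0


# Made-up data: a perfect prediction except for one keypoint moved 5 pixels
gt = np.zeros((17, 2))
pred = gt.copy()
pred[0] += [3.0, 4.0]
v = np.full(17, 2)  # every keypoint labeled and visible

sims = keypoint_similarities(gt, pred, area=100.0 * 100.0)
print(sims.shape, oks_from_similarities(sims, v))
```

`sims` holds one value per keypoint, while the averaged result is the single per-object number that OKS refers to.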

Waligoo