I'm trying to read the source code of TensorFlow's non-maximum suppression method in this line. It is imported from the gen_image_ops file, but I can't find that file anywhere in the TensorFlow source code.

Is there anywhere I can find this method's source code?

mkocabas
  • I think the original code is here. https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/non_max_suppression_op.cc – Jamie.C Mar 27 '19 at 04:49
  • The nms algorithm used in object detection API is located in the core/post_processing.py file. https://github.com/tensorflow/models/blob/master/research/object_detection/core/post_processing.py – Long Hoang Nguyen Feb 23 '18 at 13:20

2 Answers

Here you go

Whenever you see an "import gen_*" line in TensorFlow's Python op definitions, it imports an automatically generated Python module containing bindings to the C++ implementation of the op. If you build from source, the module is generated during the build. If you install a pip package or some other prebuilt version, the generation has already been done and you are just referencing the compiled libraries.
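To see where the generated module ended up in an installed package, something like the following may help. It is only a sketch: `locate_module` is a hypothetical helper name, and it assumes your TensorFlow version still ships the module under the dotted path mentioned in the question. It returns `None` when the package is not installed.

```python
import importlib.util


def locate_module(dotted_name):
    """Return the file path of an importable module, or None if it cannot be found.

    Hypothetical helper for illustration; not part of TensorFlow.
    """
    try:
        # find_spec imports the parent packages, so a missing parent
        # raises ModuleNotFoundError (a subclass of ImportError).
        spec = importlib.util.find_spec(dotted_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


# Assumed module path, taken from the question; prints None if TensorFlow is absent.
print(locate_module("tensorflow.python.ops.gen_image_ops"))
```

If a path is printed, opening that file shows the generated Python wrappers; the C++ kernel itself is not in that file.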

Alexander Heath

I tried digging around their repo as well, with no luck, so I ended up getting the code through my editor.

I use PyCharm, so I simply typed an import of `tensorflow.python.ops.gen_image_ops` and clicked through on it to open the generated code.

I've included both versions of the function below.

First Version

def _non_max_suppression(boxes, scores, max_output_size, iou_threshold=0.5, name=None):
  r"""Greedily selects a subset of bounding boxes in descending order of score,

  pruning away boxes that have high intersection-over-union (IOU) overlap
  with previously selected boxes.  Bounding boxes are supplied as
  [y1, x1, y2, x2], where (y1, x1) and (y2, x2) are the coordinates of any
  diagonal pair of box corners and the coordinates can be provided as normalized
  (i.e., lying in the interval [0, 1]) or absolute.  Note that this algorithm
  is agnostic to where the origin is in the coordinate system.  Note that this
  algorithm is invariant to orthogonal transformations and translations
  of the coordinate system; thus translating or reflections of the coordinate
  system result in the same boxes being selected by the algorithm.
  The output of this operation is a set of integers indexing into the input
  collection of bounding boxes representing the selected boxes.  The bounding
  box coordinates corresponding to the selected indices can then be obtained
  using the `tf.gather operation`.  For example:
    selected_indices = tf.image.non_max_suppression(
        boxes, scores, max_output_size, iou_threshold)
    selected_boxes = tf.gather(boxes, selected_indices)

  Args:
    boxes: A `Tensor` of type `float32`.
      A 2-D float tensor of shape `[num_boxes, 4]`.
    scores: A `Tensor` of type `float32`.
      A 1-D float tensor of shape `[num_boxes]` representing a single
      score corresponding to each box (each row of boxes).
    max_output_size: A `Tensor` of type `int32`.
      A scalar integer tensor representing the maximum number of
      boxes to be selected by non max suppression.
    iou_threshold: An optional `float`. Defaults to `0.5`.
      A float representing the threshold for deciding whether boxes
      overlap too much with respect to IOU.
    name: A name for the operation (optional).

  Returns:
    A `Tensor` of type `int32`.
    A 1-D integer tensor of shape `[M]` representing the selected
    indices from the boxes tensor, where `M <= max_output_size`.
  """
  if iou_threshold is None:
    iou_threshold = 0.5
  iou_threshold = _execute.make_float(iou_threshold, "iou_threshold")
  _ctx = _context.context()
  if _ctx.in_graph_mode():
    _, _, _op = _op_def_lib._apply_op_helper(
        "NonMaxSuppression", boxes=boxes, scores=scores,
        max_output_size=max_output_size, iou_threshold=iou_threshold,
        name=name)
    _result = _op.outputs[:]
    _inputs_flat = _op.inputs
    _attrs = ("iou_threshold", _op.get_attr("iou_threshold"))
  else:
    boxes = _ops.convert_to_tensor(boxes, _dtypes.float32)
    scores = _ops.convert_to_tensor(scores, _dtypes.float32)
    max_output_size = _ops.convert_to_tensor(max_output_size, _dtypes.int32)
    _inputs_flat = [boxes, scores, max_output_size]
    _attrs = ("iou_threshold", iou_threshold)
    _result = _execute.execute(b"NonMaxSuppression", 1, inputs=_inputs_flat,
                               attrs=_attrs, ctx=_ctx, name=name)
  _execute.record_gradient(
      "NonMaxSuppression", _inputs_flat, _attrs, _result, name)
  _result, = _result
  return _result

Second Version

def _non_max_suppression_v2(boxes, scores, max_output_size, iou_threshold, name=None):
  r"""Greedily selects a subset of bounding boxes in descending order of score,

  pruning away boxes that have high intersection-over-union (IOU) overlap
  with previously selected boxes.  Bounding boxes are supplied as
  [y1, x1, y2, x2], where (y1, x1) and (y2, x2) are the coordinates of any
  diagonal pair of box corners and the coordinates can be provided as normalized
  (i.e., lying in the interval [0, 1]) or absolute.  Note that this algorithm
  is agnostic to where the origin is in the coordinate system.  Note that this
  algorithm is invariant to orthogonal transformations and translations
  of the coordinate system; thus translating or reflections of the coordinate
  system result in the same boxes being selected by the algorithm.

  The output of this operation is a set of integers indexing into the input
  collection of bounding boxes representing the selected boxes.  The bounding
  box coordinates corresponding to the selected indices can then be obtained
  using the `tf.gather operation`.  For example:

    selected_indices = tf.image.non_max_suppression_v2(
        boxes, scores, max_output_size, iou_threshold)
    selected_boxes = tf.gather(boxes, selected_indices)

  Args:
    boxes: A `Tensor` of type `float32`.
      A 2-D float tensor of shape `[num_boxes, 4]`.
    scores: A `Tensor` of type `float32`.
      A 1-D float tensor of shape `[num_boxes]` representing a single
      score corresponding to each box (each row of boxes).
    max_output_size: A `Tensor` of type `int32`.
      A scalar integer tensor representing the maximum number of
      boxes to be selected by non max suppression.
    iou_threshold: A `Tensor` of type `float32`.
      A 0-D float tensor representing the threshold for deciding whether
      boxes overlap too much with respect to IOU.
    name: A name for the operation (optional).

  Returns:
    A `Tensor` of type `int32`.
    A 1-D integer tensor of shape `[M]` representing the selected
    indices from the boxes tensor, where `M <= max_output_size`.
  """
  _ctx = _context.context()
  if _ctx.in_graph_mode():
    _, _, _op = _op_def_lib._apply_op_helper(
        "NonMaxSuppressionV2", boxes=boxes, scores=scores,
        max_output_size=max_output_size, iou_threshold=iou_threshold,
        name=name)
    _result = _op.outputs[:]
    _inputs_flat = _op.inputs
    _attrs = None
  else:
    boxes = _ops.convert_to_tensor(boxes, _dtypes.float32)
    scores = _ops.convert_to_tensor(scores, _dtypes.float32)
    max_output_size = _ops.convert_to_tensor(max_output_size, _dtypes.int32)
    iou_threshold = _ops.convert_to_tensor(iou_threshold, _dtypes.float32)
    _inputs_flat = [boxes, scores, max_output_size, iou_threshold]
    _attrs = None
    _result = _execute.execute(b"NonMaxSuppressionV2", 1, inputs=_inputs_flat,
                               attrs=_attrs, ctx=_ctx, name=name)
  _execute.record_gradient(
      "NonMaxSuppressionV2", _inputs_flat, _attrs, _result, name)
  _result, = _result
  return _result
eshirima
  • Thanks, that is really helpful. It seems that they call either `execute` or `_apply_op_helper` to run nms. Any idea on how to reach the core algorithm? – mkocabas Feb 21 '18 at 16:26
  • Just do the same thing I did. Open it from your IDE – eshirima Feb 22 '18 at 18:23
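On the question of the core algorithm: the docstring quoted in the answer describes a greedy procedure (take boxes in descending score order, drop any box whose IOU with an already selected box is too high). Below is a minimal pure-Python sketch of that description. It is not TensorFlow's C++ kernel; the names `iou` and `greedy_nms` are made up for illustration, and treating "overlaps too much" as IOU strictly greater than the threshold is an assumption.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as [y1, x1, y2, x2].

    Either diagonal pair of corners is accepted, as in the docstring.
    """
    ay1, ay2 = min(a[0], a[2]), max(a[0], a[2])
    ax1, ax2 = min(a[1], a[3]), max(a[1], a[3])
    by1, by2 = min(b[0], b[2]), max(b[0], b[2])
    bx1, bx2 = min(b[1], b[3]), max(b[1], b[3])

    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter = inter_h * inter_w

    area_a = (ay2 - ay1) * (ax2 - ax1)
    area_b = (by2 - by1) * (bx2 - bx1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_output_size, iou_threshold=0.5):
    """Return indices of selected boxes, highest score first (illustrative only)."""
    # Visit boxes in descending order of score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    selected = []
    for i in order:
        if len(selected) >= max_output_size:
            break
        # Assumption: a box is suppressed when its IOU with any kept box
        # exceeds the threshold.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in selected):
            selected.append(i)
    return selected


boxes = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.1, 1.0, 1.1], [0.0, 2.0, 1.0, 3.0]]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores, max_output_size=3))
```

In this example the second box overlaps the first heavily and is dropped, while the third box does not overlap either and is kept.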