EDIT: This appears to be a bug in coremltools. See here: https://github.com/apple/coremltools/issues/691#event-3295066804
One-sentence summary: the non-maximum suppression layer produces four outputs, but I can only use two of them as inputs to subsequent layers.
I'm building a neural network model using Apple's Core ML Tools package, and I've added a non-maximum suppression (NMS) layer. This layer has four outputs, per the documentation:
- output 1: box coordinates, corresponding to the surviving boxes.
- output 2: box scores, corresponding to the surviving boxes.
- output 3: indices of the surviving boxes. [this is the output I want to use in a subsequent layer]
- output 4: number of boxes selected after the NMS algorithm
I've set up a test model with some simple inputs, and the NMS layer correctly returns the expected values for each of the four outputs listed above. If I then use either output#1 or output#2 in a subsequent layer, I have no problems doing so. However, if I try to use output#3 or output#4 in a subsequent layer, I get an error saying that the input was "not found in any of the outputs of the preceding layers."
Jupyter Notebook
The easiest way to replicate the issue is to download and go through this Jupyter Notebook.
Alternatively, the sample code below also walks through the problem I'm having.
Sample Code
I've set up some sample input data as follows:
import numpy as np
import coremltools
from coremltools.models import datatypes, neural_network
from coremltools.proto import FeatureTypes_pb2 as ft

boxes = np.array(
    [[[0.00842474, 0.83051298, 0.91371644, 0.55096077],
      [0.34679857, 0.31710117, 0.62449838, 0.70386912],
      [0.08059154, 0.74079195, 0.61650205, 0.28471152]]], np.float32)
scores = np.array(
    [[[0.87390688],
      [0.2797731 ],
      [0.72611251]]], np.float32)
input_features = [('boxes', datatypes.Array(*boxes.shape)),
                  ('scores', datatypes.Array(*scores.shape))]
output_features = [('output', None)]
data = {'boxes': boxes,
        'scores': scores}
And I've built the test model as follows:
builder = neural_network.NeuralNetworkBuilder(input_features, output_features, disable_rank5_shape_mapping=True)

# delete the original output, which was just a placeholder while initializing the model
del builder.spec.description.output[0]

# add the NMS layer
builder.add_nms(
    name="nms",
    input_names=["boxes", "scores"],
    output_names=["nms_coords", "nms_scores", "surviving_indices", "box_count"],
    iou_threshold=0.5,
    score_threshold=0.5,
    max_boxes=3,
    per_class_suppression=False)

# make the model output all four of the NMS layer's outputs
for name in ["nms_coords", "nms_scores", "surviving_indices", "box_count"]:
    builder.spec.description.output.add()
    builder.spec.description.output[-1].name = name
    builder.spec.description.output[-1].type.multiArrayType.dataType = ft.ArrayFeatureType.FLOAT32

# # add a linear activation layer (intentionally commented out)
# builder.add_activation(
#     name="identity",
#     non_linearity="LINEAR",
#     input_name="surviving_indices",
#     output_name="activation_output",
#     params=[1.0, 0.0])

# initialize the model
model = coremltools.models.MLModel(builder.spec)
At this point, when I call
output = model.predict(data, useCPUOnly=True)
print(output)
everything works as expected.
For example, if I call print(output['surviving_indices']), the result is [[ 0. 2. -1.]] (as expected).
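For readers who want to see why [[ 0. 2. -1.]] is the expected result, here is a minimal pure-Python sketch of greedy NMS applied to the sample data. It is not the Core ML implementation. It assumes the boxes are in (center_x, center_y, width, height) format, that unused slots are padded with -1, and the helper names (to_corners, iou, greedy_nms) are my own, made up for illustration.

```python
def to_corners(box):
    # Assumption: box is (center_x, center_y, width, height).
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou(box_a, box_b):
    # Intersection over union of two center-format boxes.
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold, score_threshold, max_boxes):
    # Drop low-scoring boxes, then visit the rest from highest to lowest score.
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    kept = []
    for i in candidates:
        if len(kept) == max_boxes:
            break
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    # Pad with -1 so the result always has max_boxes entries.
    return kept + [-1] * (max_boxes - len(kept))


boxes = [
    [0.00842474, 0.83051298, 0.91371644, 0.55096077],
    [0.34679857, 0.31710117, 0.62449838, 0.70386912],
    [0.08059154, 0.74079195, 0.61650205, 0.28471152],
]
scores = [0.87390688, 0.2797731, 0.72611251]

surviving = greedy_nms(boxes, scores, iou_threshold=0.5,
                       score_threshold=0.5, max_boxes=3)
print(surviving)  # [0, 2, -1]
```

Box 1 is removed by the score threshold, and under the assumed box format boxes 0 and 2 overlap with an IoU of roughly 0.35, which is below the 0.5 threshold, so both survive.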
However, I'm unable to use either the surviving_indices or the box_count outputs as an input to a subsequent layer. For example, if I add a linear activation layer (by uncommenting the block above), I get the following error:
Input 'surviving_indices' of layer 'identity' not found in any of the outputs of the preceding layers.
I have the same problem if I use the box_count output (i.e., by setting input_name="box_count" in the add_activation call).
I know that the NMS layer is correctly returning outputs named surviving_indices and box_count, because when I call print(builder.spec.neuralNetwork.layers[0]) I get the following result:
name: "nms"
input: "boxes"
input: "scores"
output: "nms_coords"
output: "nms_scores"
output: "surviving_indices"
output: "box_count"
NonMaximumSuppression {
  iouThreshold: 0.5
  scoreThreshold: 0.5
  maxBoxes: 3
}
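To make the expectation concrete: going only by the names in the spec, the input of the "identity" layer should resolve. The sketch below is a plain-Python stand-in for that kind of name check. It is not coremltools' actual validator, and find_unresolved_inputs and the dict-based layer description are hypothetical, made up for this illustration.

```python
def find_unresolved_inputs(model_inputs, layers):
    """Return (layer_name, input_name) pairs whose input is not a model
    input or an output of an earlier layer."""
    available = set(model_inputs)
    unresolved = []
    for layer in layers:
        for name in layer["inputs"]:
            if name not in available:
                unresolved.append((layer["name"], name))
        # Outputs of this layer become available to later layers.
        available.update(layer["outputs"])
    return unresolved


# Layer names and blob names copied from the printed spec and the sample code.
layers = [
    {"name": "nms",
     "inputs": ["boxes", "scores"],
     "outputs": ["nms_coords", "nms_scores", "surviving_indices", "box_count"]},
    {"name": "identity",
     "inputs": ["surviving_indices"],
     "outputs": ["activation_output"]},
]

unresolved = find_unresolved_inputs(["boxes", "scores"], layers)
print(unresolved)  # []
```

A check based on names alone finds nothing missing, so whatever rejects surviving_indices and box_count must be looking at something other than the output names listed in the spec.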
Furthermore, if I use either the nms_coords or the nms_scores outputs as an input to the linear activation layer, I don't have any problems. It correctly outputs the unmodified nms_coords or nms_scores values from the NMS layer's output.
I'm baffled as to why the NMS layer correctly outputs what I want it to, but then I'm unable to use those outputs in subsequent layers.