Structuring keypoints as an input for a neural network

Question

Background

I have a neural network that outputs key points for pose (feet, ankles, knees, arms, head, etc.) and the connections between them, so basically I've got a skeleton. I'd like to use these key points/skeleton as an input to another neural network, a relation network (https://arxiv.org/pdf/1706.01427.pdf). The goal is to learn relationships between pose and different objects.

Question

Since I'm working with key points, I'm not sure what the best way to structure them as an input is. I've considered converting the key points to an image where the value at every X/Y location is 0, unless the location is covered by the skeleton, in which case the value is set to 1. But that seems inefficient. Is there a way to retain the structural benefits of using images (for which I can use convolutional nets), without the hit on performance?

If your "objects" are images, I believe the best is to make those skeletons images too.... But if you've got keypoints to objects, maybe you could try to work only with keypoints.... — Daniel Möller, Sep 21 '17 at 16:58
The objects are keypoints as well. What does working only with keypoints mean though? Does it mean (1) Only the X,Y coordinates of the keypoints, (2) The X,Y coordinates of the keypoints and every position on the line connecting them, or something else? Also, would you use images with every other coordinate set to 0 or literally just use the keypoints? — megashigger, Sep 21 '17 at 17:17
I don't have a ready solution.... but I would try to experiment on that... maybe you should define lines (pairs of points, if you don't have surfaces). But indeed, you would need to make something about unused points. — Daniel Möller, Sep 21 '17 at 17:23

score 0 · Answer 1 · answered Sep 21 '17 at 17:53

You should go with your proposal to store them in an HxW tensor (or let's call it an image), as you will have access to many more tools when working with "images".
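As a rough illustration of that HxW representation, here is a minimal sketch that rasterizes a skeleton into a binary grid, drawing each connection with Bresenham's line algorithm. The keypoint names, coordinates, and bone list are made up for the example, and keypoints are assumed to be integer pixel coordinates that already lie inside the grid. Plain Python lists are used to keep it dependency-free; in practice you would probably fill a NumPy array or a framework tensor instead.

```python
def draw_line(grid, p0, p1):
    """Set every cell on the segment p0 -> p1 to 1 (Bresenham's algorithm)."""
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        grid[y0][x0] = 1  # row index is y, column index is x
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += sx
        if e2 <= dx:
            err += dx
            y0 += sy


def rasterize_skeleton(keypoints, bones, height, width):
    """Return a height x width grid: 1 where the skeleton passes, 0 elsewhere."""
    grid = [[0] * width for _ in range(height)]
    for start, end in bones:
        draw_line(grid, keypoints[start], keypoints[end])
    return grid


# Hypothetical toy skeleton: names, (x, y) coordinates and bones are invented.
keypoints = {"head": (4, 0), "hip": (4, 4), "left_foot": (2, 7), "right_foot": (6, 7)}
bones = [("head", "hip"), ("hip", "left_foot"), ("hip", "right_foot")]

mask = rasterize_skeleton(keypoints, bones, height=8, width=8)
for row in mask:
    print("".join(str(cell) for cell in row))
```

The resulting grid can be fed to a convolutional network like any single-channel image. If the pose and the objects need to stay distinguishable, one option is to give each its own channel.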

Depending on your performance needs and the number of key points, you could also consider sparse tensors, which store only the values that are not equal to 0. However, you should check whether the ops you need are fully supported by the dedicated sparse tensor ops.
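To make the sparse idea concrete, here is a small sketch of a coordinate-style (COO) encoding in plain Python: only the non-zero cells are kept, as a list of indices, a list of values, and the dense shape. The helper names `to_sparse` and `to_dense` are hypothetical and the mask is a toy example. A framework's sparse tensor type is built from similar fields (TensorFlow's `tf.sparse.SparseTensor`, for instance, takes `indices`, `values`, and `dense_shape`), but which framework you use is an assumption here, not something stated above.

```python
def to_sparse(mask):
    """Return (indices, values, dense_shape) holding only the non-zero cells."""
    indices, values = [], []
    for row, line in enumerate(mask):
        for col, value in enumerate(line):
            if value != 0:
                indices.append((row, col))
                values.append(value)
    return indices, values, (len(mask), len(mask[0]))


def to_dense(indices, values, dense_shape):
    """Rebuild the full grid from the sparse triple."""
    height, width = dense_shape
    mask = [[0] * width for _ in range(height)]
    for (row, col), value in zip(indices, values):
        mask[row][col] = value
    return mask


# Toy 4x5 skeleton mask, invented for illustration.
mask = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
]

indices, values, dense_shape = to_sparse(mask)
print(indices)      # (row, col) of each non-zero cell
print(dense_shape)  # shape needed to rebuild the dense grid
```

Storage then scales with the number of skeleton pixels rather than with H x W. The saving only matters if the downstream layers can consume the sparse form directly, which is the support check mentioned above.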
