I'm interested in robot manipulation. While reading the paper "PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes", I found the following sentence in the introduction, where it explains the three related tasks that PoseCNN performs. This is the third task:

The 3D Rotation R is estimated by regressing convolutional features extracted inside the bounding box of the object to a quaternion representation of R.

What does it mean to regress convolutional features to a quaternion representation of a rotation, and how is that regression done? Could a rotation matrix be used instead of a quaternion, that is, could the convolutional features be regressed to a rotation matrix? If so, what would be the difference between the two?

desertnaut
ML Dev

1 Answer

"Regressing convolutional features" means that the features extracted by the network are used to predict continuous numbers.

In your case, the network predicts the four numbers of a quaternion, which represents the same rotation that a rotation matrix would.
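As a rough illustration of what "regressing to a quaternion" produces, the sketch below takes four raw numbers, normalizes them to unit length, and converts the result to a rotation matrix. The `(w, x, y, z)` ordering, the function names, and the sample values are assumptions made for this example and are not taken from the paper.

```python
import math

def normalize_quaternion(raw):
    """Scale a raw 4-vector (w, x, y, z) to unit length."""
    norm = math.sqrt(sum(c * c for c in raw))
    return tuple(c / norm for c in raw)

def quaternion_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

# Hypothetical raw output of a regression head (four unconstrained numbers).
raw_output = (2.0, 0.0, 0.0, 2.0)
q = normalize_quaternion(raw_output)
R = quaternion_to_matrix(q)  # a 90-degree rotation about the z axis
```

Only unit quaternions correspond to rotations, which is why the raw output is normalized before the conversion.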

I think they regress a quaternion rather than a rotation matrix because a quaternion is more compact (4 numbers instead of 9), more numerically stable, and more efficient. For more on the differences, see "Quaternions and spatial rotation".

I also think you could try regressing the rotation matrix directly. If you look at the loss they use for the quaternion regression, you will see that they convert the quaternions to their rotation matrix representation, so the loss itself is computed on the rotation matrix and not directly on the quaternion.
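A minimal sketch of a loss of this kind is shown below: both quaternions are converted to rotation matrices, a set of 3D model points is rotated by each, and the squared distances between corresponding points are averaged. The `1 / (2m)` scaling, the `(w, x, y, z)` ordering, and all names here are assumptions for illustration; check the paper for the exact definition.

```python
def quaternion_to_matrix(q):
    """Convert a unit quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def rotate(R, p):
    """Apply a 3x3 rotation matrix to a 3D point."""
    return [sum(R[i][j] * p[j] for j in range(3)) for i in range(3)]

def point_rotation_loss(q_pred, q_true, points):
    """Average squared distance between points rotated by each quaternion."""
    R_pred = quaternion_to_matrix(q_pred)
    R_true = quaternion_to_matrix(q_true)
    total = 0.0
    for p in points:
        a = rotate(R_pred, p)
        b = rotate(R_true, p)
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / (2 * len(points))

# Hypothetical example: identity versus a 180-degree turn about z.
identity = (1.0, 0.0, 0.0, 0.0)
half_turn_z = (0.0, 0.0, 0.0, 1.0)
loss = point_rotation_loss(identity, half_turn_z, [[1.0, 0.0, 0.0]])
```

Because the comparison happens after the conversion, `q` and `-q`, which encode the same rotation, give the same loss, which a plain difference of quaternion components would not.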

Community
Amitay Nachmani