I'm interested in robot manipulation. While reading the paper "PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes", I came across the following sentence in the introduction, where the authors describe the three related tasks that PoseCNN performs. It describes the third task:
The 3D Rotation R is estimated by regressing convolutional features extracted inside the bounding box of the object to a quaternion representation of R.
What does it mean to regress convolutional features to a quaternion representation of a rotation, and how is that regression done in practice? Could a rotation matrix be used instead of a quaternion, that is, could we regress the convolutional features to a rotation matrix? If so, what would be the difference between the two approaches?
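To make the question concrete, here is a minimal NumPy sketch of how I imagine the two options, which is my own assumption and not taken from the paper. The raw 4-vector and 9-vector stand in for whatever the last fully connected layer would output, and the function names `quat_to_rotmat` and `project_to_rotation` are hypothetical. The point is that a raw 4-vector only needs to be normalized to become a valid rotation, whereas 9 raw numbers have to be projected back onto a proper rotation matrix (here with an SVD).

```python
import numpy as np

def quat_to_rotmat(q):
    """Normalize a raw 4-vector (w, x, y, z) and convert it to a 3x3 rotation matrix."""
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q)  # unit norm is the only constraint a quaternion needs
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def project_to_rotation(m):
    """Project 9 raw numbers onto the nearest rotation matrix (orthogonal, det = +1) via SVD."""
    m = np.asarray(m, dtype=float).reshape(3, 3)
    u, _, vt = np.linalg.svd(m)
    # Flip the last singular direction if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt

# Hypothetical raw network outputs (made-up numbers, for illustration only).
raw_quat = [0.9, 0.1, -0.2, 0.4]
raw_mat = [1.0, 0.2, -0.1, 0.3, 0.9, 0.0, 0.1, -0.2, 1.1]

R_from_quat = quat_to_rotmat(raw_quat)
R_from_mat = project_to_rotation(raw_mat)
print(R_from_quat)
print(R_from_mat)
```

If this picture is right, the quaternion output has 4 numbers and one constraint (unit norm), while the matrix output has 9 numbers and must satisfy orthogonality and determinant +1. Is that the main difference, or is there more to it?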