
I'm looking to constrain one layer of my neural network to find the best rotation of its input in order to satisfy an objective. (My end goal, where R is the rotation layer, is of the form R.transpose() @ f(R @ z).)

I am looking to train this (plus other components) via gradient descent. If z is just two-dimensional, then I can write

R = [ cos(theta)   -sin(theta)
      sin(theta)    cos(theta)]

and have theta be a learnable parameter. However, I am lost on how to actually set this up for a d-dimensional space (where d > 10). The resources I've found on constructing a d-dimensional rotation matrix get deep into linear algebra that is over my head. It feels like this should be easier than it seems, so I suspect I'm overlooking something (for example, maybe R should just be an ordinary linear layer without any non-linear activation).
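
For concreteness, here is a minimal PyTorch sketch of the 2-D setup I have in mind (the Rotation2D class name is just for illustration):

    import torch
    import torch.nn as nn

    class Rotation2D(nn.Module):
        """2-D rotation layer driven by a single learnable angle theta."""
        def __init__(self):
            super().__init__()
            self.theta = nn.Parameter(torch.zeros(1))  # learnable angle

        def R(self):
            c, s = torch.cos(self.theta), torch.sin(self.theta)
            # [[cos, -sin], [sin, cos]], built so gradients flow back to theta
            return torch.stack([torch.cat([c, -s]), torch.cat([s, c])])

        def forward(self, z):
            return z @ self.R().T  # row-vector form of R @ z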

Anyone have any ideas? I appreciate you in advance :)

Sean K
  • A cross-site post: https://stats.stackexchange.com/q/546220/144441 – OmG Sep 27 '21 at 23:44
  • Sorry, yeah, I wasn't sure whether this would fit Stack Overflow or Cross Validated, so I posted in both places. I believe the Cross Validated answer is good, so should I copy it here (with credit), or should I just delete this post? – Sean K Sep 28 '21 at 15:32

1 Answer

QR decomposition can help with this (since Q is orthogonal): let W be an unconstrained learnable matrix (without a bias term), compute the decomposition W = QR, and then actually use Q as your orthogonal matrix. If you use PyTorch's QR, backprop will be able to flow back through the decomposition and update W.
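
A minimal sketch of that idea, assuming PyTorch's torch.linalg.qr (the RotationLayer name, the near-identity initialization, and the small f network are illustrative choices, not part of the answer):

    import torch
    import torch.nn as nn

    class RotationLayer(nn.Module):
        """d-dimensional orthogonal layer: W is unconstrained, Q comes from QR."""
        def __init__(self, d):
            super().__init__()
            # Unconstrained learnable matrix, no bias term; the near-identity
            # init (an illustrative choice) starts Q close to the identity.
            self.W = nn.Parameter(torch.eye(d) + 0.01 * torch.randn(d, d))

        def orthogonal(self):
            # torch.linalg.qr is differentiable, so backprop reaches W through Q
            Q, _ = torch.linalg.qr(self.W)
            return Q

    # Usage in the question's R.transpose() @ f(R @ z) form, with z as batched rows:
    d = 16
    rot = RotationLayer(d)
    f = nn.Sequential(nn.Linear(d, d), nn.Tanh(), nn.Linear(d, d))  # stand-in for f
    z = torch.randn(8, d)

    Q = rot.orthogonal()
    out = f(z @ Q.T) @ Q   # row-vector version of R.transpose() @ f(R @ z)
    out.sum().backward()   # gradients flow through the QR back to rot.W

One caveat worth knowing: the Q from a QR decomposition is orthogonal but its determinant may be -1, i.e. a reflection rather than a pure rotation, so check whether that matters for your objective.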

Sean K