
I've read a lot online before asking this question, and I came to understand that the view matrix is just the inverse of the camera's transformation matrix. For the sake of clarity: if we treat the camera as an actual entity that is transformed like any other 3D object in the scene (i.e. with a transformation matrix that first translates, then rotates and then scales the object), we obtain the camera transformation matrix, which contains the position of the camera. If we invert this matrix we should obtain the view matrix, but that is not what happens in my code.

I have two static methods: one that creates the transformation matrix given a position, a rotation around each of the 3 axes and a single scale value applied to all axes (first translate, then rotate, then scale), and another that creates the view matrix given the Camera, which has a yaw (rotation around the y axis), a pitch (rotation around the x axis) and a Vec3 that represents the position (here we first rotate the camera and then translate it by its negative position, because moving the camera is the same as moving the world around it). Here is the code for the transformation matrix:

public static Matrix4f createTransformationMatrix(Vector3f translation, float rx, float ry,
        float rz, float scale) {
    
    Matrix4f matrix = new Matrix4f();
    matrix.setIdentity();
    
    Matrix4f.translate(translation, matrix, matrix);
    
    Matrix4f.rotate((float)Math.toRadians(rx), new Vector3f(1, 0, 0), matrix, matrix);
    Matrix4f.rotate((float)Math.toRadians(ry), new Vector3f(0, 1, 0), matrix, matrix);
    Matrix4f.rotate((float)Math.toRadians(rz), new Vector3f(0, 0, 1), matrix, matrix);
    
    Matrix4f.scale(new Vector3f(scale, scale, scale), matrix, matrix);
    
    return matrix;
}

Here is the code for the view matrix:

public static Matrix4f createViewMatrix(Camera camera) {
    Matrix4f viewMatrix = new Matrix4f();
    viewMatrix.setIdentity();
    
    Matrix4f.rotate((float) Math.toRadians(camera.getPitch()), new Vector3f(1, 0, 0), viewMatrix, viewMatrix);
    Matrix4f.rotate((float) Math.toRadians(camera.getYaw()), new Vector3f(0, 1, 0), viewMatrix, viewMatrix);
    
    Vector3f cameraPos = camera.getPosition();

    Vector3f negativeCameraPos = new Vector3f(-cameraPos.x, -cameraPos.y, -cameraPos.z);
    Matrix4f.translate(negativeCameraPos, viewMatrix, viewMatrix);
    
    return viewMatrix;
}

Here comes the problem: since I followed a YouTube tutorial to build these two matrices and did not write this code myself, I don't understand how the viewMatrix is an inversion of the camera transformation matrix. I've noticed that in createViewMatrix() we first rotate and then translate (by the negative position), while in createTransformationMatrix() we first translate, then rotate, then scale. So if I understand things correctly, I should be able to create a transformation matrix with the camera data and then invert it to obtain the view matrix, but it doesn't work. I also tried, in createViewMatrix(), to first translate by the positive position (without computing negativeCameraPos), then rotate, and then invert the matrix. Same result: it doesn't work, and weird things happen when I run the program (I don't know how to explain them, but they're just wrong). I tried a lot of other things, but it only works with the code I provided.

Can you explain to me how first rotating and then translating by the negative camera position produces the inverted camera transformation matrix? I'm sorry for the prolixity, but I want you to understand my problem on the first shot so that you can answer it. Thank you.

1 Answer


Your basic understanding of the camera and view matrix is correct. The camera matrix is normally used to describe the position and orientation of the viewer/camera in the world, while the view matrix is used to transform from world space to view space, so it should be the inverse of the camera matrix.

Note that in matrix math the order in which transformations are applied matters: rotating and then translating is different from translating and then rotating (we'll leave scaling out of the equation here since you normally don't scale a camera; zooming would be done via the projection matrix).
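
For example, here is a quick sketch using the question's org.lwjgl.util.vector classes (the angle and offset are arbitrary):

// Same rotation and translation, applied in the two possible orders.
// (With this Matrix4f each call post-multiplies the current matrix, so the
// call that comes last in the code is applied to a point first.)
Matrix4f translateThenRotate = new Matrix4f();
translateThenRotate.setIdentity();
Matrix4f.translate(new Vector3f(0, 0, 10), translateThenRotate, translateThenRotate);
Matrix4f.rotate((float) Math.toRadians(90), new Vector3f(0, 1, 0), translateThenRotate, translateThenRotate);

Matrix4f rotateThenTranslate = new Matrix4f();
rotateThenTranslate.setIdentity();
Matrix4f.rotate((float) Math.toRadians(90), new Vector3f(0, 1, 0), rotateThenTranslate, rotateThenTranslate);
Matrix4f.translate(new Vector3f(0, 0, 10), rotateThenTranslate, rotateThenTranslate);

// The same point ends up in two different places.
Vector4f origin = new Vector4f(0, 0, 0, 1);
System.out.println(Matrix4f.transform(translateThenRotate, origin, null)); // roughly (0, 0, 10)
System.out.println(Matrix4f.transform(rotateThenTranslate, origin, null)); // roughly (10, 0, 0)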

When building your camera matrix you'd first rotate to set the camera's orientation and then translate to set its position, i.e. you treat the camera as sitting at 0/0/0, looking along the z axis (so the view vector would be 0/0/1). After rotating you get a different normalized view vector, but the camera still "sits" at 0/0/0. Then you translate to the actual camera position (you might need additional matrix operations to calculate that position, but I'd do that in a separate step for starters, until you get things right).
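
In code that could look something like this (just a sketch with made-up numbers, again using the question's Matrix4f; note that each translate()/rotate() call post-multiplies, so the call order is the reverse of the order the operations are applied to a point):

// Sketch of a camera local-to-world matrix: oriented 45 degrees around the y axis
// and sitting at (4, 3, 2) in world space, i.e. cameraMatrix = T(position) * R(yaw).
// Because each call post-multiplies, translate() is called first even though the
// rotation is the operation applied first to a camera-local point.
Matrix4f cameraMatrix = new Matrix4f();
cameraMatrix.setIdentity();
Matrix4f.translate(new Vector3f(4, 3, 2), cameraMatrix, cameraMatrix);
Matrix4f.rotate((float) Math.toRadians(45), new Vector3f(0, 1, 0), cameraMatrix, cameraMatrix);

// The camera's own origin ends up at its world position...
Vector4f localOrigin = new Vector4f(0, 0, 0, 1);
System.out.println(Matrix4f.transform(cameraMatrix, localOrigin, null)); // (4, 3, 2, 1)

// ...while its view vector (w = 0, so the translation doesn't affect it) is only rotated.
Vector4f localViewDir = new Vector4f(0, 0, 1, 0);
System.out.println(Matrix4f.transform(cameraMatrix, localViewDir, null)); // roughly (0.707, 0, 0.707, 0)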

Can you explain to me how first rotating and then translating by the negative camera position produces the inverted camera transformation matrix?

It shouldn't, as the resulting view matrix would apply a different direction. The "negative" rotation (i.e. the angle +/- 180 degrees) should work, though. In that case you rotate a vector to point at the camera (so if the camera turns 45 degrees around the y axis, any object "pointing at the camera" would need to rotate by 225 or -135 degrees around the same axis).

Negative translation is fine: if you move the camera to 4/3/2 in world space, a translation by -4/-3/-2 moves any world-space coordinate into view space.
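
As a quick check (a sketch only, camera at 4/3/2 as above and no rotation):

// A camera sitting at (4, 3, 2) with no rotation: translating the world by
// (-4, -3, -2) puts the camera itself at the view-space origin.
Matrix4f view = new Matrix4f();
view.setIdentity();
Matrix4f.translate(new Vector3f(-4, -3, -2), view, view);

Vector4f cameraInWorld = new Vector4f(4, 3, 2, 1);
System.out.println(Matrix4f.transform(view, cameraInWorld, null)); // (0, 0, 0, 1)

// Any other world-space point keeps its position relative to the camera.
Vector4f pointInWorld = new Vector4f(4, 3, -3, 1);
System.out.println(Matrix4f.transform(view, pointInWorld, null)); // (0, 0, -5, 1)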

Thomas
  • Thank you, but I still need a further clarification: when people say "the view matrix is the (camera transformation matrix)^-1", are they correct? If so, can I create this so-called camera transformation matrix with the createTransformationMatrix() method in my code using the camera data, and then invert it to obtain the viewMatrix? – Davide Pasero Mar 23 '21 at 09:55
  • @DavidePasero basically yes. Imagine using a real camera object in your world: you'd position and rotate it like any other object. So inverting any matrix that transforms that object from local space to world space should result in a matrix that transforms from world space to local camera space. (Note that if your camera is attached to another object you might have to apply/combine multiple matrices to get from camera to world space, but once you have a matrix that represents all of those in one, a single inversion should be enough.) – Thomas Mar 23 '21 at 10:08
  • Yes, I do understand that, but if I try to create the camera matrix via createTransformationMatrix(camera_data) and then invert it, I get a different result than when I use createViewMatrix(). How is that possible? – Davide Pasero Mar 23 '21 at 10:36
  • @DavidePasero well, for one you are using more angles and adding scaling, but assuming the values result in "identity" operations (`rz = 0.0` and `scale = 1.0`) the matrices should be almost the same (values may differ slightly due to precision issues). The bigger difference is that `createTransformationMatrix()` translates _first_ and then rotates while `createViewMatrix()` rotates first - as I explained, the order is relevant (translating first would mean any subsequent rotation also rotates the position). – Thomas Mar 23 '21 at 10:40
  • @DavidePasero you might want to take another look into matrix multiplication rules. Note that `rotate()`, `translate()` and `scale()` basically result in matrix multiplications. The implementation might take some shortcuts but the results should be the same as creating a translation, rotation or scaling matrix and multiplying it into the current one. – Thomas Mar 23 '21 at 10:52
  • Oh okay, now I think I get it, so the difference between the camera transformation matrix and the view matrix is not only that the view matrix is the inverted camera matrix, but also that in the view matrix we first rotate and then translate, while in the camera matrix we first translate and then rotate, right? (I tried to print the values of both matrices, the inverted camera matrix and the view matrix, at each frame, but except for a few cases they have very different numbers.) – Davide Pasero Mar 23 '21 at 11:01
  • Edit: after taking a closer look I've noticed that they are pretty much the same values but in different places (except for the translation values which are still pretty different) – Davide Pasero Mar 23 '21 at 11:05
  • @DavidePasero inversion of a matrix is not exactly the same as applying the same operations in reverse order and/or with negative values. Since matrices can become very complex I'd suggest you settle for creating one matrix in a way you can handle (e.g. camera to world) and then apply a proper matrix inversion to get the other direction. – Thomas Mar 23 '21 at 11:08
  • Okay, so you are basically saying that my createViewMatrix() method is using a sort of shortcut to build the viewMatrix? And if I build the camera-to-world matrix one coefficient at a time (m00, m01... m33) and then invert it, do I get the viewMatrix? – Davide Pasero Mar 23 '21 at 11:27
  • @DavidePasero no, I'm saying that your 2 methods are doing something similar but not quite the inverse operation. You also wouldn't build the matrix one coefficient at a time but, to make it easier, one operation at a time (translation, rotation, scaling). Then you invert the matrix to get the view matrix. I'll repeat again: the order of operations is important and reversing the order _does not_ result in an inverted matrix. – Thomas Mar 23 '21 at 12:21
  • But still, you told me before that when people say "the view matrix is the (camera transformation matrix)^-1" they're correct, but when I try to create the view matrix by using the createTransformationMatrix() method and then inverting it, I get a different result than just using the createViewMatrix() method. You told me that this was due to the fact that in createViewMatrix() I first rotate and then translate, while in createTransformationMatrix() I do the opposite. I'll ask again, I'm sorry but I need to understand: is the viewMatrix just the inverse of the camera transformation matrix, or something else? – Davide Pasero Mar 23 '21 at 13:27
  • @DavidePasero you might try to re-read what I've posted so far. The transformation matrix transforms from "object space" to "world space", and in the case of your camera "object space" == "camera space". However, you need the inverse operation, i.e. to transform everything from "world space" to "camera space", so you need the inverted transformation matrix. Your `createViewMatrix()` is doing something else altogether and very likely not what you want. – Thomas Mar 23 '21 at 14:32
  • The problem is that my `createViewMatrix()` does exactly what I want, in fact, when I run my real time rendering engine using `createViewMatrix()` everything is totally correct: the camera behaves as it should, trees are trees and houses are houses. I understand the concepts of local-to-world matrix and world-to-local matrix and I understand that matrix multiplication order matters. But the code I provided works perfectly, that's the problem ;) . – Davide Pasero Mar 23 '21 at 14:41
  • Hmm, houses should still be houses in any case :) - The matrices by themselves aren't wrong at all and if your rotation is applied differently to the camera (e.g. by using a different rotation direction) it might still work as expected. There are more pieces to the equation (e.g. how both matrices are applied to vectors) so it's hard to tell. – Thomas Mar 23 '21 at 15:15
  • Yes you're right, the models remain the same because they're not affected; what I meant was that the camera behaves as it should: if I move my mouse to the right the yaw increases and the camera appears to turn right, and the same happens if I move it up, down or left. If I press W my camera moves straight ahead as it should, and so on (I use the WASD system). In the vertex shader the position of the vertex on the screen is computed in this way: position = projectionMatrix * viewMatrix * transformationMatrix * vec4(position.x, position.y, position.z, 1.0); where the vec4(...) is the vertex in object space. – Davide Pasero Mar 23 '21 at 15:25
  • And, of course, when I tried to apply the inverted camera transformation matrix I applied it the same way as I did with the viewMatrix (I literally assigned to my variable called "viewMatrix" the inverted camera transformation matrix and changed nothing else). – Davide Pasero Mar 23 '21 at 15:29
  • Well, here's a difference: your transformation matrices rotate your objects around their local axes but translate them to their world position. Your camera, however, would need to move along its rotated vectors, so I assume your camera position is actually changed in camera space. – Thomas Mar 23 '21 at 15:36
  • Yes, that's true, but haven't we already established that "the order of multiplication matters"? I know that the main difference between `createViewMatrix()` and `createTransformationMatrix()` is that in the former we first rotate and then translate (and I know it should be read from right to left, so it's actually the opposite, but I'm following the code) and in the latter we first translate and then rotate, but my question remains unanswered: how does my inverted camera transformation matrix relate to my viewMatrix? Sorry if I'm being tedious, and thanks a lot for the help. – Davide Pasero Mar 23 '21 at 16:04
  • @DavidePasero I made some tests because I became unsure, and I think I've now realized what was bugging me: to get `createViewMatrix()` to create the inverse of `createTransformationMatrix()` you need to use all values in the negative: negative translation _and_ negative angles (so `pitch` would be `-rx` etc.). Maybe that's true already, but the posted code doesn't show it. Even then you'd need to first do the yaw and then the pitch rotation (a sketch of that check follows this thread). Doing it the other way round probably still works, and visual checks might not make you realize the positions are somewhat off. – Thomas Mar 24 '21 at 06:00
  • I just tested your way and it works perfectly. I created a new function based on `createTransformationMatrix()` which does first the yaw and then the pitch, and when I called it I gave it **only** the negative angle values, and it works just fine (I didn't change the translation values, neither when I called the function nor when I applied the translation). In conclusion, can we say that the camera transformation matrix is something a bit different from the transformation matrix of any other 3D object (hence the viewMatrix is the inverse of this "different" transformation matrix)? – Davide Pasero Mar 24 '21 at 08:00
  • @DavidePasero well, the camera matrix should be the same as for any other object in your world. However, it probably feels different because any rotation and movement applied to it are done from the perspective of the camera and not from that of an outside viewer. That being said, if the camera matrix somehow transforms objects to actually be "behind" the camera even if they should be in front of it the projection matrix might still "correct" that (or just ignore it in case of a parallel projection). – Thomas Mar 24 '21 at 08:18
  • Ok perfect, you solved my big problem; I now understand the viewMatrix a bit more, and that was actually my purpose. Thank you – Davide Pasero Mar 24 '21 at 10:07
  • (Even if I still don't understand how my `createViewMatrix()` works and why, in order to build the viewMatrix from `createTransformationMatrix()`, I should first do the yaw and then the pitch. I suppose it's due to the fact that I am in the perspective of the camera, so things work differently, but I still don't understand **HOW** differently they work.) – Davide Pasero Mar 24 '21 at 10:21
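
Here is a small sketch of the check Thomas describes above (arbitrary example values; it assumes the same org.lwjgl.util.vector classes as in the question, and that Matrix4f.invert(src, null) allocates and returns the inverse like the other static helpers do):

// Camera data: position (4, 3, 2), pitch 20 degrees, yaw 45 degrees.
// Local-to-world camera matrix, built like createTransformationMatrix() but with
// negated angles and yaw before pitch: camera = T(pos) * Ry(-yaw) * Rx(-pitch).
Matrix4f camera = new Matrix4f();
camera.setIdentity();
Matrix4f.translate(new Vector3f(4, 3, 2), camera, camera);
Matrix4f.rotate((float) Math.toRadians(-45), new Vector3f(0, 1, 0), camera, camera); // -yaw
Matrix4f.rotate((float) Math.toRadians(-20), new Vector3f(1, 0, 0), camera, camera); // -pitch

// Inverting it should reproduce what createViewMatrix() builds:
// pitch rotation, then yaw rotation, then translation by the negative position.
Matrix4f inverted = Matrix4f.invert(camera, null);

Matrix4f view = new Matrix4f();
view.setIdentity();
Matrix4f.rotate((float) Math.toRadians(20), new Vector3f(1, 0, 0), view, view); // pitch
Matrix4f.rotate((float) Math.toRadians(45), new Vector3f(0, 1, 0), view, view); // yaw
Matrix4f.translate(new Vector3f(-4, -3, -2), view, view);                       // -position

// The two matrices should match up to floating-point precision.
System.out.println(inverted);
System.out.println(view);

The reason this works is the general rule (A * B)^-1 = B^-1 * A^-1: inverting a product reverses the order of the factors and inverts each one, so the inverse of T(pos) * Ry(-yaw) * Rx(-pitch) is Rx(pitch) * Ry(yaw) * T(-pos), which is exactly the sequence of calls in createViewMatrix().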