
I am trying to implement object picking in OpenGL using C# and OpenTK. I have written a class for this purpose based on two sources:

OpenGL ray casting (picking): account for object's transform

https://www.bfilipek.com/2012/06/select-mouse-opengl.html

Currently my code is only for calculating the distance of the mouse pointer from an arbitrary test coordinate of (0,0,0), but once working it would not take much to iterate through objects in a scene to find a match.

The method is to define a ray underneath the mouse pointer between the near and far clipping planes. Then find the point on that ray which is closest to the point being tested and return the distance between the two. This should be zero when the mouse pointer is directly over (0,0,0) and increase as it moves away in any direction.
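In symbols, using the same names as the `ClosestPoint` method further down (A and B are the near-plane and far-plane ends of the ray, P is the test point), the calculation described above can be written as follows. The last line is an optional variation that is not part of the code in this post: it clamps t so that Q stays on the segment between the two planes.

```latex
\[
t = \frac{(P - A)\cdot(B - A)}{(B - A)\cdot(B - A)}, \qquad
Q = A + t\,(B - A), \qquad
d = \lVert P - Q \rVert
\]
% Optional (assumption, not in the original code): restrict Q to the segment AB.
\[
t_{\text{clamped}} = \min\bigl(1, \max(0, t)\bigr)
\]
```

Without the clamp, d is the distance from P to the infinite line through A and B, which is what the posted code computes.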

Can anyone help troubleshoot this? It executes without errors but the distance being returned clearly isn't correct. I understand the principles but not the finer points of the calculations.

Although I have found various examples online which almost do it, they are generally in a different language or framework and/or use deprecated methods and/or are incomplete or not working.

public class ObjectPicker{

    public static float DistanceFromPoint(Point mouseLocation, Vector3 testPoint, Matrix4 modelView, Matrix4 projection)
    {
        Vector3 near = UnProject(new Vector3(mouseLocation.X, mouseLocation.Y, 0), modelView, projection); // start of ray
        Vector3 far = UnProject(new Vector3(mouseLocation.X, mouseLocation.Y, 1), modelView, projection); // end of ray
        Vector3 pt = ClosestPoint(near, far, testPoint); // find point on ray which is closest to test point
        return Vector3.Distance(pt, testPoint); // return the distance
    }
    private static Vector3 ClosestPoint(Vector3 A, Vector3 B, Vector3 P) 
    {
        Vector3 AB = B - A;
        float ab_square = Vector3.Dot(AB, AB);
        Vector3 AP = P - A;
        float ap_dot_ab = Vector3.Dot(AP, AB);
        // t is a projection param when we project vector AP onto AB 
        float t = ap_dot_ab / ab_square;
        // calculate the closest point 
        Vector3 Q = A + Vector3.Multiply(AB, t); 
        return Q; 
    }   
    private static Vector3 UnProject(Vector3 screen, Matrix4 modelView, Matrix4 projection)
    {
        int[] viewport = new int[4];
        OpenTK.Graphics.OpenGL.GL.GetInteger(OpenTK.Graphics.OpenGL.GetPName.Viewport, viewport);

        Vector4 pos = new Vector4();

        // Map x and y from window coordinates, map to range -1 to 1 
        pos.X = (screen.X - (float)viewport[0]) / (float)viewport[2] * 2.0f - 1.0f;
        pos.Y = 1 - (screen.Y - (float)viewport[1]) / (float)viewport[3] * 2.0f;
        pos.Z = screen.Z * 2.0f - 1.0f;
        pos.W = 1.0f;

        Vector4 pos2 = Vector4.Transform( pos, Matrix4.Invert(modelView) * projection );
        Vector3 pos_out = new Vector3(pos2.X, pos2.Y, pos2.Z);

        return pos_out / pos2.W;
    }
}

It is called like this:

    private void GlControl1_MouseMove(object sender, MouseEventArgs e)
    {

        float dist = ObjectPicker.DistanceFromPoint(new Point(e.X,e.Y), new Vector3(0,0,0), model, projection);
        this.Text = dist.ToString(); // display in window caption for debugging

    }

The matrices are passed in as shown in the code above. I'm fairly sure their contents are correct, since the scene renders properly and I can rotate and zoom successfully. This is the vertex shader, for what it's worth:

        string vertexShaderSource =
            "# version 330 core\n" +
            "layout(location = 0) in vec3 aPos;" +
            "layout(location = 1) in vec3 aNormal;" +
            "uniform mat4 model;    " +
            "uniform mat4 view;" +
            "uniform mat4 projection;" +
            "out vec3 FragPos;" +
            "out vec3 Normal;" +
            "void main()" +
            "{" +
            "gl_Position = projection * view * model * vec4(aPos, 1.0);" +
            "FragPos = vec3(model * vec4(aPos, 1.0));" +
            "Normal = vec3(model * vec4(aNormal, 1.0));" +
            "}";

I use an implementation of Arcball for rotation. Zooming is done using a translation, like this:

private void glControl1_MouseWheel(object sender, System.Windows.Forms.MouseEventArgs e)
    {
        zoom += (float)e.Delta / 240;
        view = Matrix4.CreateTranslation(0.0f, 0.0f, zoom);
        SetMatrix4(Handle, "view", view);
        glControl1.Invalidate();
    }
Rabbid76
wotnot

2 Answers


Each vertex coordinate is transformed by the model view matrix. This transforms the coordinates from model space to view space. Then each vertex coordinate is transformed by the projection matrix. This transforms from view space to clip space. The perspective divide converts a clip space coordinate to normalized device space.
If you want to convert from normalized device space to model space you have to do the reverse operations. That means you have to transform by the inverse projection matrix and the inverse model view matrix:

Vector4 pos2 = Vector4.Transform(pos, Matrix4.Invert(projection) * Matrix4.Invert(modelView));

or, equivalently,

Vector4 pos2 = Vector4.Transform(pos, Matrix4.Invert(modelView * projection));

Note that OpenTK matrices are multiplied from left to right: the leftmost matrix is the one applied first. See the answer to "OpenGL 4.2 LookAt matrix only works with -z value for eye position".
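A minimal sketch of what the multiplication order means in practice. It uses `System.Numerics` as a stand-in for OpenTK's `Matrix4` and `Vector4` so that it runs without a GL context; this is an assumption for illustration, chosen because both libraries use the row-vector convention. The class and method names are hypothetical.

```csharp
using System;
using System.Numerics;

public static class MatrixOrderDemo
{
    // Transforms a point as a row vector (p * m) and applies the divide by w.
    public static Vector3 Apply(Vector3 p, Matrix4x4 m)
    {
        Vector4 r = Vector4.Transform(new Vector4(p, 1.0f), m);
        return new Vector3(r.X, r.Y, r.Z) / r.W;
    }

    // Wraps Matrix4x4.Invert, which reports failure through its return value.
    public static Matrix4x4 InverseOf(Matrix4x4 m)
    {
        if (!Matrix4x4.Invert(m, out Matrix4x4 inverse))
            throw new InvalidOperationException("Matrix is not invertible.");
        return inverse;
    }
}
```

With `s = Matrix4x4.CreateScale(2)` and `t = Matrix4x4.CreateTranslation(1, 0, 0)`, `Apply(new Vector3(1, 0, 0), s * t)` scales first and then translates, giving (3, 0, 0), whereas `t * s` gives (4, 0, 0). Undoing `s * t` therefore needs `InverseOf(t) * InverseOf(s)`, which matches `InverseOf(s * t)`; this is the same reversal as in the two lines above.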

Rabbid76
  • Thanks. Code edited accordingly to incorporate your line above. – wotnot May 30 '20 at 20:54
  • OK - I'll change it back. I thought implementing corrections to code was the done thing, but I may be wrong. (My thinking was that it shows other users what is correct, which seems more useful than what is incorrect.) – wotnot May 30 '20 at 21:01
  • Correct, it still doesn't work. I am now getting very small numbers all the time (e.g. of the order of 5E-09, but not constant) – wotnot May 30 '20 at 21:16
  • Yes, that's right. The calculated distance is tiny. – wotnot May 31 '20 at 07:43
  • No, the origin is variable. I can zoom and rotate, and I have got a cube centred on (0,0,0). So I know that when the mouse is over the centre of that cube I should get a distance of zero, increasing as the mouse moves away from the cube. I am currently trying to debug by experimenting with an alternative Unproject function (https://gamedev.stackexchange.com/questions/51820/how-can-i-convert-screen-coordinatess-to-world-coordinates-in-opentk/52975) and looking at unprojected coordinates for the ends of the ray. – wotnot May 31 '20 at 07:56
  • I don't think it is the (only) cause of the problem I am currently having, but my understanding of NDC is that z=-1 at far plane and z=+1 at near plane. The unproject function I am using here is called first with z=0 for near, and then z=+1 for far. Is that wrong, as it currently appears to me? – wotnot May 31 '20 at 08:14
  • @wotnot No, -1 is the near plane and +1 is the far plane. The projection matrix mirrors the z axis, turning a right-handed system into a left-handed one. Normalized device space is a cube with its left, bottom, near corner at (-1, -1, -1) and its right, top, far corner at (1, 1, 1). The depth range is [0, 1], where 0 is near and 1 is far. I don't think the algorithm is the issue; the bug is somewhere else. Do `viewport`, `modelView` and `projection` contain the correct values? – Rabbid76 May 31 '20 at 08:17
  • Thanks for the explanation RE: NDC values. The way I get `viewport` is shown in the code above. I have just copied it from the original source. I presume it is correct. I will try examining the numbers in the Output window. I'm fairly sure the contents of the matrices are correct since the scene renders correctly. I have appended the vertex shader to my question. – wotnot May 31 '20 at 08:42
  • `viewport` contains [0, 0, 598, 583] – wotnot May 31 '20 at 08:52
  • Not sure what this means at the moment, but I make some observations from diagnostics. Initial viewing position is vertically downwards from (0,0,-3). I leave that unchanged and apply no rotations. The result of UnProject for z=0 (near plane) is small numbers of the order of (0.01, -0.03, 0.02) typically. The result of UnProject for z=1 (far plane) is understandable in world coordinates e.g. (-40, 40, -100) if I move the mouse pointer to the top left. They are not affected by zooming in and out, whereas I would expect them to be. Zooming is done by translation (code appended to question). – wotnot May 31 '20 at 09:27
  • @wotnot The result seems to be correct. Note, the viewing volume is a [frustum](https://en.wikipedia.org/wiki/Viewing_frustum). The geometry is projected on the viewport. The unit of the coordinates is related to model coordinates rather than pixel. The distances on the near plane are short and the distances on the far plane are long. Now you know, that `UnProject` is working correctly. – Rabbid76 May 31 '20 at 09:33
  • But zooming affects the position of everything in the scene (except one point in the middle) in screen coordinates. As I zoom, surely the unprojected mouse coords should change(?) – wotnot May 31 '20 at 09:46
  • This gets me thinking and I have made a small change to the code. Now I get something which could be the right result. To clarify, is the modelview matrix the product of the model and view matrices?? – wotnot May 31 '20 at 10:01
  • @wotnot Yes of course `modelView = model * view` for OpenTK matrices and `modelView = view * model` for glsl `mat4` – Rabbid76 May 31 '20 at 10:03
  • That seems to be the problem. When calling ObjectPicker.DistanceFromPoint() I was passing in 'model' where it should have been the two combined. I will do some more testing and then accept your answer if it is resolved. I'm very grateful for your help here. Not sure of the protocol - can I then post the corrected code? As per previous comment I would like to leave this for others. – wotnot May 31 '20 at 10:12
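The coordinate conventions discussed in the comments above (window depth 0 at the near plane and 1 at the far plane, NDC running from -1 to +1 on every axis, mouse Y growing downward) can be checked in isolation with a small sketch. The class and method names are hypothetical, and `System.Numerics.Vector3` is used in place of OpenTK's `Vector3` as an assumption so the snippet runs without a GL context; the arithmetic is the same as in the `UnProject` method of the question.

```csharp
using System.Numerics;

public static class WindowToNdc
{
    // viewport = { x, y, width, height }, in the layout returned by the GL viewport query.
    // depth is the window-space depth in [0, 1]: 0 = near plane, 1 = far plane.
    public static Vector3 Convert(float mouseX, float mouseY, float depth, int[] viewport)
    {
        float x = (mouseX - viewport[0]) / viewport[2] * 2.0f - 1.0f;
        // Mouse Y grows downward while NDC Y grows upward, hence the flip.
        float y = 1.0f - (mouseY - viewport[1]) / viewport[3] * 2.0f;
        // Remap depth from [0, 1] to the NDC range [-1, 1].
        float z = depth * 2.0f - 1.0f;
        return new Vector3(x, y, z);
    }
}
```

For the viewport reported above, [0, 0, 598, 583], the centre of the control at depth 0 maps to (0, 0, -1), and the top-left corner at depth 1 maps to (-1, 1, 1).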

Answering my own question here so that I can post the working code for the benefit of other users, but at least half the answer was provided by Rabbid76, whose help I am very grateful for.

There were two errors in my original code:

Vector4 pos2 = Vector4.Transform( pos, Matrix4.Invert(modelView) * projection );

where the two matrices were multiplied in the wrong order, and the projection matrix was not inverted.

float dist = ObjectPicker.DistanceFromPoint(new Point(e.X,e.Y), new Vector3(0,0,0), model, projection);

where I passed in the model matrix not the modelview matrix (which is the product of the model and view matrices).

This works:

private void GlControl1_MouseMove(object sender, MouseEventArgs e)
{

    float dist = ObjectPicker.DistanceFromPoint(new Point(e.X,e.Y), new Vector3(0,0,0), model * view, projection);
    // do something with the result

}

public class ObjectPicker{

    public static float DistanceFromPoint(Point mouseLocation, Vector3 testPoint, Matrix4 modelView, Matrix4 projection)
    {
        // The viewport is queried inside UnProject, so it is not needed here.
        Vector3 near = UnProject(new Vector3(mouseLocation.X, mouseLocation.Y, 0), modelView, projection); // start of ray (near plane)
        Vector3 far = UnProject(new Vector3(mouseLocation.X, mouseLocation.Y, 1), modelView, projection); // end of ray (far plane)
        Vector3 pt = ClosestPoint(near, far, testPoint); // find point on ray which is closest to test point

        return Vector3.Distance(pt, testPoint); // return the distance
    }
    private static Vector3 ClosestPoint(Vector3 A, Vector3 B, Vector3 P) 
    {
        Vector3 AB = B - A;
        float ab_square = Vector3.Dot(AB, AB);
        Vector3 AP = P - A;
        float ap_dot_ab = Vector3.Dot(AP, AB);
        // t is a projection param when we project vector AP onto AB 
        float t = ap_dot_ab / ab_square;
        // calculate the closest point 
        Vector3 Q = A + Vector3.Multiply(AB, t); 
        return Q; 
    }   
    private static Vector3 UnProject(Vector3 screen, Matrix4 modelView, Matrix4 projection)
    {
        int[] viewport = new int[4];
        OpenTK.Graphics.OpenGL.GL.GetInteger(OpenTK.Graphics.OpenGL.GetPName.Viewport, viewport);

        Vector4 pos = new Vector4();

        // Map x and y from window coordinates, map to range -1 to 1 
        pos.X = (screen.X - (float)viewport[0]) / (float)viewport[2] * 2.0f - 1.0f;
        pos.Y = 1 - (screen.Y - (float)viewport[1]) / (float)viewport[3] * 2.0f;
        pos.Z = screen.Z * 2.0f - 1.0f;
        pos.W = 1.0f;

        Vector4 pos2 = Vector4.Transform( pos, Matrix4.Invert(projection) * Matrix4.Invert(modelView) );
        Vector3 pos_out = new Vector3(pos2.X, pos2.Y, pos2.Z);

        return pos_out / pos2.W;
    }

}
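One way to sanity-check an unprojection routine like the one above is a round trip: project a known point to normalized device coordinates, unproject it, and confirm the original point comes back. The sketch below does this without a GL context. It rests on several assumptions: `System.Numerics` stands in for OpenTK's types (both use row vectors), the hand-built `Perspective` helper is a hypothetical OpenGL-style matrix with NDC depth from -1 to +1, and all names are illustrative rather than part of the code above.

```csharp
using System;
using System.Numerics;

public static class UnProjectCheck
{
    // OpenGL-style symmetric perspective matrix, laid out for row vectors (v * M).
    public static Matrix4x4 Perspective(float fovY, float aspect, float near, float far)
    {
        float y = 1.0f / (float)Math.Tan(fovY * 0.5f);
        float x = y / aspect;
        float c = -(far + near) / (far - near);
        float d = -(2.0f * far * near) / (far - near);
        return new Matrix4x4(
            x, 0, 0, 0,
            0, y, 0, 0,
            0, 0, c, -1,
            0, 0, d, 0);
    }

    // Model space -> NDC: modelView first, then projection, then the perspective divide.
    public static Vector3 Project(Vector3 p, Matrix4x4 modelView, Matrix4x4 projection)
    {
        Vector4 clip = Vector4.Transform(new Vector4(p, 1.0f), modelView * projection);
        return new Vector3(clip.X, clip.Y, clip.Z) / clip.W;
    }

    // NDC -> model space: the same chain, inverted.
    public static Vector3 UnProject(Vector3 ndc, Matrix4x4 modelView, Matrix4x4 projection)
    {
        if (!Matrix4x4.Invert(modelView * projection, out Matrix4x4 inverse))
            throw new InvalidOperationException("modelView * projection is not invertible.");
        Vector4 p = Vector4.Transform(new Vector4(ndc, 1.0f), inverse);
        return new Vector3(p.X, p.Y, p.Z) / p.W;
    }
}
```

With an identity model matrix and a view matrix of `Matrix4x4.CreateTranslation(0, 0, -3)`, projecting a point with `model * view` and unprojecting with the same product should return the point to within floating-point error. Unprojecting with `model` alone instead returns the point in view space, offset by the view translation, which is consistent with the symptom in the comments where zooming had no effect on the unprojected coordinates.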

Since posting this question I have learned that the method is generally called ray casting, and have found a couple of excellent explanations of it:

Mouse Picking with Ray Casting by Anton Gerdelan

OpenGL 3D Game Tutorial 29: Mouse Picking by ThinMatrix

wotnot