
I am writing a program that takes images of dice as input and has to report the value of the throw, in other words, the sum of the points on the top sides of the dice. For example, this is one of my example images:

enter image description here

For this, I would like to receive the number 4 as the result.

Here is what I have accomplished so far:

  1. Segmenting the individual dice. This step is working quite well.
  2. Identifying each point on the individual dice. This is working okay, but could be better.
  3. Grouping the points on a die based on which side they are on. I'm stuck here. Once the points are grouped, it would be easy to find the top side and thus the value of the throw.

For example, I tried to group the points with the K-means algorithm, but it very often groups together points that are not on the same side. Here is what the picture looks like after finding the points (marked in blue) and running K-means (cluster centers marked in green):

enter image description here
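
For reference, a minimal sketch of this kind of K-means grouping on the detected point centers. It is not the exact code used in the question: the function name `kmeans`, the farthest-point initialization, and the choice of `k` (for example, the number of visible sides) are assumptions made for illustration.

```python
from math import dist


def kmeans(points, k, iterations=50):
    """Cluster 2D points (x, y) into k groups with plain K-means.

    Initialization is deterministic (farthest-point seeding), which is an
    assumption of this sketch, not something the question specifies.
    Returns (labels, centers), where labels[i] is the cluster index of points[i].
    """
    # Seed with the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))

    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda i: dist(p, centers[i])) for p in points]
        if new_labels == labels:
            break  # converged, assignments no longer change
        labels = new_labels

        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centers[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centers
```

Because the assignment step depends only on distance in the image plane, points on two adjacent sides that lie close to the shared edge can end up in the same cluster, which matches the failure described above.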

Besides K-means, I also tried hierarchical clustering, with no success. I have also checked this question, which is about basically the same problem. However, the answers only apply to the image the OP posted and do not generalize at all.

I'm asking how I could group the points marked with blue dots based on which side of the die they are on. I'd love to group them something like this:

enter image description here

This is not just for this one image: I have a great many similar images of dice, some of them from different angles, so methods that rely on point circularity or size won't really work. Thank you for any help.

Gtomika
  • What you're asking is hugely dependent on camera angle and lighting. It takes relatively complicated heuristics to determine which face is "up". Here, we might use the brightness of the faces to decide that, but only because of the lighting in this shot. If you had images taken from directly above, it would be much easier. – Tim Roberts Nov 07 '21 at 18:57
  • Unfortunately the images are mostly from angles like this, and not directly from above. – Gtomika Nov 07 '21 at 19:01
  • You could try to fit lines to pairs of points and count inliers and outliers. Keep the line with the most inliers, group them, and repeat until all point lines are found. – Micka Nov 07 '21 at 19:02
  • Then I think your first step is to use brightness cues to determine what region likely represents the "up" face. Then you can search for pips in that region. – Tim Roberts Nov 07 '21 at 19:02
  • Since all sides are planes, homographies seem reasonable (finding up to 3 homographies), but there aren't enough circles, so you would have to use the contours of the points as well. – Micka Nov 07 '21 at 19:04
  • Train a neural network to predict the angles of the die from the picture. It'll learn the arrangement of the six sides and the number and positions of the pips on each side. The predicted angles directly determine the view. -- I concur, anything low level just won't solve the problem. Certainly no "pixel counting" or clustering or homographies or anything like that. – Christoph Rackwitz Nov 07 '21 at 20:20
  • Once you know which points belong together, the problem is quite easily solved with solvePnP (or homographies and their vanishing points again) if you know the general point geometry of the dice, i.e. which planes are neighbors (e.g. that 6 is on the opposite side of 1, and that 2, 3 and 6 share a corner). However, DNNs will give a good and robust, but expensive (development and runtime), solution to the problem. – Micka Nov 07 '21 at 20:32
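
The line-fitting idea from the comments (fit a line through a pair of points, count inliers, keep the best line, remove its points, repeat) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names `point_line_distance` and `group_points_by_lines`, the pixel tolerance `tol`, and the `min_inliers` threshold are hypothetical, and the sketch tries every pair exhaustively instead of sampling pairs at random. It groups points into lines, not directly into sides, so a further step would still be needed to merge lines that belong to the same side.

```python
from itertools import combinations
from math import hypot


def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    return abs(dx * (py - ay) - dy * (px - ax)) / hypot(dx, dy)


def group_points_by_lines(points, tol=2.0, min_inliers=3):
    """Greedily peel off the line supported by the most points.

    tol is the maximum distance (in pixels) for a point to count as an inlier;
    min_inliers is the smallest group accepted as a line. Both are assumed
    values that would need tuning on real images.
    Returns (groups, leftovers).
    """
    remaining = list(points)
    groups = []
    while len(remaining) >= 2:
        best = []
        # Try the line through every pair of remaining points.
        for a, b in combinations(remaining, 2):
            if a == b:
                continue  # duplicate detections do not define a line
            inliers = [p for p in remaining if point_line_distance(p, a, b) <= tol]
            if len(inliers) > len(best):
                best = inliers
        if len(best) < min_inliers:
            break  # no sufficiently supported line is left
        groups.append(best)
        remaining = [p for p in remaining if p not in best]
    return groups, remaining
```

With `min_inliers=3`, sides showing only one or two points (values 1 and 2) never form a group and end up in the leftovers, so they would have to be handled separately.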

0 Answers