I have been experimenting with object detection recently, using Faster R-CNN and YOLOv7 to train models on pre-existing datasets.
Using a UNO card dataset, I was able to detect the type of each UNO card quite accurately, based on the symbol in the top-left corner. I used an object detection approach, with the cards categorized into only 14 classes.
Based on that, I am wondering what the best approach would be to extend the model to other, more comprehensive card games. Take Munchkin, for example, which has thousands of different cards. For a game like this, object detection alone might not be the best approach, since there would be thousands of classes to consider.
The two different approaches I am considering:
1. Using object detection, create as many classes as there are different playing cards in the game, training the model to detect every single card individually
or
2. Using object detection, train the model only to detect playing cards in general, then use each detected playing card as input for an image classification algorithm
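To make the second idea concrete, here is a minimal sketch of how I picture the two stages fitting together. It is only an illustration under assumptions: `identify_cards`, `crop`, `fake_detector` and `fake_classifier` are hypothetical names, the image is a plain nested list of grayscale pixels, and the two stubs stand in for a trained detector (for example Faster R-CNN or YOLOv7) and a trained classifier.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
Image = List[List[int]]           # rows of grayscale pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of the image."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def identify_cards(
    image: Image,
    detect: Callable[[Image], List[Box]],
    classify: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds card boxes, stage 2 labels each cropped card."""
    return [(box, classify(crop(image, box))) for box in detect(image)]


# Hypothetical stand-ins for the trained models (assumptions, not real APIs).
def fake_detector(image: Image) -> List[Box]:
    # Pretends that two cards were found side by side.
    return [(0, 0, 2, 2), (2, 0, 4, 2)]


def fake_classifier(card: Image) -> str:
    # Labels a crop by its mean brightness, purely for demonstration.
    mean = sum(sum(row) for row in card) / sum(len(row) for row in card)
    return "bright card" if mean > 127 else "dark card"


image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
results = identify_cards(image, fake_detector, fake_classifier)
for box, label in results:
    print(box, label)
```

The point of the sketch is that the detector only ever needs to know what a card looks like, while everything specific to a game lives in the classifier, which can be swapped out independently.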
For me there are pros and cons for both methods:
The first approach might be much more accurate, as it detects each card individually. On the other hand, it seems to me that it needs considerably more classes, and more training data for each of those classes. It also might be difficult to extend the model with new cards, as you would have to retrain it every time.
The second approach might not be as accurate, as the detector might not only detect playing cards but also identify other objects as playing cards. On the flip side, it seems to me that it would be much easier to extend with more unique cards.
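One mitigation I can think of for the false detections in the second approach is to drop low-confidence boxes before they reach the classifier, since detectors such as Faster R-CNN and YOLOv7 report a confidence score per detection. The sketch below is only an assumption about how that filter could look: `Detection`, `keep_confident` and the 0.8 threshold are hypothetical and would need tuning on real data.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)
    score: float                    # detector confidence in [0, 1]


def keep_confident(detections: List[Detection], min_score: float = 0.8) -> List[Detection]:
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in detections if d.score >= min_score]


# Made-up detections for illustration only.
detections = [
    Detection((10, 10, 110, 160), 0.95),   # very likely a card
    Detection((200, 40, 260, 90), 0.40),   # likely some other object
    Detection((300, 10, 400, 160), 0.80),  # exactly at the threshold, kept
]
confident = keep_confident(detections)
print(len(confident))
```

A stricter threshold reduces false cards at the cost of missing some real ones, so this shifts the problem rather than removing it.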
What might be the best approach here? Is there a different approach that might be more efficient?