I see that COCO 2017 has 80 classes, with 118k training and 5k validation images (about 122k images in total). I have a question: is the number of images per class roughly 122k / 80, i.e. about 1,525 images per class?
1 Answer
No. The COCO dataset is not evenly distributed, i.e., the classes do not all have the same number of images. A single image can also contain objects from several classes, so the per-class image counts do not add up to the total number of images, and dividing 122k by 80 does not give a meaningful figure. Here is a way to find the number of images for any class you wish.
I am using the COCO Python API (pycocotools) to work with the dataset. Let's find the number of images in the 'person' class. Here is a code gist to filter out any class from the COCO dataset:
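from pycocotools.coco import COCO
# Load the annotations (this path is an assumption; point it at your own file)
coco = COCO('annotations/instances_train2017.json')
# Define the class (out of the 80 COCO classes)
filterClasses = ['person']
# Fetch only the category IDs corresponding to filterClasses
catIds = coco.getCatIds(catNms=filterClasses)
# Get all images containing the above category IDs
# (with several IDs, only images containing all of them are returned)
imgIds = coco.getImgIds(catIds=catIds)
print("Number of images containing the class:", len(imgIds))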
That gives the number of images in the loaded annotation file that contain at least one 'person' object.
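To see how uneven the distribution is across all classes at once, the same count can be taken directly from the annotation JSON. The following is only a minimal sketch: the function name `images_per_category` and the toy data are made up for illustration, and it assumes the standard COCO instances layout with `categories` and `annotations` lists.

```python
import json
from collections import defaultdict


def images_per_category(instances):
    """Map each category name to the number of distinct images containing it."""
    id_to_name = {c["id"]: c["name"] for c in instances["categories"]}
    image_sets = defaultdict(set)
    for ann in instances["annotations"]:
        # A set ignores repeated objects of the same class in one image.
        image_sets[ann["category_id"]].add(ann["image_id"])
    return {name: len(image_sets[cid]) for cid, name in id_to_name.items()}


# Toy data standing in for a real annotation file (hypothetical values).
toy = {
    "categories": [
        {"id": 1, "name": "person"},
        {"id": 3, "name": "car"},
        {"id": 18, "name": "dog"},
    ],
    "annotations": [
        {"image_id": 10, "category_id": 1},
        {"image_id": 10, "category_id": 1},   # second person in image 10
        {"image_id": 10, "category_id": 18},  # image 10 also has a dog
        {"image_id": 11, "category_id": 1},
    ],
}

counts = images_per_category(toy)
for name, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {n}")

# For the real dataset (the path is an assumption, adjust it to your setup):
# with open("annotations/instances_train2017.json") as f:
#     counts = images_per_category(json.load(f))
```

In the toy data, image 10 is counted under both 'person' and 'dog', which is why the per-class counts can sum to more than the number of images.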
I have recently written an entire post on exploring and manipulating the COCO dataset. Do have a look to get more details and the entire code.

Viraf