I have COCO-style annotations (JSON format) with both segmentations and bboxes.
Most of the segmentations are given as list-of-lists of the pixels (polygon).
The problem is that some segmentations are given as a dictionary (with 'counts' and 'size' keys) that represents RLE values, and in these cases the 'iscrowd' key is equal to 1 (normally it is equal to 0).
I would like to convert all the 'annotations' with iscrowd==1 so that they are represented as polygons instead of RLE.
I do not need the mask itself, as suggested here; I just need the JSON file to contain only polygon-shaped segmentations.
Here is an example of a few annotations (from the same image). Note how the first two segmentations are polygons, while the last two are RLE:
{'id': 53, 'image_id': 4, 'category_id': 2037037930, 'segmentation': [[344.51, 328.83, 316.02, 399.73, 358.3, 399.78, 375.85, 336.07]], 'area': 2561.4049499999965, 'bbox': [316.02, 328.83, 59.83000000000004, 70.94999999999999], 'iscrowd': 0, 'extra': {}}
{'id': 54, 'image_id': 4, 'category_id': 2037037930, 'segmentation': [[376.43, 233.52, 368.93, 250.71, 375.96, 252.89, 369.4, 269.76, 378.62, 273.83, 372.21, 292.42, 400.09, 302.34, 400.09, 302.11, 400.1, 242.04]], 'area': 1596.5407000000123, 'bbox': [368.93, 233.52, 31.170000000000016, 68.81999999999996], 'iscrowd': 0, 'extra': {}}
{'id': 67, 'image_id': 4, 'category_id': 2037037930, 'segmentation': {'counts': [55026, 2, 396, 4, 394, 7, 391, 9, 389, 12, 386, 14, 384, 17, 381, 19, 379, 21, 377, 24, 374, 26, 372, 29, 369, 31, 367, 33, 365, 36, 362, 38, 360, 41, 357, 43, 355, 46, 352, 48, 350, 50, 348, 53, 345, 55, 343, 58, 340, 38, 1, 21, 338, 37, 5, 21, 335, 37, 7, 21, 335, 34, 10, 19, 338, 32, 12, 16, 340, 33, 11, 14, 342, 33, 11, 11, 346, 33, 11, 8, 348, 33, 10, 7, 350, 33, 8, 8, 351, 34, 5, 11, 351, 33, 3, 13, 351, 49, 351, 49, 352, 49, 351, 49, 351, 49, 352, 48, 352, 49, 351, 49, 352, 46, 354, 44, 356, 41, 359, 39, 362, 36, 364, 35, 365, 35, 366, 35, 365, 35, 365, 35, 366, 34, 366, 34, 366, 35, 366, 34, 366, 34, 366, 32, 368, 29, 372, 25, 375, 23, 377, 20, 381, 18, 382, 19, 381, 19, 382, 18, 382, 18, 382, 19, 382, 18, 382, 18, 382, 19, 381, 19, 382, 16, 384, 13, 387, 9, 392, 5, 395, 2, 73808], 'size': [400, 400]}, 'area': 2598.0, 'bbox': [137, 174, 79, 65], 'iscrowd': 1, 'extra': {}}
{'id': 68, 'image_id': 4, 'category_id': 2037037930, 'segmentation': {'counts': [76703, 2, 396, 4, 394, 7, 391, 9, 389, 11, 387, 14, 384, 16, 382, 19, 379, 21, 377, 23, 375, 26, 372, 28, 370, 30, 368, 33, 365, 35, 364, 37, 363, 37, 364, 36, 364, 37, 364, 36, 364, 36, 364, 37, 364, 36, 364, 37, 363, 37, 364, 36, 364, 37, 364, 36, 364, 36, 364, 37, 364, 15, 1, 20, 364, 13, 4, 19, 365, 10, 6, 20, 363, 9, 8, 20, 361, 9, 11, 20, 358, 9, 13, 20, 356, 11, 14, 19, 354, 14, 13, 20, 351, 16, 13, 20, 348, 20, 13, 19, 346, 22, 13, 20, 343, 24, 13, 20, 341, 27, 13, 20, 338, 29, 13, 20, 336, 32, 13, 19, 334, 34, 13, 20, 331, 37, 12, 20, 331, 37, 13, 19, 332, 36, 12, 21, 331, 37, 8, 24, 332, 36, 5, 28, 331, 37, 1, 31, 331, 69, 332, 69, 331, 69, 332, 68, 332, 69, 331, 69, 332, 68, 332, 69, 332, 68, 332, 69, 331, 69, 332, 68, 332, 48, 1, 20, 331, 45, 5, 19, 332, 41, 8, 19, 332, 38, 12, 19, 332, 36, 13, 19, 332, 37, 12, 20, 331, 37, 13, 19, 332, 36, 13, 19, 332, 37, 13, 19, 332, 36, 13, 19, 332, 37, 12, 19, 332, 37, 13, 19, 332, 36, 13, 19, 332, 37, 13, 19, 332, 36, 12, 20, 332, 36, 10, 22, 332, 37, 6, 26, 332, 36, 4, 28, 332, 37, 1, 28, 335, 63, 337, 61, 339, 59, 342, 56, 344, 53, 348, 50, 350, 48, 352, 46, 355, 43, 357, 40, 360, 38, 363, 35, 365, 33, 368, 30, 370, 28, 372, 25, 376, 22, 378, 20, 381, 17, 383, 15, 385, 12, 389, 9, 391, 7, 394, 4, 396, 2, 40521], 'size': [400, 400]}, 'area': 4551.0, 'bbox': [191, 253, 108, 82], 'iscrowd': 1, 'extra': {}}
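To make the target of the conversion concrete, here is a minimal sketch of how the two kinds of annotations could be told apart by the type of their 'segmentation' field. The helper name `split_by_segmentation_type` is hypothetical, and the sample data is a shortened stand-in for the annotations above (the RLE counts are truncated for brevity, so they are not a valid mask).

```python
def split_by_segmentation_type(annotations):
    """Split COCO annotations into (polygon_annotations, rle_annotations)."""
    polygons, rles = [], []
    for ann in annotations:
        seg = ann["segmentation"]
        # RLE segmentations are dicts with 'counts' and 'size' keys;
        # polygon segmentations are lists of flat [x1, y1, x2, y2, ...] lists.
        if isinstance(seg, dict) and "counts" in seg and "size" in seg:
            rles.append(ann)
        else:
            polygons.append(ann)
    return polygons, rles


# Shortened, illustrative stand-ins for annotations 53 and 67 above.
sample_annotations = [
    {"id": 53, "iscrowd": 0,
     "segmentation": [[344.51, 328.83, 316.02, 399.73,
                       358.3, 399.78, 375.85, 336.07]]},
    {"id": 67, "iscrowd": 1,
     "segmentation": {"counts": [55026, 2, 396, 4], "size": [400, 400]}},
]

polygon_anns, rle_anns = split_by_segmentation_type(sample_annotations)
print([ann["id"] for ann in rle_anns])
```

Only the annotations in the second list would need converting.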
Failed test 1:
I already tried the following:
for annotation in coco_data['annotations']:
    if type(annotation['segmentation']) == dict:
        # Get the values of the dictionary
        height = annotation['segmentation']['size'][0]
        width = annotation['segmentation']['size'][1]
        counts = annotation['segmentation']['counts']
        # Decode the RLE encoded counts
        rle = np.array(counts).reshape(-1, 2)
        starts, lengths = rle[:, 0], rle[:, 1]
        starts -= 1
        ends = starts + lengths
        pixels = []
        for lo, hi in zip(starts, ends):
            pixels.extend(range(lo, hi))
        pixels = np.array(pixels)
        # Convert the 1D pixels array to a 2D array
        segments = np.zeros((height, width), dtype=np.uint8)
        segments[pixels // width, pixels % width] = 1
        segments = np.where(segments == 1)
        # Update the segmentation and iscrowd fields
        annotation['segmentation'] = [segments[1].tolist(), segments[0].tolist()]
        annotation['iscrowd'] = 0
But I got the following error:
ValueError Traceback (most recent call last)
<ipython-input-29-1bf7f4af292c> in <module>
16
17 # Decode the RLE encoded counts
---> 18 rle = np.array(counts).reshape(-1, 2)
19 starts, lengths = rle[:, 0], rle[:, 1]
20 starts -= 1
ValueError: cannot reshape array of size 183 into shape (2)
AFAIK, this code expects the RLE to have an even length? I am not sure where the problem is or how to solve it.
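A note on the format that may explain the odd length: in COCO's uncompressed RLE, 'counts' is not a list of (start, length) pairs. It is a list of alternating run lengths, beginning with a run of background (0) pixels, over the mask flattened in column-major order. An odd number of entries is therefore valid. This matches the data above: the first count of annotation 67 is 55026, and 55026 // 400 = 137, which is the bbox x value. The sketch below is a minimal pure-Python illustration under that reading; `decode_uncompressed_rle` and the tiny 3x4 example are hypothetical and not part of pycocotools.

```python
def decode_uncompressed_rle(counts, size):
    """Decode COCO-style uncompressed RLE into a 2D list mask[row][col]."""
    height, width = size
    if sum(counts) != height * width:
        raise ValueError("run lengths must cover every pixel exactly once")
    flat = []
    value = 0  # runs alternate, starting with background (0)
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    # Pixels are stored column by column: flat index = col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


# Tiny made-up example: height 3, width 4, five runs (an odd number).
example = {"counts": [4, 2, 3, 1, 2], "size": [3, 4]}
mask = decode_uncompressed_rle(example["counts"], example["size"])
for row in mask:
    print(row)
# [0, 0, 0, 1]
# [0, 1, 0, 0]
# [0, 1, 0, 0]
```

The number of foreground pixels equals the sum of the odd-indexed counts (here 2 + 1 = 3), which gives a quick sanity check against the annotation's 'area' field.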
Failed test 2:
Then I tried something a bit different, with import pycocotools.mask as mask and import skimage.measure as measure, and the following function:
def rle_to_polygon(rle, height, width):
    if isinstance(rle, list):
        rle = mask.frPyObjects(rle, height, width)
    rle = mask.decode(rle)
    contours = measure.find_contours(rle, 0.5)
    polygon = []
    for contour in contours:
        contour = np.fliplr(contour) - 1
        contour = contour.clip(min=0)
        contour = contour.astype(int)
        if len(contour) >= 4:
            polygon.append(contour.tolist())
    return polygon
But I receive the following error:
<ipython-input-43-84d17a601509> in rle_to_polygon(rle, height, width)
79 def rle_to_polygon(rle, height, width):
80 if isinstance(rle, list):
---> 81 rle = mask.frPyObjects(rle, height, width)
82 rle = mask.decode(rle)
83 contours = measure.find_contours(rle, 0.5)
pycocotools/_mask.pyx in pycocotools._mask.frPyObjects()
TypeError: object of type 'int' has no len()
Any suggestions would be highly appreciated!