
I'm working with the COCO dataset format and am struggling to restore the original "segmentation" field of an annotation from its RLE form.
I use the pycocotools library to convert "segmentation" into RLE.

For example, the dataset contains this annotation:

annotation = {
'annotation_path': 'val2017/instances_val2017.json',
'segmentation': [
    [510.66, 423.01, 511.72, 420.03, 510.45, 416.0, 510.34,
     413.02, 510.77, 410.26, 510.77, 407.5, 510.34, 405.16,
     511.51, 402.83, 511.41, 400.49, 510.24, 398.16, 509.39,
     397.31, 504.61, 399.22, 502.17, 399.64, 500.89, 401.66,
     500.47, 402.08, 499.09, 401.87, 495.79, 401.98, 490.59,
     401.77, 488.79, 401.77, 485.39, 398.58, 483.9, 397.31,
     481.56, 396.35, 478.48, 395.93, 476.68, 396.03, 475.4,
     396.77, 473.92, 398.79, 473.28, 399.96, 473.49, 401.87,
     474.56, 403.47, 473.07, 405.59, 473.39, 407.71, 476.68,
     409.41, 479.23, 409.73, 481.56, 410.69, 480.4, 411.85,
     481.35, 414.93, 479.86, 418.65, 477.32, 420.03, 476.04,
     422.58, 479.02, 422.58, 480.29, 423.01, 483.79, 419.93,
     486.66, 416.21, 490.06, 415.57, 492.18, 416.85, 491.65,
     420.24, 492.82, 422.9, 493.56, 424.39, 496.43, 424.6,
     498.02, 423.01, 498.13, 421.31, 497.07, 420.03, 497.07,
     415.15, 496.33, 414.51, 501.1, 411.96, 502.06, 411.32,
     503.02, 415.04, 503.33, 418.12, 501.1, 420.24, 498.98,
     421.63, 500.47, 424.39, 505.03, 423.32, 506.2, 421.31,
     507.69, 419.5, 506.31, 423.32, 510.03, 423.01, 510.45,
     423.01]], 
'area': '702.1057499999998',
'iscrowd': 0,
'image_id': 289343,
'bbox': ['473.07', '395.93', '38.65', '28.67'],
'category_id': 18, 
'id': 1768, 
'height': 640,
'width': 529
}

Converting it to Run-Length Encoding (RLE) format works without problems:

from pycocotools import mask

h = annotation['height']
w = annotation['width']
segmentation = annotation['segmentation']

rle = mask.merge(mask.frPyObjects(segmentation, h, w))
# rle == {
#     'size': [640, 529],
#     'counts': b'_PX9330cc07O2N1j\\ODhb0=U]OEkb0;T]OFlb0:T]OFlb0:R]OHnb0d02N1O2N2N2M201O1O000006J2N001OEh\\OMYc03l\\OGUc08:0c\\OFTc09m\\OIPc08o\\OIPc09n\\OHQc0d00O1O0012N0O12\\Oi\\O031Vc0KV]O0[j:'
# }

Question:
How can I convert the RLE back to "segmentation"?
That is, starting from:

{
    'size': [640, 529],
    'counts': b'_PX9330cc07O2N1j\\ODhb0=U]OEkb0;T]OFlb0:T]OFlb0:R]OHnb0d02N1O2N2N2M201O1O000006J2N001OEh\\OMYc03l\\OGUc08:0c\\OFTc09m\\OIPc08o\\OIPc09n\\OHQc0d00O1O0012N0O12\\Oi\\O031Vc0KV]O0[j:'
}

I want to get back:

[
    [510.66, 423.01, 511.72, 420.03, 510.45, 416.0, 510.34,
     413.02, 510.77, 410.26, 510.77, 407.5, 510.34, 405.16,
     511.51, 402.83, 511.41, 400.49, 510.24, 398.16, 509.39,
     397.31, 504.61, 399.22, 502.17, 399.64, 500.89, 401.66,
     500.47, 402.08, 499.09, 401.87, 495.79, 401.98, 490.59,
     401.77, 488.79, 401.77, 485.39, 398.58, 483.9, 397.31,
     481.56, 396.35, 478.48, 395.93, 476.68, 396.03, 475.4,
     396.77, 473.92, 398.79, 473.28, 399.96, 473.49, 401.87,
     474.56, 403.47, 473.07, 405.59, 473.39, 407.71, 476.68,
     409.41, 479.23, 409.73, 481.56, 410.69, 480.4, 411.85,
     481.35, 414.93, 479.86, 418.65, 477.32, 420.03, 476.04,
     422.58, 479.02, 422.58, 480.29, 423.01, 483.79, 419.93,
     486.66, 416.21, 490.06, 415.57, 492.18, 416.85, 491.65,
     420.24, 492.82, 422.9, 493.56, 424.39, 496.43, 424.6,
     498.02, 423.01, 498.13, 421.31, 497.07, 420.03, 497.07,
     415.15, 496.33, 414.51, 501.1, 411.96, 502.06, 411.32,
     503.02, 415.04, 503.33, 418.12, 501.1, 420.24, 498.98,
     421.63, 500.47, 424.39, 505.03, 423.32, 506.2, 421.31,
     507.69, 419.5, 506.31, 423.32, 510.03, 423.01, 510.45,
     423.01]]
Yuriy Leonov

1 Answer

# Import mask
from pycocotools import mask

Decode the RLE value into a NumPy array:

input_value = {
    'size': [640, 529],
    'counts': b'_PX9330cc07O2N1j\\ODhb0=U]OEkb0;T]OFlb0:T]OFlb0:R]OHnb0d02N1O2N2N2M201O1O000006J2N001OEh\\OMYc03l\\OGUc08:0c\\OFTc09m\\OIPc08o\\OIPc09n\\OHQc0d00O1O0012N0O12\\Oi\\O031Vc0KV]O0[j:'
}

decoded_value = mask.decode(input_value)

decoded_value will be a binary mask of shape (640, 529), where 1 marks a pixel inside the object:

array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=uint8)
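
To make it clearer what decoding does, here is a minimal pure-Python sketch of the idea for the uncompressed form of COCO RLE, where 'counts' is a plain list of run lengths rather than the compressed byte string shown above. The runs alternate between 0 and 1, starting with 0, and the pixels are laid out column by column. The helper name decode_uncompressed_rle and the toy input are assumptions made up for illustration; they are not part of pycocotools, and this sketch does not handle the compressed byte-string form.

```python
def decode_uncompressed_rle(rle):
    """Expand an uncompressed COCO-style RLE dict into a 2D list of 0/1 values.

    Hypothetical helper for illustration only; it does not handle the
    compressed byte-string 'counts' that pycocotools produces.
    """
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate between 0 and 1, starting with 0 (background)
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not sum to height * width")
    # Pixels are stored column by column, so pixel (row, col)
    # lives at flat index col * height + row.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]


# Toy 3x4 mask (made-up data): 4 zeros, then 2 ones, then 6 zeros.
toy_rle = {"size": [3, 4], "counts": [4, 2, 6]}
toy_mask = decode_uncompressed_rle(toy_rle)
```

For the compressed byte string from the question, mask.decode remains the right tool.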

We can convert it to an image.

from PIL import Image

Image.fromarray(decoded_value * 255)

The output image will be:

[Image: the output image]

Sorry, I haven't found a way to convert it to a list of points with pycocotools, but I hope this image helps. :)

Yishi Guo