I am working on an app with a segmentation component. We have tried passing the following input sizes to VNGeneratePersonSegmentationRequest:
- 10x10
- 100x100
- 256x256
- 512x512
- 4096x4096
Even after the first (warm-up) request, we are finding performance across the first four sizes to be almost identical (~60 ms average on an iPhone 13), and somehow 4096x4096 is only 2(!) times slower than 512x512.
This is actually an amazing result if you need very high-resolution segmentation data. Unfortunately, our goal is the opposite: we want to keep generating QualityLevel.accurate masks while reducing the performance hit by processing at a lower resolution.
Has anyone had any success getting VNGeneratePersonSegmentationRequest to run faster on lower-resolution inputs? Possibly I'm missing something in terms of pixel buffer format (we're using 32BGRA) or input type? Assuming this uses the GPU, is there any way to inspect more of the internals of the neural networks/shaders powering the actual segmentation algorithm?
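For reference, here is roughly what we are doing: downscaling the incoming 32BGRA buffer with Core Image, then running the segmentation request on the smaller buffer. The function name and the `side` parameter are just placeholders for this sketch, not our exact production code.

```swift
import Vision
import CoreImage

/// Downscale `sourceBuffer` to `side` x `side`, then run person
/// segmentation on the smaller buffer. Returns the mask buffer, if any.
func segmentPerson(in sourceBuffer: CVPixelBuffer,
                   scaledTo side: Int) throws -> CVPixelBuffer? {
    // Create a smaller 32BGRA destination buffer.
    var scaled: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault, side, side,
                        kCVPixelFormatType_32BGRA, nil, &scaled)
    guard let scaledBuffer = scaled else { return nil }

    // Resize the source image into the destination buffer via Core Image.
    let source = CIImage(cvPixelBuffer: sourceBuffer)
    let sx = CGFloat(side) / source.extent.width
    let sy = CGFloat(side) / source.extent.height
    let resized = source.transformed(by: CGAffineTransform(scaleX: sx, y: sy))
    CIContext().render(resized, to: scaledBuffer)

    // Run the segmentation request on the downscaled buffer.
    let request = VNGeneratePersonSegmentationRequest()
    request.qualityLevel = .accurate
    request.outputPixelFormat = kCVPixelFormatType_OneComponent8

    let handler = VNImageRequestHandler(cvPixelBuffer: scaledBuffer, options: [:])
    try handler.perform([request])

    // The mask comes back as a VNPixelBufferObservation.
    return request.results?.first?.pixelBuffer
}
```

Regardless of the `side` value we feed in here, the timings come out as described above.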
Thanks, Dennis