In Apple's tutorial on performing computer vision tasks within ARKit, they note:
Important
Making sure only one buffer is being processed at a time ensures good performance. The camera recycles a finite pool of pixel buffers, so retaining too many buffers for processing could starve the camera and shut down the capture session. Passing multiple buffers to Vision for processing would slow down processing of each image, adding latency and reducing the amount of CPU and GPU overhead for rendering AR visualizations.
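As I understand it, Apple's sample enforces this by keeping at most one pixel buffer alive at a time and simply dropping frames while Vision is busy. A rough paraphrase of that gating pattern (not their exact code; the names `visionQueue` and the classification request are my stand-ins):

import ARKit
import Vision

class SingleBufferProcessor: NSObject, ARSessionDelegate {
    // The one buffer currently being processed; nil means Vision is idle.
    private var currentBuffer: CVPixelBuffer?
    private let visionQueue = DispatchQueue(label: "com.example.vision-queue")

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // Drop this frame if the previous one is still in flight,
        // so only a single pixel buffer is ever retained at once.
        guard currentBuffer == nil else { return }
        currentBuffer = frame.capturedImage

        let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage)
        visionQueue.async {
            defer {
                // Release the buffer back to the camera's pool once Vision is done.
                DispatchQueue.main.async { self.currentBuffer = nil }
            }
            try? handler.perform([VNClassifyImageRequest()])
        }
    }
}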
In Swift, I've done exactly what they suggest you should not do: I retain multiple buffers in a queue and process them on another thread. Sure enough, ARKit performance suffers, and ARKit frames only display as quickly as I dequeue buffers from my queue. I'd like to better understand the mechanics behind this.
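For context, my setup looks roughly like this (names such as `pendingBuffers` and `processingQueue` are placeholders, not my exact code):

import ARKit
import Vision

class FrameProcessor: NSObject, ARSessionDelegate {
    // Every captured pixel buffer gets retained here until processed.
    private var pendingBuffers: [CVPixelBuffer] = []
    private let bufferLock = NSLock()
    private let processingQueue = DispatchQueue(label: "com.example.frame-processing")

    // ARKit delivers a new frame roughly 60 times per second.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        bufferLock.lock()
        // Retaining the pixel buffer keeps it out of the camera's finite pool.
        pendingBuffers.append(frame.capturedImage)
        bufferLock.unlock()
    }

    // Called periodically from another thread to drain the queue.
    func processNextBuffer() {
        bufferLock.lock()
        guard !pendingBuffers.isEmpty else { bufferLock.unlock(); return }
        let buffer = pendingBuffers.removeFirst()
        bufferLock.unlock()

        processingQueue.async {
            // Stand-in for whatever Vision request is actually being run.
            let handler = VNImageRequestHandler(cvPixelBuffer: buffer)
            try? handler.perform([VNClassifyImageRequest()])
            // `buffer` is only released after Vision finishes with it.
        }
    }
}

With this in place, the ARKit camera feed visibly stalls until processNextBuffer() has drained enough of the queue.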
How does ARKit know that a buffer is being retained? Is there some sort of locking mechanism in Swift?