I am receiving YUV 420 `CMSampleBuffer`s of the screen in my System Broadcast Extension; however, when I attempt to access the underlying bytes, I get inconsistent results: artefacts that are a mixture of (it seems) past and future frames. I am accessing the bytes in order to rotate portrait frames a quarter turn to landscape, but the problem reduces to not being able to correctly copy the texture.
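For context, the CPU deep copy I'm attempting is roughly the following (a minimal sketch, assuming the usual bi-planar 4:2:0 pixel format; `deepCopyPlanes` is just an illustrative name, not my actual code):

```swift
import CoreVideo

/// Naive CPU deep copy of a bi-planar 4:2:0 CVPixelBuffer.
/// This is the kind of copy that shows the artefacts described above.
func deepCopyPlanes(of pixelBuffer: CVPixelBuffer) -> [Data] {
    // Lock for read-only access while we touch the base addresses.
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    var planes: [Data] = []
    for plane in 0..<CVPixelBufferGetPlaneCount(pixelBuffer) {
        guard let base = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, plane) else { continue }
        let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, plane)
        let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, plane)
        // Copy the whole plane, including any row padding, in one go.
        planes.append(Data(bytes: base, count: bytesPerRow * height))
    }
    return planes
}
```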
The pattern of artefacts can change quite a lot. They can be all over the place and seem to have a fundamental "brush shape" that is a square tile, sometimes small, sometimes large, which seems to depend on the failing workaround at hand. They can occur in both the luminance and chroma channels, which results in interesting effects. The "grain" of the artefacts sometimes appears to be horizontal, which I guess is vertical in the original frame.
I do have two functioning workarounds:

- rotate the buffers using `Metal`
- rotate the buffers using `CoreImage` (even a "software" `CIContext` works; a sketch follows this list)
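The CoreImage workaround looks roughly like this (a sketch only; the destination buffer is assumed to come from a pre-allocated `CVPixelBufferPool` with swapped width/height, which is not shown, and `rotate(_:into:)` is an illustrative name):

```swift
import CoreImage
import CoreVideo
import ImageIO

// Software CIContext: avoids the GPU at the cost of CPU time.
let ciContext = CIContext(options: [.useSoftwareRenderer: true])

/// Rotate a portrait pixel buffer a quarter turn and render it into `destination`.
/// `destination` must already be allocated with swapped dimensions and a
/// matching pixel format (e.g. from a CVPixelBufferPool, not shown here).
func rotate(_ source: CVPixelBuffer, into destination: CVPixelBuffer) {
    let image = CIImage(cvPixelBuffer: source).oriented(.right)
    ciContext.render(image, to: destination)
}
```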
The reason that I can't yet ship these workarounds is that System Broadcast Extensions have a very low memory limit of 50 MB, memory usage can spike with these two solutions, and there seem to be interactions with other parts of the system (e.g. the `AVAssetWriter` or the daemon that dumps frames into my address space). I'm still working to understand memory usage here.
The artefacts seem like a synchronisation problem. However, I have a feeling that this is not so much a new frame being written into the buffer that I'm looking at, but rather some sort of stale cache. CPU or GPU? Do GPUs have caches? The tiled nature of the artefacts reminds me of iOS GPUs, but take that with a grain of salt (I'm not a hardware person).
This brings me around to the question title. If this is a caching problem, and `Metal` / `CoreImage` has a consistent view of the pixels, maybe I can get Metal to flush the data I want for me, because a BGRA screen capture being converted to a YUV `IOSurface` has a `Metal` shader written all over it.
So I took the incoming `CMSampleBuffer`'s `CVPixelBuffer`'s `IOSurface`, created an `MTLTexture` from it (with all sorts of `cacheMode`s and `storageMode`s; I haven't tried `hazardTrackingMode`s yet), and then copied the bytes out with `MTLTexture.getBytes(_:bytesPerRow:from:mipmapLevel:)`.
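Concretely, that attempt looks roughly like the following (a sketch for the luma plane only; the pixel format, storage mode, and the assumption that the destination is tightly packed are mine, and rotation and the chroma plane are omitted):

```swift
import Metal
import CoreVideo

/// Copy the luma plane of an IOSurface-backed pixel buffer out through Metal.
/// Sketch only: error handling, the chroma plane, and the rotation are omitted.
func copyLumaViaMetal(from pixelBuffer: CVPixelBuffer,
                      device: MTLDevice,
                      into destination: UnsafeMutableRawPointer) {
    guard let surface = CVPixelBufferGetIOSurface(pixelBuffer)?.takeUnretainedValue() else { return }

    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0)

    let descriptor = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .r8Unorm,   // one byte per luma sample
        width: width,
        height: height,
        mipmapped: false)
    descriptor.usage = .shaderRead
    descriptor.storageMode = .shared   // one of the modes I've been experimenting with

    guard let texture = device.makeTexture(descriptor: descriptor, iosurface: surface, plane: 0) else { return }

    // Pull the texture contents back into CPU memory.
    texture.getBytes(destination,
                     bytesPerRow: width,   // destination assumed tightly packed
                     from: MTLRegionMake2D(0, 0, width, height),
                     mipmapLevel: 0)
}
```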
Yet the problem persists. I would really like to make the CPU deep copy approach work, for memory reasons.
To head off some questions:
- it's not a bytes-per-row issue; that would slant the images
- in the CPU case I do lock the `CVPixelBuffer`'s base address
- I even lock the underlying `IOSurface`
- I have tried discarding `IOSurface`s whose lock seed changes under lock (a sketch of this check follows this list)
- I do discard frames when necessary
- I have tried putting random memory fences and mutexes all over the place (not a hardware person)
- I have not disassembled `CoreImage` yet
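For completeness, the locking and seed check mentioned above look roughly like this (a sketch; `copyIfStable` and the `copyPlanes` closure are illustrative names standing in for whichever copy strategy is under test):

```swift
import CoreVideo
import IOSurface
import Darwin

/// Lock the surface, perform the copy, and report whether the seed changed
/// while the lock was held (in which case the frame should be discarded).
func copyIfStable(_ pixelBuffer: CVPixelBuffer,
                  copyPlanes: (CVPixelBuffer) -> Void) -> Bool {
    guard let surface = CVPixelBufferGetIOSurface(pixelBuffer)?.takeUnretainedValue() else { return false }

    var seedAtLock: UInt32 = 0
    guard IOSurfaceLock(surface, .readOnly, &seedAtLock) == KERN_SUCCESS else { return false }
    defer { IOSurfaceUnlock(surface, .readOnly, nil) }

    copyPlanes(pixelBuffer)

    // If the producer touched the surface while we were copying, the seed changes.
    return IOSurfaceGetSeed(surface) == seedAtLock
}
```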
This question is a continuation of one I posted on the Apple Developer Forums.