I have recently started learning the Metal framework so I can write some filters for my Swift app. I am about to write a Metal kernel that dithers a picture using error diffusion dithering. Each pixel is quantized to the nearest available color, and the resulting quantization error is then distributed to neighbouring pixels that have not been processed yet. Because that error keeps spreading across the whole image as each pixel is calculated, all the pixels are dependent on each other. My example is a Floyd-Steinberg dither.
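To make the dependency concrete, here is a minimal CPU-side sketch of the dither I have in mind, written in plain C++ rather than as a Metal kernel. It assumes a single-channel grayscale image with values in [0, 1] and a fixed 0.5 threshold; the function name `floydSteinbergDither` is just a name I made up for illustration, and the weights are the usual Floyd-Steinberg ones (7/16, 3/16, 5/16, 1/16).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Metal): dithers a grayscale image in place.
// `pixels` holds width * height values in [0, 1], stored row by row.
void floydSteinbergDither(std::vector<float>& pixels,
                          std::size_t width, std::size_t height) {
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t i = y * width + x;

            // Quantize the current pixel to black (0) or white (1).
            const float oldValue = pixels[i];
            const float newValue = oldValue < 0.5f ? 0.0f : 1.0f;
            pixels[i] = newValue;

            // The quantization error is pushed onto pixels not yet visited,
            // which is why pixel (x, y) needs every earlier pixel to be done.
            const float error = oldValue - newValue;

            if (x + 1 < width) {
                pixels[i + 1] += error * 7.0f / 16.0f;             // right
            }
            if (y + 1 < height) {
                if (x > 0) {
                    pixels[i + width - 1] += error * 3.0f / 16.0f; // below left
                }
                pixels[i + width] += error * 5.0f / 16.0f;         // below
                if (x + 1 < width) {
                    pixels[i + width + 1] += error * 1.0f / 16.0f; // below right
                }
            }
        }
    }
}
```

With this sketch, a 2×2 image filled with 0.25 comes out as three black pixels and one white pixel in the bottom right corner, because the error from the first three pixels accumulates there. Each iteration reads values that earlier iterations have already modified, so the loop order matters.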
With the way Metal deals with threading, this dithering method won’t work. When dithering an image this way, the pixels can only be computed in order from first to last. Is it possible to have a kernel that doesn’t involve threading, or a way to have the whole image array computed by a single thread?