Error diffusion dither image using CIFilter

Question

I am trying to dither an image. I have written some Swift code that applies Floyd-Steinberg dithering, but it takes a long time to process an image because it is plain Swift code rather than something wrapped in a CIFilter. I am thinking that if I can make a custom CIFilter, it would be processed on the GPU and speed up the process. However, I am not an expert in the Core Image kernel language.

This is my Swift code. I have written the error distribution matrix calculations out in full for the sake of clarity.

    internal struct color {
        let r: Int
        let g: Int
        let b: Int
    }

    func ditherImage2() {
        let image = UIImage(named: "image")
        let width = Int(image!.size.width)
        let height = Int(image!.size.height)
        // pixelarray(_:) is a helper that is not shown here; it is assumed to
        // return an optional, row-major array with one color per pixel
        guard var pixelArray = pixelarray(image) else { return }

        func offset(row: Int, column: Int) -> Int {
            return row * width + column
        }

        for y in 0 ..< height {
            for x in 0 ..< width {
                let currentOffset = offset(row: y, column: x)
                let currentColor = pixelArray[currentOffset]
                // get current colour of pixel
                let oldR = currentColor.r
                let oldG = currentColor.g
                let oldB = currentColor.b
                // quantize: reduce each channel to (factor + 1) levels,
                // i.e. 8 colours in total for factor = 1
                let factor = 1
                let newR = Int((Double(factor * oldR) / 255).rounded()) * (255 / factor)
                let newG = Int((Double(factor * oldG) / 255).rounded()) * (255 / factor)
                let newB = Int((Double(factor * oldB) / 255).rounded()) * (255 / factor)
                pixelArray[currentOffset] = color(r: newR, g: newG, b: newB)

                let errR = oldR - newR
                let errG = oldG - newG
                let errB = oldB - newB

                // distribute the error to the surrounding pixels using the
                // Floyd-Steinberg matrix, skipping neighbours outside the image

                // right neighbour (x + 1, y): 7/16
                if x + 1 < width {
                    let index = offset(row: y, column: x + 1)
                    let c = pixelArray[index]
                    let r = c.r + errR * 7 / 16
                    let g = c.g + errG * 7 / 16
                    let b = c.b + errB * 7 / 16
                    pixelArray[index] = color(r: r, g: g, b: b)
                }

                // bottom-left neighbour (x - 1, y + 1): 3/16
                if x > 0 && y + 1 < height {
                    let index2 = offset(row: y + 1, column: x - 1)
                    let c2 = pixelArray[index2]
                    let r2 = c2.r + errR * 3 / 16
                    let g2 = c2.g + errG * 3 / 16
                    let b2 = c2.b + errB * 3 / 16
                    pixelArray[index2] = color(r: r2, g: g2, b: b2)
                }

                // bottom neighbour (x, y + 1): 5/16
                if y + 1 < height {
                    let index3 = offset(row: y + 1, column: x)
                    let c3 = pixelArray[index3]
                    let r3 = c3.r + errR * 5 / 16
                    let g3 = c3.g + errG * 5 / 16
                    let b3 = c3.b + errB * 5 / 16
                    pixelArray[index3] = color(r: r3, g: g3, b: b3)
                }

                // bottom-right neighbour (x + 1, y + 1): 1/16
                if x + 1 < width && y + 1 < height {
                    let index4 = offset(row: y + 1, column: x + 1)
                    let c4 = pixelArray[index4]
                    let r4 = c4.r + errR * 1 / 16
                    let g4 = c4.g + errG * 1 / 16
                    let b4 = c4.b + errB * 1 / 16
                    pixelArray[index4] = color(r: r4, g: g4, b: b4)
                }
            }
        }
    }

I have found this repository, https://github.com/rhoeper/Filterpedia-Swift4, which includes a custom filter for ordered dithering that I could use as a base and attempt to adapt to error diffusion dithering. I would prefer to find an existing custom kernel that does the job before jumping into learning the Core Image kernel language. So I am wondering if anyone has an existing kernel or a link to one?

Ordered dithering code:

    float orderedDither2x2(float colorin, float bx, float by, float errorIntensity)
    {
        float error = 0.0;
        int px = int(bx);
        int py = int(by);
        if (py == 0) {
            if (px == 0) { error = 1.0 / 4.0; }
            if (px == 1) { error = 3.0 / 4.0; }
        }
        if (py == 1) {
            if (px == 0) { error = 4.0 / 4.0; }
            if (px == 1) { error = 2.0 / 4.0; }
        }
        return colorin * (error * errorIntensity);
    }

    kernel vec4 ditherBayer(sampler image, float intensity, float matrix, float palette)
    {
        vec4 pixel = sample(image, samplerCoord(image));
        int msize = int(matrix);

        float px = mod(pixel.x, msize >= 5 ? float(4.0) : float(msize));
        float py = mod(pixel.y, msize >= 5 ? float(4.0) : float(msize));

        float red = pixel.r;
        float green = pixel.g;
        float blue = pixel.b;

        if (msize == 2) {
            pixel.r = orderedDither2x2(red, px, py, intensity);
            pixel.g = orderedDither2x2(green, px, py, intensity);
            pixel.b = orderedDither2x2(blue, px, py, intensity);
        }

        if (msize == 3) {
            pixel.r = orderedDither3x3(red, px, py, intensity);
            pixel.g = orderedDither3x3(green, px, py, intensity);
            pixel.b = orderedDither3x3(blue, px, py, intensity);
        }

        if (msize == 4) {
            pixel.r = orderedDither4x4(red, px, py, intensity);
            pixel.g = orderedDither4x4(green, px, py, intensity);
            pixel.b = orderedDither4x4(blue, px, py, intensity);
        }

        if (msize >= 5) {
            pixel.r = orderedDither8x8(red, px, py, intensity);
            pixel.g = orderedDither8x8(green, px, py, intensity);
            pixel.b = orderedDither8x8(blue, px, py, intensity);
        }

        if (int(palette) == 0) { return vec4(binary(vec3(pixel.r, pixel.g, pixel.b)), pixel.a); }
        if (int(palette) == 1) { return vec4(commodore64(vec3(pixel.r, pixel.g, pixel.b)), pixel.a); }
        if (int(palette) == 2) { return vec4(vic20(vec3(pixel.r, pixel.g, pixel.b)), pixel.a); }
        if (int(palette) == 3) { return vec4(appleII(vec3(pixel.r, pixel.g, pixel.b)), pixel.a); }
        if (int(palette) == 4) { return vec4(zxSpectrumBright(vec3(pixel.r, pixel.g, pixel.b)), pixel.a); }
        if (int(palette) == 5) { return vec4(zxSpectrumDim(vec3(pixel.r, pixel.g, pixel.b)), pixel.a); }

        return pixel;
    }

@FrankSchlegel - I've added my Swift code. It's a basic error distribution dither. The matrix error calculations for neighbouring pixels can be condensed into a loop, but I wrote them out in full for the sake of clarity. – carl, Jan 03 '20 at 11:02

score 2 · Answer 1 · answered Jan 04 '20 at 18:59

The problem with Floyd-Steinberg dithering is that it's a serial algorithm – the color value of a result pixel depends on pixels that were previously computed. Core Image (like any kind of SIMD parallelization technique) is not well suited to this kind of problem, because it is designed to perform the same task on all pixels concurrently.

However, I found some approaches for partially parallelizing the computation of independent pixels on the GPU and even an interesting CPU-GPU-hybrid approach.
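To give a rough idea of what partial parallelization can look like, here is a minimal sketch of one commonly described scheme, a diagonal "wavefront" schedule. It is an illustration under stated assumptions, not necessarily what the linked approaches do: the image is a grayscale, row-major `[Int]` with values 0...255, the output is 1-bit, and the names `diffuse`, `ditherSerial` and `ditherWavefront` are made up for this example. Pixel (x, y) receives error only from (x-1, y), (x-1, y-1), (x, y-1) and (x+1, y-1), and all of those have a smaller value of t = x + 2y, so pixels that share the same t do not depend on each other.

```swift
/// Quantizes one pixel to 0 or 255 and pushes the error to its neighbours
/// (hypothetical helper for this sketch; grayscale, row-major storage).
func diffuse(_ px: inout [Int], x: Int, y: Int, width: Int, height: Int) {
    let i = y * width + x
    let old = px[i]
    let new = old < 128 ? 0 : 255
    px[i] = new
    let err = old - new
    if x + 1 < width { px[i + 1] += err * 7 / 16 }
    if y + 1 < height {
        if x > 0 { px[i + width - 1] += err * 3 / 16 }
        px[i + width] += err * 5 / 16
        if x + 1 < width { px[i + width + 1] += err * 1 / 16 }
    }
}

/// Serial reference: plain left-to-right, top-to-bottom scan.
func ditherSerial(_ input: [Int], width: Int, height: Int) -> [Int] {
    var px = input
    for y in 0..<height {
        for x in 0..<width {
            diffuse(&px, x: x, y: y, width: width, height: height)
        }
    }
    return px
}

/// Same arithmetic, but pixels are visited wavefront by wavefront (t = x + 2y).
func ditherWavefront(_ input: [Int], width: Int, height: Int) -> [Int] {
    var px = input
    guard width > 0, height > 0 else { return px }
    let lastWave = (width - 1) + 2 * (height - 1)
    for t in 0...lastWave {
        // The quantization decisions inside this inner loop do not depend on
        // each other, so on a GPU this loop could become one parallel dispatch.
        // Two pixels of one wavefront can still add error to the same target
        // pixel, so a truly parallel version would need atomic adds or a
        // "gather" formulation; here the loop simply runs sequentially.
        for y in 0..<height {
            let x = t - 2 * y
            if x >= 0 && x < width {
                diffuse(&px, x: x, y: y, width: width, height: height)
            }
        }
    }
    return px
}
```

Because integer additions commute, the wavefront order produces exactly the same output as the serial scan. The catch is that an image of width w and height h still needs w + 2(h - 1) sequential steps, each with at most about w/2 independent pixels, which is far less parallelism than a per-pixel filter gets.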

Unfortunately, Core Image is probably not the best framework for implementing those techniques since CIFilters are limited in what GPU resources they can leverage (no access to global memory, for example). You could instead use Metal compute shaders directly (instead of through Core Image), which will require a lot more support code, though.

If you don't necessarily need error diffusion, you could still use ordered dithering (which can be highly parallelized) to achieve similar results. I also found a nice article about that. The built-in CIDither filter is probably also using this approach.
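For comparison, here is a minimal sketch of ordered dithering in plain Swift. It assumes a grayscale image stored as a row-major `[Int]` with values 0...255 and uses the standard 2x2 Bayer matrix; the function name `orderedDither` is made up for this example, and the code is not taken from Filterpedia or from CIDither. The point is that each output pixel depends only on its own input value and its position, which is why this kind of dithering maps well onto a per-pixel kernel.

```swift
/// 1-bit ordered dithering with a 2x2 Bayer matrix (illustrative sketch).
func orderedDither(_ input: [Int], width: Int, height: Int) -> [Int] {
    // Standard 2x2 Bayer index matrix, addressed as bayer2x2[y % 2][x % 2].
    let bayer2x2 = [[0, 2], [3, 1]]
    var output = [Int](repeating: 0, count: input.count)
    for y in 0..<height {
        for x in 0..<width {
            // Threshold (m + 0.5) / 4 scaled to 0...255; it depends only on
            // the pixel position, never on neighbouring results.
            let m = bayer2x2[y % 2][x % 2]
            let threshold = (2 * m + 1) * 255 / 8
            let i = y * width + x
            output[i] = input[i] > threshold ? 255 : 0
        }
    }
    return output
}
```

With this matrix a flat 50% gray area turns into a checkerboard in which half of the pixels are white, so the average brightness is preserved without passing any error between pixels.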

Referring to the ARM article: does Metal allow creating barriers and, if so, how is that done? Or are there other ways to make sure that pixels are only processed when the appropriate surrounding pixels are settled? More generally: do you think that Floyd-Steinberg dithering should be doable with a custom Metal shader? – meaning-matters, Dec 28 '21 at 15:36
