
I made a small pixel-sorting app in pure Java. It works fine, but the performance is bad. I've heard that RenderScript is meant for exactly this kind of work.

I wrote some code, but C99 is new to me, so I know something is missing. This is my little test script:

#pragma version(1)
#pragma rs java_package_name(com.simahero.pixelsort)
#pragma rs_fp_relaxed

float treshhold = 0.f;

static void swap(uchar4 *xp, uchar4 *yp)
{
    uchar4 temp = *xp;
    *xp = *yp;
    *yp = temp;
}

static void selectionSort(uchar4 arr[], int n)
{
    int i, j, min_idx;

    for (i = 0; i < n-1; i++)
    {
        min_idx = i;
        for (j = i+1; j < n; j++)
        if (arr[j].r < arr[min_idx].r)
            min_idx = j;

        swap(&arr[min_idx], &arr[i]);
    }
}

rs_allocation RS_KERNEL invert(rs_allocation in) {

    for (int i = 0; i < rsAllocationGetDimY(in); i++){
        uchar4 row[rsAllocationGetDimX(in)];
        for (int j = 0; j < rsAllocationGetDimX(in); j++){
            uchar4 pixel = rsGetElementAt_uchar4(in, i, j);
            row[j] = pixel;
        }
        selectionSort(row, rsAllocationGetDimX(in));
     }
  return in;
}

void process(rs_allocation inputImage, rs_allocation outputImage) {
   outputImage = invert(inputImage);
}

I simply invoke it from an AsyncTask, but the resulting Bitmap is empty. I can't tell why, because I don't know how to debug RenderScript.

script.invoke_process(mInAllocation, outputAllocation);
outputAllocation.copyTo(bo);
Zoltán R

1 Answer


You copy every row of your image and then sort the copy, but you never write the result back (no rsSetElementAt call is invoked anywhere). Even once you do, I don't think you'll get satisfying performance with that approach. I would write a kernel that is executed once per row of your input allocation (check out the LaunchOptions of RenderScript kernels), so that at least the rows are processed in parallel. That would look like:

rs_allocation allocIn;
rs_allocation allocOut;

void RS_KERNEL sortRows(uchar4 in, int x, int y) {
    // With proper launch options, x always stays the same, while this kernel
    // is called in parallel for all rows of the input allocation (= different y values).
    for (int currentColumn = 0; currentColumn < rsAllocationGetDimX(allocIn); currentColumn++) {
        // Do your work for this row (using y as the row index).
        // Use rsGetElementAt and rsSetElementAt calls only (avoid copies for speed).
    }
}
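
To make the "write the result back" point concrete, here is a minimal sketch of what the per-row work could look like. It is plain C99 rather than RenderScript, so it can be compiled and run off-device. The names pixel_t, image_t, get_pixel, set_pixel, sort_row and sort_all_rows are hypothetical stand-ins I made up for illustration: get_pixel and set_pixel play the role of rsGetElementAt_uchar4 and rsSetElementAt_uchar4, and the loop in sort_all_rows is a sequential stand-in for a launch that calls the kernel once per row. It sorts in place and reuses the selection sort from the question, so it is not tuned for speed.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical plain-C stand-in for RenderScript's uchar4. */
typedef struct { uint8_t r, g, b, a; } pixel_t;

/* Hypothetical stand-in for an rs_allocation: a row-major pixel buffer. */
typedef struct {
    pixel_t *data;
    int width;
    int height;
} image_t;

/* Plays the role of rsGetElementAt_uchar4(alloc, x, y). */
static pixel_t get_pixel(const image_t *img, int x, int y) {
    return img->data[(size_t)y * img->width + x];
}

/* Plays the role of rsSetElementAt_uchar4(alloc, value, x, y). */
static void set_pixel(image_t *img, pixel_t p, int x, int y) {
    img->data[(size_t)y * img->width + x] = p;
}

/* Per-row work: selection sort of row y by red channel.
   Every swap is written back through set_pixel, so the image itself changes. */
static void sort_row(image_t *img, int y) {
    for (int i = 0; i < img->width - 1; i++) {
        int min_idx = i;
        for (int j = i + 1; j < img->width; j++) {
            if (get_pixel(img, j, y).r < get_pixel(img, min_idx, y).r) {
                min_idx = j;
            }
        }
        if (min_idx != i) {
            pixel_t tmp = get_pixel(img, i, y);
            set_pixel(img, get_pixel(img, min_idx, y), i, y);
            set_pixel(img, tmp, min_idx, y);
        }
    }
}

/* Sequential stand-in for the launch: one call per row (x fixed, y varies). */
static void sort_all_rows(image_t *img) {
    for (int y = 0; y < img->height; y++) {
        sort_row(img, y);
    }
}
```

Each row only touches its own pixels, which is why the rows can be handed to separate kernel invocations without any synchronization between them.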
SerialSensor
  • I might be misunderstanding something, but from reading about launch options, the kernel works on a sub-image defined by the given coordinates. If so, I would have to loop through every row in Java code, which would make it non-parallel. As mentioned in the docs here: https://developer.android.com/guide/topics/renderscript/compute#single-source-rs I could write a single-source script and loop through the rows there, couldn't I? – Zoltán R Apr 10 '20 at 21:14
  • ```
    public void createScript() {
        rs = RenderScript.create(this);
        Allocation in = Allocation.createFromBitmap(rs, bitmap);
        Allocation out = Allocation.createTyped(rs, in.getType());
        Script.LaunchOptions options = new Script.LaunchOptions();
        options.setY(0, 1);
        horizontalSort = new ScriptC_horizontalSort(rs);
        horizontalSort.set_allocIn(in);
        horizontalSort.set_allocOut(out);
        horizontalSort.forEach_sortRows(in, options);
    }
    ``` – Zoltán R Apr 10 '20 at 21:30
  • If you want to loop over y, you should use options.setX(0, 1) for the launch options (this freezes the x position for all calls of your kernel). The forEach_sortRows() call will then execute your kernel "for each" row (in parallel on the CPU or GPU). – SerialSensor Apr 12 '20 at 07:32