
I want to count the pixels of a bitmap using the following RenderScript code:

RenderScript

Filename: counter.rs

#pragma version(1)
#pragma rs java_package_name(com.mypackage)
#pragma rs_fp_relaxed

uint count; // initialized in Java
void countPixels(uchar4* unused, uint x, uint y) {
  rsAtomicInc(&count);
}

Java

Application context = ...; // The application context
RenderScript rs = RenderScript.create(context);

Bitmap bitmap = ...; // A random bitmap
Allocation allocation = Allocation.createFromBitmap(rs, bitmap);

ScriptC_Counter script = new ScriptC_Counter(rs);
script.set_count(0);
script.forEach_countPixels(allocation);

allocation.syncAll(Allocation.USAGE_SCRIPT);
long count = script.get_count();

Error

This is the error message I get:

ERROR: Address not found for count

Questions

  • Why doesn't my code work?
  • How can I fix it?

winklerrr

2 Answers


As a side note, it is usually not good practice to use atomic operations in parallel computing unless you have to. RenderScript actually provides reduction kernels for this kind of application. Maybe you can give them a try.

There are several problems with the code:

  1. The variable "count" should have been declared "volatile"
  2. countPixels should have been "void RS_KERNEL countPixels(uchar4 in)"
  3. script.get_count() will not give you the up-to-date value of "count"; you have to get the value back with an Allocation.

If you have to use rsAtomicInc, a good example is actually the RenderScript CTS tests:

AtomicTest.rs

AtomicTest.java
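
The side note about preferring a reduction over a shared atomic counter can be sketched in plain Java. This is only an analogy built on java.util.stream and AtomicInteger, not RenderScript code, and the names CountSketch, countWithAtomic and countWithReduction are hypothetical. In both variants the result is read once, after all parallel work has finished.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class CountSketch {
    // Counts "pixels" by bumping one shared atomic counter from every worker.
    static int countWithAtomic(int width, int height) {
        AtomicInteger count = new AtomicInteger(0);
        IntStream.range(0, width * height)
                 .parallel()
                 .forEach(i -> count.incrementAndGet());
        // Read the counter once, after the parallel loop has completed.
        return count.get();
    }

    // Counts the same pixels as a reduction: every element contributes 1
    // and the partial sums are combined, so no shared counter is needed.
    static int countWithReduction(int width, int height) {
        return IntStream.range(0, width * height)
                        .parallel()
                        .map(i -> 1)
                        .sum();
    }

    public static void main(String[] args) {
        System.out.println(countWithAtomic(64, 48));
        System.out.println(countWithReduction(64, 48));
    }
}
```

Both methods return width * height. The reduction form avoids contention on a single memory location, which is the usual reason to prefer it.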

Miao Wang

Here is my working solution.

RenderScript

Filename: counter.rs

#pragma version(1)
#pragma rs java_package_name(com.mypackage)
#pragma rs_fp_relaxed

int32_t count = 0;
rs_allocation rsAllocationCount;

void countPixels(uchar4* unused, uint x, uint y) {
  rsAtomicInc(&count);
  rsSetElementAt_int(rsAllocationCount, count, 0);
}

Java

Context context = ...;
RenderScript renderScript = RenderScript.create(context);

Bitmap bitmap = ...; // A random bitmap
Allocation allocationBitmap = Allocation.createFromBitmap(renderScript, bitmap);
Allocation allocationCount = Allocation.createTyped(renderScript, Type.createX(renderScript, Element.I32(renderScript), 1));

ScriptC_Counter script = new ScriptC_Counter(renderScript);
script.set_rsAllocationCount(allocationCount);
script.forEach_countPixels(allocationBitmap);

int[] count = new int[1];
allocationBitmap.syncAll(Allocation.USAGE_SCRIPT);
allocationCount.copyTo(count);

// The count is now available in count[0]
int pixelCount = count[0];
winklerrr
  • This isn't right. You can't write to the same memory location from multiple threads without an atomic operation. You should write an invokable single-threaded function containing that rsSetElementAt call and call it after invoking the kernel. – sakridge Jan 24 '17 at 18:44
  • That's why I use the `rsAtomicInc()` – winklerrr Jan 24 '17 at 18:48
  • rsAtomicInc is fine. It's the rsSetElementAt call that is the problem. It is a write into rsAllocationCount from multiple threads. You are not guaranteed to get the correct answer. – sakridge Jan 24 '17 at 19:06
  • It doesn't matter from which thread the last call of `rsSetElementAt_int()` happens, because it uses the count variable, which was atomically increased and therefore holds the correct value. – winklerrr Jan 25 '17 at 11:47
  • No. Atomic only ensures that the count variable will be correct after execution. Consider a simple example where the program is: "atomic_inc &count; load &count -> r0; store r0 -> rsAllocationCount[0]". You have 2 threads, 0 and 1, and one CPU. Thread 0 is scheduled, does the atomic increment and then loads r0 with count, which is now 1. Thread 0 is interrupted, and thread 1 is scheduled and finishes the program: it increments count to 2 and stores 2 to memory. Then thread 0 is scheduled again and writes r0, which is still 1, to memory. Now you have the wrong answer. – sakridge Jan 25 '17 at 19:40
  • Correctness aside, you are doing many more writes than necessary. It would be much faster to avoid all of them and do a single write at the end with the final result. – sakridge Jan 26 '17 at 03:09
  • That's correct, but it doesn't change the fact that my answer solves my problem. If you post an answer with even better performance, I will accept yours as the best answer. – winklerrr Jan 26 '17 at 09:05
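
The two-thread schedule described by sakridge in the comments above can be replayed step by step in plain Java. This is a single-threaded simulation of that interleaving, not RenderScript code: the class LostUpdateReplay and its variables are hypothetical stand-ins, with an AtomicInteger playing the role of count and a one-element int array playing the role of rsAllocationCount.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LostUpdateReplay {
    // Replays the schedule from the comments in a fixed order on one thread.
    // Returns { final count, final allocation[0] }.
    static int[] replay() {
        AtomicInteger count = new AtomicInteger(0); // stands in for the script global "count"
        int[] allocation = new int[1];              // stands in for rsAllocationCount[0]

        // Thread 0: atomic increment, then load count into its register r0.
        count.incrementAndGet();
        int r0Thread0 = count.get();   // r0 of thread 0 holds 1

        // Thread 0 is interrupted; thread 1 runs the whole kernel body.
        count.incrementAndGet();
        int r0Thread1 = count.get();   // r0 of thread 1 holds 2
        allocation[0] = r0Thread1;     // thread 1 stores 2

        // Thread 0 resumes and stores its stale register value.
        allocation[0] = r0Thread0;     // overwrites 2 with 1

        return new int[] { count.get(), allocation[0] };
    }

    public static void main(String[] args) {
        int[] result = replay();
        // prints "count = 2, allocation[0] = 1"
        System.out.println("count = " + result[0] + ", allocation[0] = " + result[1]);
    }
}
```

The atomic counter ends at the right value, but the separately stored copy does not, which is why the copy should be written once after the kernel has finished.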