
I'd like to copy bytes from a bitmap to memory, writing only the bytes whose value differs from a given transparency byte value.

Schematically, I'd like to do:

for (char *src=start;src<end;src++,dst++) 
{
    if (*src!=VALUE) {
       *dst=*src;
    }
}

That is, only the bytes that differ from a given value are written, in C or assembly (or C back-translated from assembly).

To go faster, I've considered using 32-bit loads, the SEL instruction to choose between src and dst, and a 32-bit store. However, SEL takes its mask from APSR.GE, so I need to set those flags first.

If I'm not mistaken, a SIMD comparison against VALUE (using USUB8) only tells whether each byte is >= or < VALUE; it cannot test for equality directly. (Of course, you could restrict VALUE to 0 or 255 and call it a day...)

Another possibility would be to apply a precomputed mask to src and then set APSR.GE manually (is that possible?), but 1) it uses memory, 2) it's not always feasible to have the data beforehand, and 3) I'm not sure it would really be faster than byte-by-byte access.
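To make that limitation concrete, here is a small portable C model of the GE flags that USUB8 produces. It is only an emulation for illustration, not the real instruction, and the name `usub8_ge` is made up for this sketch.

```c
#include <stdint.h>

/* Portable model of the GE flags produced by USUB8 Rd, Rn, Rm.
   Bit i of the result is 1 when byte i of rn is >= byte i of rm
   (unsigned compare, i.e. the per-byte subtraction did not borrow). */
static unsigned usub8_ge(uint32_t rn, uint32_t rm)
{
    unsigned ge = 0;
    for (int i = 0; i < 4; i++) {
        uint8_t a = (uint8_t)(rn >> (8 * i));
        uint8_t b = (uint8_t)(rm >> (8 * i));
        if (a >= b)
            ge |= 1u << i;
    }
    return ge;
}
```

With VALUE = 0x1c replicated in `rm`, a source byte of 0x1c and a source byte of 0x1d both set their GE bit, so a single USUB8 cannot tell "equal" apart from "greater".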

artless noise
makapuf
  • It might be good to know what VALUE is. Just a single-bit-set value? Can src or dst contain that VALUE bit set? With that information one could also develop a copy with logical expressions, getting rid of the if statement. – auselen Sep 07 '12 at 13:33
  • VALUE is a byte, say 0x1c for example (it could be another one). src will most definitely contain 0x1c values (meaning 'transparent'); dst could also contain it if it was there before the paste. (Think of a transparent color, as in GIFs for example.) – makapuf Sep 07 '12 at 14:26
  • You probably know what I want to say: if you check the wiki link, you can see it is possible to use AND/OR to do this kind of blitting, but of course only if choosing the values is within your reach. http://en.wikipedia.org/wiki/Bit_blit#Technique – auselen Sep 07 '12 at 14:28

2 Answers


Exact syntax escapes me for now but how about something like this:

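  • Load four bytes from the existing image into Ra (LDR)
  • Load four bytes from the source image into Rb (LDR)
  • XOR Ra with a mask made of VALUE replicated into all four bytes, so that bytes equal to VALUE become 0 (EOR); XORing with ~VALUE would turn them into 0xFF instead
  • XOR Rb with the same mask as above (EOR)
  • Do a USUB8 that subtracts Rb from a register holding 0, so that a GE flag is set only for the byte lanes where Rb is 0, i.e. where the source byte was VALUE (USUB8)
  • Use SEL to select the existing image bytes where GE is set and the source image bytes elsewhere, writing the result to Rc (SEL)
  • XOR Rc with the mask again to restore the original bytes (EOR)
  • Write Rc back into the existing image (STR)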
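The steps above can be sketched in portable C by emulating USUB8 and SEL. This is a simplified variant, not a definitive implementation: it XORs only a scratch copy of the source word, so no restoring XOR is needed. The names `usub8_ge`, `sel` and `blit_xor` are hypothetical; on a core with the DSP extension, the two helpers would presumably map to the CMSIS intrinsics `__USUB8` and `__SEL`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emulated GE flags of USUB8: bit i set when byte i of rn >= byte i of rm. */
static unsigned usub8_ge(uint32_t rn, uint32_t rm)
{
    unsigned ge = 0;
    for (int i = 0; i < 4; i++)
        if ((uint8_t)(rn >> (8 * i)) >= (uint8_t)(rm >> (8 * i)))
            ge |= 1u << i;
    return ge;
}

/* Emulated SEL: byte i comes from rn where GE bit i is set, else from rm. */
static uint32_t sel(unsigned ge, uint32_t rn, uint32_t rm)
{
    uint32_t mask = 0;
    for (int i = 0; i < 4; i++)
        if (ge & (1u << i))
            mask |= 0xFFu << (8 * i);
    return (rn & mask) | (rm & ~mask);
}

/* Copy src to dst, skipping bytes equal to value (hypothetical helper). */
static void blit_xor(uint8_t *dst, const uint8_t *src, size_t n, uint8_t value)
{
    const uint32_t value4 = value * 0x01010101u; /* VALUE in all 4 bytes */
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t d, s;
        memcpy(&d, dst + i, 4);            /* LDR existing image */
        memcpy(&s, src + i, 4);            /* LDR source image */
        uint32_t t = s ^ value4;           /* EOR: transparent bytes become 0 */
        unsigned ge = usub8_ge(0, t);      /* USUB8: GE set only where t == 0 */
        uint32_t r = sel(ge, d, s);        /* SEL: keep dst where transparent */
        memcpy(dst + i, &r, 4);            /* STR */
    }
    for (; i < n; i++)                     /* byte-wise tail */
        if (src[i] != value)
            dst[i] = src[i];
}
```

Calling `blit_xor(dst, src, n, 0x1c)` leaves every dst byte untouched where the matching src byte is 0x1c and overwrites the rest.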
Pete Fordham

You may not need this any more, but for newcomers who may need a similar algorithm, here's what I would suggest:

Having

  • VALUE_4 : 4 byte VALUE (Byte replicated on all 4 bytes)
  • SRC : 4 image bytes
  • DST : 4 destination bytes

Using USUB8 as a comparison (GE set means the per-byte result is ">= 0", so GE clear means "< 0"):

  • USUB8(SRC, VALUE_4) => sets the GE bits where SRC >= VALUE
  • DST = SEL(DST, SRC) => takes from SRC the bytes whose value is strictly less than VALUE
  • USUB8(VALUE_4, SRC) => sets the GE bits where VALUE >= SRC
  • DST = SEL(DST, SRC) => takes from SRC the bytes whose value is strictly greater than VALUE

Your loop would consist of 7 operations (2 Loads, 1 Store, 2 USUB8, 2 SEL) plus loop management.
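As a rough illustration of the sequence above, here is a portable C sketch that emulates USUB8 and SEL in software. The helper names `usub8_ge`, `sel` and `blit_two_pass` are assumptions made for this example; on real hardware the two helpers would presumably be the instructions themselves (CMSIS exposes them as `__USUB8` and `__SEL`).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emulated GE flags of USUB8: bit i set when byte i of rn >= byte i of rm. */
static unsigned usub8_ge(uint32_t rn, uint32_t rm)
{
    unsigned ge = 0;
    for (int i = 0; i < 4; i++)
        if ((uint8_t)(rn >> (8 * i)) >= (uint8_t)(rm >> (8 * i)))
            ge |= 1u << i;
    return ge;
}

/* Emulated SEL: byte i comes from rn where GE bit i is set, else from rm. */
static uint32_t sel(unsigned ge, uint32_t rn, uint32_t rm)
{
    uint32_t mask = 0;
    for (int i = 0; i < 4; i++)
        if (ge & (1u << i))
            mask |= 0xFFu << (8 * i);
    return (rn & mask) | (rm & ~mask);
}

/* Copy src to dst, skipping bytes equal to value (hypothetical helper). */
static void blit_two_pass(uint8_t *dst, const uint8_t *src, size_t n,
                          uint8_t value)
{
    const uint32_t value4 = value * 0x01010101u; /* VALUE_4 */
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        uint32_t d, s;
        memcpy(&d, dst + i, 4);                  /* load DST */
        memcpy(&s, src + i, 4);                  /* load SRC */
        d = sel(usub8_ge(s, value4), d, s);      /* take SRC where SRC < VALUE */
        d = sel(usub8_ge(value4, s), d, s);      /* take SRC where SRC > VALUE */
        memcpy(dst + i, &d, 4);                  /* store DST */
    }
    for (; i < n; i++)                           /* byte-wise tail */
        if (src[i] != value)
            dst[i] = src[i];
}
```

Bytes equal to VALUE set the GE bit in both comparisons, so both SEL steps keep the destination byte for them; every other byte is taken from the source in exactly one of the two passes.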

Brad Larson
Thibaut