I am using raylib, which uses 32-bit RGBA colours. I searched the cheatsheet but could not find an appropriate routine. I want to multiply two colours as if they were vec4s in OpenGL (each channel ranging from 0 to 1). I already have a working multiplication, but it is a rather slow operation involving a lot of floating-point arithmetic for something so simple:

uint8_t multiplyBytes(uint8_t a, uint8_t b) {
    const float invertedByte = 1/255.0;
    return (uint8_t)(((a * invertedByte) * (b * invertedByte)) * 255.0);
}

Is there a better way to do this, or should I stick with this solution?
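For context, the per-channel routine is meant to be called once for each of the four channels. A minimal sketch of that, assuming a hypothetical `Rgba8` struct and a hypothetical `multiplyColours` helper (neither name comes from raylib; they only stand in for a 32-bit RGBA colour):

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit RGBA colour. The name and field
   layout are assumptions for illustration, not raylib declarations. */
typedef struct {
    uint8_t r, g, b, a;
} Rgba8;

/* Per-channel multiply from the question: scale both bytes to [0, 1],
   multiply, scale back to [0, 255] and truncate. */
static uint8_t multiplyBytes(uint8_t a, uint8_t b) {
    const float invertedByte = 1/255.0;
    return (uint8_t)(((a * invertedByte) * (b * invertedByte)) * 255.0);
}

/* Hypothetical helper: multiplies two colours channel by channel,
   the way two vec4 values would be multiplied component-wise. */
static Rgba8 multiplyColours(Rgba8 x, Rgba8 y) {
    Rgba8 out;
    out.r = multiplyBytes(x.r, y.r);
    out.g = multiplyBytes(x.g, y.g);
    out.b = multiplyBytes(x.b, y.b);
    out.a = multiplyBytes(x.a, y.a);
    return out;
}
```

Because the result is truncated rather than rounded, a product that lands just below a whole number in floating point may come out one step lower than expected.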

Jakub Dóka
  • Where are the 32 bit colours in this? An `unsigned short` is _usually_ 16 bits. On that topic, use fixed width types, like `uint8_t`, `uint16_t`, `uint32_t` and `uint64_t` instead. – Ted Lyngmo Jun 07 '21 at 14:26
  • Well, I don't usually use C; I forgot that a byte is actually a char. Let me fix it. – Jakub Dóka Jun 07 '21 at 14:28
  • May I ask which physical phenomenon you are replicating with the multiplication of two colors? Is only the A channel involved? – Bob__ Jun 07 '21 at 14:39
  • @Bob__ should I also add four consecutive calls of `multiplyBytes` assigning results to appropriate fields of a `Colour` to make it more clear? Or you can imagine that phenomenon yourself? – Jakub Dóka Jun 07 '21 at 14:50

2 Answers

I actually already successfully performed the multiplication but it is rather slow operation involving lot of floating point arithmetic for something so simple:

One step in improving FP math is to avoid mixing float and double types. The original code converts between them several times:

// double to float
const float invertedByte = 1/255.0;  

// uint8_t to float (twice)
// float to double
// double to uint8_t
return (uint8_t)(((a * invertedByte) * (b * invertedByte)) * 255.0);
//                ^---- float product ------------------^    double

A better FP solution would use

const float invertedByte = 1.0f/255.0f;  
// unsigned to float (once)
// float to uint8_t
return (uint8_t)( ((unsigned)a * (unsigned)b) * invertedByte);

Yet another option is an all-integer solution, similar to @Jakub Dóka's:

return (uint8_t) (((unsigned)a * (unsigned)b + 255u) >> 8);

Or as suggested by @Paul Hankin

return (uint8_t) (((unsigned)a * (unsigned)b + 127u) / 255u);

Also see @Ian Abbott's good idea:

return ((uint_fast32_t)a * (uint_fast32_t)b * 0x10101u + 0x800000u) >> 24;
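
To compare the three integer variants above, a small harness can wrap each one in a function and count the inputs on which two of them disagree. This is only a sketch: the names `mul_shift`, `mul_div255`, `mul_fixed`, `MulFn` and `count_mismatches` are made up for illustration.

```c
#include <stdint.h>

/* Shift variant: divides by 256 instead of 255 and biases the result up. */
static uint8_t mul_shift(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a * (unsigned)b + 255u) >> 8);
}

/* Round-to-nearest variant: adds roughly half the divisor, then divides by 255. */
static uint8_t mul_div255(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a * (unsigned)b + 127u) / 255u);
}

/* Fixed-point variant: 0x10101 / 2^24 approximates 1/255, and
   0x800000 is one half in the same 24-bit fixed-point scale. */
static uint8_t mul_fixed(uint8_t a, uint8_t b) {
    return (uint8_t)(((uint_fast32_t)a * (uint_fast32_t)b * 0x10101u + 0x800000u) >> 24);
}

typedef uint8_t (*MulFn)(uint8_t, uint8_t);

/* Counts the (a, b) pairs, out of all 65536, on which f and g differ. */
static unsigned long count_mismatches(MulFn f, MulFn g) {
    unsigned long n = 0;
    for (unsigned a = 0; a < 256; ++a) {
        for (unsigned b = 0; b < 256; ++b) {
            if (f((uint8_t)a, (uint8_t)b) != g((uint8_t)a, (uint8_t)b)) {
                ++n;
            }
        }
    }
    return n;
}
```

Calling `count_mismatches(mul_div255, mul_fixed)` should return 0, because 0x10101 * 255 equals 2^24 - 1 and the resulting error is too small to move any product across a rounding boundary. `count_mismatches(mul_shift, mul_div255)` is non-zero, since the shift variant maps 1 * 1 to 1 rather than 0.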
chux - Reinstate Monica
  • The integer solution rounds up, with for example 1/255 * 1/255 -> 1/255. Round nearest is `(a * b + 127u) / 255` – Paul Hankin Jun 07 '21 at 15:10
  • @PaulHankin I do see a trade-off being made in the exactness of a fast all integer solution as you point out and do prefer a round-to-nearest approach. Yet that sacrifice nets a `/256` vs. `/255`, a very likely speed improvement. – chux - Reinstate Monica Jun 07 '21 at 15:14
  • The division would probably get optimized to a fast multiplication and a shift though, so might not be so bad. – Ian Abbott Jun 07 '21 at 15:18
  • Yes, there's a trade-off, but I expect the round-nearest is often preferable because rounding up adds noise with positive mean. – Paul Hankin Jun 07 '21 at 15:18
  • This should give pretty good rounding to nearest using 32-bit unsigned arithmetic: `return ((uint_fast32_t)a * (uint_fast32_t)b * 0x10101u + 0x800000u) >> 24;`. Using a=127, b=254 as an example results in 0x7f008002u before the right shift (0x7f after the right shift), compared to `((127.0 * 254.0 / 255.0 + 0.5) * 0x1000000u)`, which is 0x7f008081u. – Ian Abbott Jun 07 '21 at 16:36

After some searching, I realized I can improve this to:

uint8_t multiplyBytes(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned short)a * (unsigned short)b + 255) >> 8);
}

Not sure if this is the best solution.

Jakub Dóka
  • Jakub, As `char` may be _signed_, `(unsigned)(unsigned char)a` may make more sense. – chux - Reinstate Monica Jun 07 '21 at 14:41
  • Ah, good old-school bit manipulation! I'd do it the same way, except that I'd perhaps use int instead of unsigned short for the cast before the multiplication, because on most platforms signed integers "may" be faster (due to matching register size and undefined overflow behavior), but that can also be controlled via compiler options... – Iiro Ullin Jun 07 '21 at 14:44
  • @chux-ReinstateMonica The question is about RGBA conversion: those values are always in 0...255 range – Iiro Ullin Jun 07 '21 at 14:51
  • @IiroUllin The question is a C one. The question had `char a` (and this answer originally did too) and that is often _signed_ thus not always in the 0..255 range. The answer here wisely changed to unsigned parameters which are 0-255. I'll restate my comment to reflect the update. – chux - Reinstate Monica Jun 07 '21 at 14:53
  • @IiroUllin On 32-bit int machines, (unsigned short)a becomes an int anyways prior to the multiplication. "because on most platforms signed integers "may" be faster" is doubtful. Most platforms in 2021 are small embedded processors. – chux - Reinstate Monica Jun 07 '21 at 14:56
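
The promotion point in the last comment can be checked at compile time with C11 `_Generic`. A minimal sketch, assuming a C11 compiler and a platform where `int` is wider than `unsigned short`; the macro and function names are made up for illustration:

```c
/* Evaluates to 1 when the product of two unsigned short values has
   type int (both operands were promoted before multiplying), else 0. */
#define USHORT_PRODUCT_IS_INT \
    _Generic((unsigned short)1 * (unsigned short)1, int: 1, default: 0)

/* Function wrapper so the result can be inspected at run time. */
static int ushort_product_is_int(void) {
    return USHORT_PRODUCT_IS_INT;
}
```

Where `int` can represent every `unsigned short` value, the cast to `unsigned short` therefore does not keep the multiplication in a 16-bit type; it is carried out in `int`.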