
Background: Given some input bytes B0, B1, B2, B3 and B4, I want to extract selected bits from these 5 bytes and generate an output word.

For example, denoting the nth bit of Bi as Bi[n], I want to be able to write a mapping f : (B0, B1, B2, B3, B4) → B2[4] B3[5] B3[4] B3[3] B3[2] B3[1] B0[5] B0[3] B0[1]. So f(0b11001, 0b01100, 0b10101, 0b10011, 0b11111) would return 0b010011101.

An expression in C that might do this exact example would be

(B2 & 4 << 5) | (B3 << 3) | (B0 & 16 << 2) | (B0 & 4 << 1) | (B0 & 1)

using naive bitmasks and bitshifts.


Question: Is there any way to simplify such an expression to minimize the number of operations that need to be carried out?

For example, I note that B3 is copied in its entirety to some of the bits of the output, so I put it in place using B3 << 3 instead of masking and shifting individual bits. The first thing I thought of was Karnaugh maps, since they came in handy in simplifying Boolean expressions, but I realised that since I am extracting and placing individual bits in different parts of a byte, there is no simplification possible using Boolean algebra.


Reasoning: The reason why I want to do this is to be able to light the LEDs in a programmer-friendly manner on the BBC micro:bit. I want B0 to B4 to represent which LEDs are on in the physical 5x5 arrangement, but electronically these LEDs are wired in a complex 3x9 configuration. More information on the LEDs can be found here.

Typically a pattern would be stored in memory according to the electrical 3x9 arrangement so that it can be output to the LEDs in a single instruction, but I want to be able to map a 5x5 pattern to the 3x9 pattern programmatically. However, an expression as shown above would require 5 load instructions, 9 bitwise AND/OR operations and 4 logical shifts, which is at least 9 times less efficient than the normal method.

CH.
  • `as shown above would require` - the compiler might optimize it better than you do. Inspect the assembly code to be sure what the expression really does "require". Even so, are "5 load instructions, 9 bitwise operations and 4 logical shifts" _really_ _really_ that much work? The micro:bit seems to use at least a 16 MHz processor - that is going to take nanoseconds. – KamilCuk Feb 26 '20 at 20:54
  • @KamilCuk that's a good idea. I will try that now. – CH. Feb 26 '20 at 20:55
  • @KamilCuk regarding your edit: certainly yes, that is a good point. However, for the sake of learning I do wish to try minimizing the footprint of this particular operation, especially since it is called many times per millisecond, and in future this will be running alongside other tasks. (+1 for looking up the clock frequency to tell me about the practical impact!) – CH. Feb 26 '20 at 21:07
  • @KamilCuk the compiler has done some black magic and after trying to make sense of the assembly it seems that the compiler has optimised it to `(b0 & 73) | (b3 << 3) | (b2 & ~127)` (a lot better!). Thank you for the great suggestion. I still hope to learn a way / be directed to material to learn to simplify such expressions by hand, but at least I know I can depend on the compiler for such simplifications for now. – CH. Feb 26 '20 at 21:23
  • As for `be directed` there is [Bit Twiddling Hacks](https://graphics.stanford.edu/~seander/bithacks.html). – KamilCuk Feb 26 '20 at 22:20
  • @KamilCuk thank you for that. I will look into it. – CH. Feb 26 '20 at 23:52
  • B2[4] B3[5] B3[4] B3[3] B3[2] B3[1] B0[5] B0[3] B0[1] equates to ((B2 & 8) << 5) | ((B3 & 31) << 3) | ((B0 & 16) >> 2) | ((B0 & 4) >> 1) | (B0 & 1), not ((B2 & 4) << 5) | (B3 << 3) | ((B0 & 16) << 2) | ((B0 & 4) << 1) | (B0 & 1), right? Also, the & part needs to be done first (hence the extra parentheses), because << binds more tightly than & in C. – Andrew Feb 21 '22 at 15:49

1 Answer


First consider how much each bit needs to be shifted (rather than merely its final position). You can then perform the required shift with a single operation for every group of input bits where the shift amount is the same. For example, ((B3 & 31) << 3) moves all five bits of B3 at once. You might also be able to eliminate the "masking" (done with the bitwise AND, &) if the masked-out bits get shifted out anyway or are already known to be zero.
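As a rough sketch of grouping by shift amount, the example mapping from the question could be written as below. This assumes bits are numbered from 1 at the least significant end (which is what the question's sample input and output imply); the function name `map_row` is made up for illustration and is not from the micro:bit libraries.

```c
#include <stdint.h>

/* Hypothetical helper: builds the 9-bit output
 * B2[4] B3[5..1] B0[5] B0[3] B0[1], with bits numbered from 1 at the LSB.
 * Each term masks first, then shifts; the parentheses matter because
 * << binds more tightly than & in C. */
static uint16_t map_row(uint8_t b0, uint8_t b2, uint8_t b3)
{
    return (uint16_t)(((b2 & 0x08u) << 5)   /* B2[4]    -> output bit 9     */
                    | ((b3 & 0x1Fu) << 3)   /* B3[5..1] -> output bits 8..4 */
                    | ((b0 & 0x10u) >> 2)   /* B0[5]    -> output bit 3     */
                    | ((b0 & 0x04u) >> 1)   /* B0[3]    -> output bit 2     */
                    |  (b0 & 0x01u));       /* B0[1]    -> output bit 1     */
}
```

With the question's inputs (B0 = 0b11001, B2 = 0b10101, B3 = 0b10011), `map_row(0x19, 0x15, 0x13)` gives 0x9D, i.e. 0b010011101. If every input byte is guaranteed to use only its low 5 bits, the `& 0x1Fu` on `b3` is redundant and can be dropped.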

Andrew