Bit shifts on buffer of arbitrary bit length

Question

I have been looking for functions doing bit shifts on a buffer of arbitrary bit length in C99. There are probably many libraries doing that but I would like to avoid adding dependencies just for that (but I would be OK to extract functions from a lib if that's possible without pulling too much code). Crucially that shall be able to work in-place and without damaging anything outside of the specified bit length.

To clarify any ambiguity, the question is how to implement the functions below:

/**
 * shift a buffer right with bit granularity (little endian)
 *
 * @param   dst         destination buffer, can be equal to src
 * @param   src         source buffer
 * @param   size        length in bits of src/dst
 * @param   shift       shift amount in bits
 * @param   fill        fill value (boolean) for the MSBs
 *
 * shift is allowed to be larger than size, it behaves like they are equal
*/
void bufshrbits(void* dst,const void* src,size_t size, size_t shift, bool fill);

/**
 * shift a buffer left with bit granularity (little endian)
 *
 * @param   dst         destination buffer, can be equal to src
 * @param   src         source buffer
 * @param   size        length in bits of src/dst
 * @param   shift       shift amount in bits
 * @param   fill        fill value (boolean) for the LSBs
 *
 * shift is allowed to be larger than size, it behaves like they are equal
*/
void bufshlbits(void* dst,const void* src,size_t size, size_t shift, bool fill);

The SO question Bitwise shifting array of char's has an incorrect title, it is actually asking for rotate.

"fill value (boolean) for the LSBs" for "shift a buffer right". --> I'd expect "for the MSBs" — chux - Reinstate Monica, Dec 22 '19 at 04:45
The `const` with the object serves no value in function prototype. It does make it less legible though. — chux - Reinstate Monica, Dec 22 '19 at 05:05
Curious, in C99, why `int fill` instead of `bool fill` for a "fill value (boolean)"? — chux - Reinstate Monica, Dec 22 '19 at 05:10
@chux-ReinstateMonica thanks for your remarks. for the const in the function prototype, I find it useful to implement the function directly (to avoid repeating the function declaration). could you point out an example or some literature which demonstrate the downside ? — acapola, Dec 22 '19 at 12:28
With the C spec, a declaration with a const parameter is an anti-pattern. Do you find reading `void bufshlbits(void *dst, const void *src, size_t size, size_t shift, bool fill);` of equal clarity with `void bufshlbits(void*const dst,const void*const src,size_t size, size_t shift, bool fill);`? — chux - Reinstate Monica, Dec 22 '19 at 19:27
when I read "void*const dst", it tells me that the whenever I read "dst" within the function I am sure to get the original value from the callee. That's why I find it useful but granted it does make the function declaration less readable. — acapola, Dec 22 '19 at 21:59
My comments were only about function declarations. (e. g. prototype) where `const` distracts - there is no "within the function" there. As part of the definition, use of `const` parameters should follow your group's coding standard. — chux - Reinstate Monica, Dec 22 '19 at 22:46

acapola · Accepted Answer · 2019-12-22T12:55:31.263

I came up with the following, it is tested here: https://wandbox.org/permlink/g3FDe88vKgdPsGLC It shall work on any CPU as it is using only byte accesses (not tested on big endian platform though).

#include <stdint.h>
#include <stdbool.h>

/**
 * shift a buffer left with byte granularity (little endian)
 *
 * @param   dst         destination buffer, can be equal to src
 * @param   src         source buffer
 * @param   byte_size   length in bytes of src/dst
 * @param   byte_shift  shift amount in bytes
 * @param   fill8       fill value for the LSBs
 *
 * byte_shift is allowed to be larger than byte_size, it behaves like they are equal
*/
static void bufshl(void*const dst,const void*const src,size_t byte_size, size_t byte_shift, uint8_t fill8){
    if(0==byte_size) return;
    if(byte_shift>byte_size) byte_shift=byte_size;
    uint8_t*const dst8=(uint8_t*)dst;
    const uint8_t*const src8=(const uint8_t*const)src;
    for(size_t i=byte_size-1;i!=byte_shift-1;i--){dst8[i] = src8[i-byte_shift];}
    for(size_t i=0;i<byte_shift;i++){dst8[i] = fill8;}
}
/**
 * shift a buffer left with bit granularity (little endian)
 *
 * @param   dst         destination buffer, can be equal to src
 * @param   src         source buffer
 * @param   size        length in bits of src/dst
 * @param   shift       shift amount in bits
 * @param   fill        fill value for the LSBs
 *
 * shift is allowed to be larger than size, it behaves like they are equal
*/
static void bufshlbits(void*const dst,const void*const src,size_t size, size_t shift, bool fill){
    if(0==size) return;
    const uint8_t fill8 = fill ? 0xFF : 0x00;
    if(shift>size) shift=size;
    uint8_t*const dst8=(uint8_t*)dst;
    const uint8_t*const src8=(const uint8_t*const)src;
    const size_t byte_size = (size+7)/8;
    const unsigned int byte_shift=shift%8;
    const unsigned int cshift = (8-byte_shift)%8;
    const uint8_t last = src8[byte_size-1];
    const size_t lsb = shift/8;
    if(0==(shift%8)){
        bufshl(dst,src,byte_size,lsb,fill8);
    } else {
        uint8_t carry=src8[byte_size-1-lsb];
        for(size_t i=byte_size-1;i!=lsb-1;i--){
            const size_t sidx = i-1-lsb;
            const uint8_t s = sidx>byte_size ?  fill8 : src8[sidx];
            const uint8_t d = (carry<<byte_shift)|(s >> cshift);
            carry = src8[sidx];
            dst8[i] = d;
        }
    }
    for(size_t i=0;i<lsb;i++){dst8[i]=fill8;}
    if(size%8){
        const uint8_t last_mask = 0xFF<<(size % 8);
        dst8[byte_size-1] &= ~last_mask;
        dst8[byte_size-1] |= last & last_mask;
    }
}
/**
 * shift a buffer right with byte granularity (little endian)
 *
 * @param   dst         destination buffer, can be equal to src
 * @param   src         source buffer
 * @param   byte_size   length in bytes of src/dst
 * @param   byte_shift  shift amount in bytes
 * @param   fill8       fill value for the MSBs
 *
 * byte_shift is allowed to be larger than byte_size, it behaves like they are equal
*/
static void bufshr(void*const dst,const void*const src,size_t byte_size, size_t byte_shift, uint8_t fill8){
    if(0==byte_size) return;
    if(byte_shift>byte_size) byte_shift=byte_size;
    uint8_t*const dst8=(uint8_t*)dst;
    const uint8_t*const src8=(const uint8_t*const)src;
    const size_t last=byte_size-byte_shift;
    for(size_t i=0;i!=last;i++){dst8[i] = src8[i+byte_shift];}
    for(size_t i=last;i<byte_shift;i++){dst8[i] = fill8;}
}
/**
 * shift a buffer right with bit granularity (little endian)
 *
 * @param   dst         destination buffer, can be equal to src
 * @param   src         source buffer
 * @param   size        length in bits of src/dst
 * @param   shift       shift amount in bits
 * @param   fill        fill value for the MSBs
 *
 * shift is allowed to be larger than size, it behaves like they are equal
*/
static void bufshrbits(void*const dst,const void*const src,size_t size, size_t shift, bool fill){
    if(0==size) return;
    const uint8_t fill8 = fill ? 0xFF : 0x00;
    if(shift>size) shift=size;
    uint8_t*const dst8=(uint8_t*)dst;
    const uint8_t*const src8=(const uint8_t*const)src;
    const size_t byte_size = (size+7)/8;
    const unsigned int bshift=shift%8;
    const unsigned int cshift = bshift ? (8-bshift)%8 : 8;
    const uint8_t last = src8[byte_size-1];
    const size_t lsb = shift/8;
    if((0==(shift%8)) && (0==(size%8))) {
        bufshr(dst,src,byte_size,lsb,fill8);
    } else {
        const uint8_t last_mask = size%8 ? 0xFF<<(size % 8) : 0;
        uint8_t carry = lsb+1 >=byte_size ? fill8 : src8[lsb+1];
        if(lsb+1 == byte_size-1) {
            carry &= ~last_mask;
            carry |= last_mask & fill8;
        }
        if(byte_size>lsb){
            for(size_t i=0;i<byte_size-lsb-1;i++){
                const size_t sidx = i+lsb;
                uint8_t s;
                if(sidx>=byte_size-1){
                    s=(src8[sidx] & ~last_mask) | (last_mask & fill8);
                }else{
                    s=src8[sidx];
                }
                const uint8_t d = (carry<<cshift)|(s >> bshift);
                carry = sidx+2 >=byte_size? fill8 : src8[sidx+2];
                if(sidx+2 == byte_size-1) {
                    carry &= ~last_mask;
                    carry |= last_mask & fill8;
                }
                dst8[i] = d;
            }
        }
        const size_t sidx = byte_size-lsb-1+lsb;
        carry &= ~last_mask;
        carry |= last_mask & fill8;
        uint8_t s;
        if(sidx>=byte_size-1){
            s=(src8[sidx] & ~last_mask) | (last_mask & fill8);
        }else{
            s=src8[sidx];
        }
        const uint8_t d = (carry<<cshift)|(s >> bshift);
        dst8[byte_size-lsb-1] = d;
    }
    for(size_t i=byte_size-lsb;i<byte_size;i++){dst8[i]=fill8;}
    if(size%8){
        const uint8_t last_mask = 0xFF<<(size % 8);
        dst8[byte_size-1] &= ~last_mask;
        dst8[byte_size-1] |= last & last_mask;
    }
}

These will be considerably slower than methods that shift 32 or 64 bits at a time. Of course that doesn't matter if the data are small or speed isn't very important. — Gene, Dec 22 '19 at 01:48
what a wonderful amount of code for such a simple operation. And your code is almost unreadable with no comment and spaces around operators — phuclv, Dec 22 '19 at 02:09
Also, function names starting with `mem` are reserved for future (library) use. — wildplasser, Dec 22 '19 at 12:08
@wildplasser thanks for this remark, will rename as bufshlbits then — acapola, Dec 22 '19 at 12:16
@Gene I was looking for correctness and reasonable performance, .i.e. avoid functions which set each bit individually (yeah some people do that!). Also using 32 bit does not help on my 16 bit platform, but you are right that in most case using bigger words would result in a speed up. Note that would certainly complicate the code further since this would mean taking care of alignment... — acapola, Dec 22 '19 at 12:37
Using `unsigned` for `bufshlbits()` rather than `uint8_t` adapts to the processors usually preferred integer size. — chux - Reinstate Monica, Dec 22 '19 at 16:37
@chux-ReinstateMonica changing uint8_t to unsigned will not work on platforms that don't support "unaligned" accesses (most embedded platforms I am aware of). — acapola, Dec 22 '19 at 18:00
@acapola Good point, code would need to handle various alignment cases to take advantage of native size. Sounds like a job for memcpy/memmove. — chux - Reinstate Monica, Dec 22 '19 at 19:23

phuclv · Answer 2 · 2019-12-22T12:31:23.990

You can combine the big and small-granularity shift in one step by adjusting the index. For example to do a 18-bit right shift you'll get the bits for the byte at index from the bytes at index - 2 and index - 3

// ...┆abcd...┆       ┆       ┆       ┆       ┆...      ← src
//     |  <- 18-bit ->  |
// ...┆       ┆       ┆..abcd.┆       ┆       ┆...      ← dst
// ...₀₁₂₃₄₅₆₇⁰¹²³⁴⁵⁶⁷₀₁₂₃₄₅₆₇⁰¹²³⁴⁵⁶⁷₀₁₂₃₄₅₆₇ ...      ← bit position

Here's an example code for right shift. For simplicity I choose byte as the big unit shift, but for good performance each unit should be a 32 or 64-bit word, or even better use SIMD to work on 128/256/512 bits at a time. That's why I use "limb" for the unit, which is GMP's term for each digit in the big integer so that it doesn't depend on the unit size and is easier for extension. Just change char to uint64_t or similar and make some small changes

// dst = src >> shift
void memshrbits(void* const dst, const void* const src,
                size_t size, size_t shift, int fill)
{
    char *d = (char*)dst, *s = (char*)src;

    const size_t limbWidth = CHAR_BIT * sizeof(*d); // number of bits in a limb
    assert(shift < limbWidth); // or just fill the whole dst with `fill`
                               // when shift >= limbWidth

    const size_t bitShift  = shift % limbWidth; // number of bits to shift in a limb
    const size_t limbShift = shift / limbWidth; // number of limbs to shift

    size_t srcLength = size / limbWidth;
    size_t numberOfRemainingBits = size % limbWidth;
    // backup the last bits in case `size` is odd
    size_t remainingBits = s[srcLength - 1] & ((1U << numberOfRemainingBits) - 1);

    fill = -fill;  // all ones or zero bits
    size_t i = srcLength - 1;
    // Filling dst from the right, because we're shifting right
    for (; i > limbShift; --i)
        d[i] = (s[i - limbShift    ] >> bitShift) |
               (s[i - limbShift - 1] << (limbWidth - bitShift));
    d[i--] = (s[0] >> bitShift) | (fill << (limbWidth - bitShift));
    // The remaining bits are blank spaces and will be filled with `fill`
    for (; i != (size_t)(-1); --i)
        d[i] = fill;

    // Restore the bits if shift is odd
    d[srcLength - 1] = (d[srcLength - 1] & (-1U << numberOfRemainingBits)) |
                       remainingBits;
}

If size is always a multiple of the unit (i.e. multiples of 8 in this case) it's even easier because you don't have to handle the odd bits at the end of src and you can remove the remainingBits parts

Do the same for left shift but iterate in the reverse direction

I recommend to avoid signed math artifacts and use `unsigned char` and unsigned masks like `1u << ...`. Also avoid length truncation `int i` --> `size_t i` — chux - Reinstate Monica, Dec 22 '19 at 04:50
@chux I used size_t first but then changed to int to avoid changing the for loop to `for (; i != size_t(-1); --i)` to make it easier to read — phuclv, Dec 22 '19 at 04:55
Note: `size ==0` leads to `d[SIZE_MAX] = ...`. Perhaps an initial `if(0==size) return;`? `s[0] >> bitShift` can be a arithmetic right shift replicating bits when a logical right shift is needed. — chux - Reinstate Monica, Dec 22 '19 at 05:22
Thanks for posting a solution however it does not pass the test (end up in seg fault), maybe I did not define CHAR_BIT correctly ? see https://wandbox.org/permlink/DhNWZVC7ncGQ921k — acapola, Dec 22 '19 at 12:14
@acapola `CHAR_BIT` is in limits.h. This is just a PoC implementation. I didn't test or debug it — phuclv, Dec 22 '19 at 12:18
@phuclv I included limits.h but I still get the segfault. sorry your answer is not helpful — acapola, Dec 22 '19 at 12:32
I give ideas and an almost complete implementation. I don't have time for debugging it for you[ — phuclv, Dec 22 '19 at 12:34

Bit shifts on buffer of arbitrary bit length

2 Answers2