How to sign extend a varying sized number to 16 bits in C?

Question

The varying sized number could either be 10-bits or 12-bits or 15-bits. I want to sign extend the immediate bits so that the 16-bit integer has the same value as the 10, 12, or 15-bit value.

EDIT

I'm sorry I didn't post the code I wrote, which isn't working. Here it is:

switch (instr):
{
    case IMMI_10:
    int bits_to_cover = 16 - 10;
    signed short amount = (instr & 15);
    if (amount & (1 << bits_to_cover >> 1))
    {
        amount -= 1 << bits_to_cover;
    }
    break;


    case IMMI_12:
    int bits_to_cover = 16 - 12;
    signed short amount = (instr & 15);
    if (amount & (1 << bits_to_cover >> 1))
    {
        amount -= 1 << bits_to_cover;
    }
    break;


    case IMMI_15:
    int bits_to_cover = 16 - 15;
    signed short amount = (instr & 15);
    if (amount & (1 << bits_to_cover >> 1))
    {
        amount -= 1 << bits_to_cover;
    }
    break;
}

This is running on a special machine built for my CSE class in school. It is not running the x86 architecture. It is called the CSE 410 machine. Documentation: https://courses.cs.washington.edu/courses/cse410/18sp/410machine/isaManual.html

A code example (even pseudo code) of what you've tried will help us understand what on Earth you are trying to do. — DeiDei, Apr 14 '18 at 21:01
[What have you tried?](http://mattgemmell.com/2008/12/08/what-have-you-tried/) It helps us understand what you know if you show what you've tried. It also persuades us we aren't just doing your homework for you. — Jonathan Leffler, Apr 14 '18 at 21:01
I don't understand what exactly your question is. In an x64 architecture, you are limited to 8, 16, 32 or 64 bit values. You can't store 10-bit integers. — Daniele Cappuccio, Apr 14 '18 at 21:02
Storing 10-bit values is not the problem - the operations are. — zx485, Apr 14 '18 at 21:15
Are you numbering the bits starting from LSB = bit 0 or LSB = bit 1? I'm guessing 1, since otherwise sign extending the 15th bit is pointless — samgak, Apr 14 '18 at 22:16
What if you store the sign bit in the least significant bit? It would make mathematical operations harder, but it's the only way I can think to do such a thing. — Serinice, Apr 14 '18 at 21:15
https://graphics.stanford.edu/~seander/bithacks.html#VariableSignExtend — phuclv, May 05 '18 at 08:02

Jonathan Leffler · Answer 1 · 2018-05-05T06:31:52.113

I'm working on the basis that if, for example, you have a 10-bit number to be sign-extended to 16 bits, then you have two cases to consider:

zzzzzz1x xxxxxxxx
zzzzzz0x xxxxxxxx

The x's are "don't care — must copy" bits in the result. The leading z's are "don't care — will overwrite" bits. And the bit that switches between 0 and 1 in the two examples is the sign bit which must be copied over what are shown as leading z's. I'm also assuming that the bits are numbered from 0 (the least significant bit or LSB) to 15 (the most significant bit or MSB). If you need to number the bits from 1 to 16, then you have some adjustments to make.

Given a function signature:

uint16_t sign_extend(uint16_t value, uint16_t bits)

we can determine the sign bit with:

uint16_t sign = (1 << (bits - 1)) & value;

We can 0-extend a value with a positive (0) sign bit by and'ing the value with the bit mask:

00000001 11111111

We can 1-extend a value with a negative (1) sign bit by or'ing the value with the bit mask:

11111110 00000000

In the code below, I generate the second mask with:

uint16_t mask = ((~0U) >> (bits - 1)) << (bits - 1);

and use the bit-wise inversion to generate the other.
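As a quick check of the mask arithmetic, here is a minimal sketch. The helper name `extension_mask` is hypothetical and introduced only for illustration; it builds the 1-extension mask for a given field width, and its bit-wise inverse is the 0-extension mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns a mask with bits set from the
   sign bit (bit number bits - 1) up to bit 15. */
uint16_t extension_mask(unsigned bits)
{
    assert(bits > 0 && bits < 16);
    /* ~0U is unsigned, so both shifts are well defined; the
       conversion to uint16_t keeps only the low 16 bits. */
    return (uint16_t)(((~0U) >> (bits - 1)) << (bits - 1));
}
```

For field widths of 10, 12 and 15 bits this gives 0xFE00, 0xF800 and 0xC000 respectively, and inverting the 10-bit mask gives 0x01FF, matching the two bit patterns shown above.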

The code avoids making assumptions about what happens when you right-shift a negative value. (See the comment by samgak.) The C standard says the result is implementation-defined, and the usual choices are 'copy the MSB (sign) bit into the vacated bits' (arithmetic shift right) or 'set the vacated bits to zero' (logical shift right). Both are permitted, but a given compiler must document and use one or the other. This code works regardless of what the compiler does because it never right-shifts a signed quantity. (To make up for that, it assumes you can assign signed integers to the corresponding unsigned integer type, and vice versa, even if the signed value is negative. Formally, converting a negative value to an unsigned type is well defined (it wraps modulo 2^16 here), but converting an unsigned value above <signed-type>_MAX back to the signed type is implementation-defined. I've not heard of systems where this is a problem, whereas I've heard of systems where shifting is handled differently.)
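A minimal sketch of another approach that also never right-shifts a signed value is the XOR-and-subtract trick described on the bit-hacks page linked in the comments. The function name `sign_extend_xor` and the choice to return `int16_t` are assumptions made for illustration; this is not the function used in the test harness below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical variant: sign-extend the low `bits` bits of `value`
   using (x ^ m) - m, where m is the field's sign bit. */
int16_t sign_extend_xor(uint16_t value, unsigned bits)
{
    assert(bits > 0 && bits < 16);
    int32_t m = INT32_C(1) << (bits - 1);   /* sign bit of the field */
    int32_t x = value & ((m << 1) - 1);     /* keep only the low `bits` bits */
    /* The result lies in [-m, m), which always fits in int16_t here,
       so the final conversion does not change the value. */
    return (int16_t)((x ^ m) - m);
}
```

Because the arithmetic is done in `int32_t` and the result is already within the range of `int16_t`, this version avoids both the right shift of a negative value and the out-of-range unsigned-to-signed conversion.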

Putting it all together, here's the function I'd use, in a test harness:

#include <assert.h>
#include <stdint.h>

extern uint16_t sign_extend(uint16_t value, uint16_t bits);

uint16_t sign_extend(uint16_t value, uint16_t bits)
{
    assert(bits > 0 && bits < 16);
    uint16_t sign = (1 << (bits - 1)) & value;
    uint16_t mask = ((~0U) >> (bits - 1)) << (bits - 1);
    if (sign != 0)
        value |= mask;
    else
        value &= ~mask;
    return value;
}

#ifdef TEST

#include <stdio.h>

struct TestSignExtend
{
    uint16_t    value;
    uint16_t    bits;
    uint16_t    result;
};

static int test_sign_extend(const struct TestSignExtend *test)
{
    uint16_t result = sign_extend(test->value, test->bits);
    const char *pass = (result == test->result) ? "** PASS **" : "== FAIL ==";
    printf("%s: value = 0x%.4X, bits = %2d, wanted = 0x%.4X, actual = 0x%.4X\n",
            pass, test->value, test->bits, test->result, result);
    return(result == test->result);
}

int main(void)
{
    struct TestSignExtend tests[] =
    {
        { 0x0123, 10, 0x0123 },
        { 0x0323, 10, 0xFF23 },
        { 0x0323, 11, 0x0323 },
        { 0x0723, 11, 0xFF23 },
        { 0x0323, 12, 0x0323 },
        { 0x0C23, 12, 0xFC23 },
        { 0x0323, 13, 0x0323 },
        { 0x1723, 13, 0xF723 },
        { 0x1323, 14, 0x1323 },
        { 0x3723, 14, 0xF723 },
        { 0x0323, 15, 0x0323 },
        { 0xC723, 15, 0xC723 },
        { 0x0123,  9, 0xFF23 },
        { 0x0223,  9, 0x0023 },
        { 0x0129,  8, 0x0029 },
        { 0x03E9,  8, 0xFFE9 },
    };
    enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };
    int pass = 0;

    for (int i = 0; i < NUM_TESTS; i++)
        pass += test_sign_extend(&tests[i]);
    if (pass == NUM_TESTS)
        printf("PASS - All %d tests passed\n", NUM_TESTS);
    else
        printf("FAIL - %d tests failed out of %d run\n", NUM_TESTS - pass, NUM_TESTS);

    return(pass != NUM_TESTS);  /* Process logic is inverted! */
}

#endif /* TEST */

Sample output:

** PASS **: value = 0x0123, bits = 10, wanted = 0x0123, actual = 0x0123
** PASS **: value = 0x0323, bits = 10, wanted = 0xFF23, actual = 0xFF23
** PASS **: value = 0x0323, bits = 11, wanted = 0x0323, actual = 0x0323
** PASS **: value = 0x0723, bits = 11, wanted = 0xFF23, actual = 0xFF23
** PASS **: value = 0x0323, bits = 12, wanted = 0x0323, actual = 0x0323
** PASS **: value = 0x0C23, bits = 12, wanted = 0xFC23, actual = 0xFC23
** PASS **: value = 0x0323, bits = 13, wanted = 0x0323, actual = 0x0323
** PASS **: value = 0x1723, bits = 13, wanted = 0xF723, actual = 0xF723
** PASS **: value = 0x1323, bits = 14, wanted = 0x1323, actual = 0x1323
** PASS **: value = 0x3723, bits = 14, wanted = 0xF723, actual = 0xF723
** PASS **: value = 0x0323, bits = 15, wanted = 0x0323, actual = 0x0323
** PASS **: value = 0xC723, bits = 15, wanted = 0xC723, actual = 0xC723
** PASS **: value = 0x0123, bits =  9, wanted = 0xFF23, actual = 0xFF23
** PASS **: value = 0x0223, bits =  9, wanted = 0x0023, actual = 0x0023
** PASS **: value = 0x0129, bits =  8, wanted = 0x0029, actual = 0x0029
** PASS **: value = 0x03E9, bits =  8, wanted = 0xFFE9, actual = 0xFFE9
PASS - All 16 tests passed

I did make a deliberate error in one of the tests after everything was passing first go, just to ensure that failures would be spotted.

In your code, you might use it like this:

signed short value = …;

switch (instr)
{
case IMMI_10:
    value = sign_extend(value, 10);
    break;
case IMMI_12:
    value = sign_extend(value, 12);
    break;
case IMMI_15:
    value = sign_extend(value, 15);
    break;
default:
    assert(0 && "can't happen!");
    break;
}

If the case labels IMMI_10, IMMI_12 and IMMI_15 have the values 10, 12, 15, then you could avoid the switch and simply use an assignment:

signed short value = …;   // Get the value from somewhere
value = sign_extend(value, instr);

I think this answer would be better if you explained why you didn't do it the more obvious way of `value = (value << 6) >> 6;` (for a 10 bit integer). I'm guessing it's because you don't want to rely on undefined behaviour relating to right shifting signed integers, but perhaps that should be part of the answer? — samgak, Apr 15 '18 at 00:29
@samgak: I didn't do it that way because I didn't even think of it as an option. I avoided signed integers because it is implementation-defined whether the MSB or a zero is stored in the vacated positions on right shifting a negative value. Your 'shorthand' depends crucially on arithmetic right shift (copying the MSB into the vacated position) and not logical right shift (copying zero into the vacated position). I don't know what the compilers I use do in that scenario. Because it is implementation-defined, the compilers must document what they do. This code works regardless of what they do. — Jonathan Leffler, Apr 15 '18 at 01:02
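For comparison, the shift-based shorthand discussed in these comments can be sketched as follows. The names `sign_extend_shift` and `has_arithmetic_shift` are hypothetical. The sketch assumes the compiler performs an arithmetic right shift on negative values and wraps when converting an out-of-range value to `int16_t`; both are implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch only: depends on implementation-defined behaviour
   (signed conversion and right shift of a negative value). */
int16_t sign_extend_shift(uint16_t value, unsigned bits)
{
    assert(bits > 0 && bits <= 16);
    unsigned shift = 16 - bits;
    /* Shift left as unsigned so the field's sign bit lands in bit 15. */
    int16_t top = (int16_t)(uint16_t)((unsigned)value << shift);
    /* An arithmetic shift, if provided, copies bit 15 back down. */
    return (int16_t)(top >> shift);
}

/* Returns nonzero if this compiler shifts negative values arithmetically. */
int has_arithmetic_shift(void)
{
    return (-1 >> 1) == -1;
}
```

Checking `has_arithmetic_shift()` once at start-up (or in a unit test) is one way to confirm the assumption holds on a given compiler before relying on the shorthand.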
