12

I need to combine two signed 8 Bit _int8 values to a signed short (16 Bit) value. It is important that the sign is not lost.

My code is:

 unsigned short lsb = -13;
 unsigned short msb = 1;
 short combined = (msb << 8 )| lsb;

The result I get is -13. However, I expect it to be 499.

For the following examples, I get the correct results with the same code:

msb = -1; lsb = -6; combined = -6;
msb = 1; lsb = 89; combined = 345; 
msb = -1; lsb = 13; combined = -243;

However, msb = 1; lsb = -84; combined = -84; where I would expect 428.

It seems that if the lsb is negative and the msb is positive, something goes wrong! What is wrong with my code? How does the computer get to these unexpected results (Win7, 64 Bit and VS2008 C++)?

Jonas Stein
  • 6,826
  • 7
  • 40
  • 72
Natalie
  • 445
  • 2
  • 5
  • 18
  • 1
    numbers are represented in http://en.wikipedia.org/wiki/Two%27s_complement . this is what happens when you OR 2 2's complement numbers. don't. – OSH Nov 24 '11 at 14:15
  • 3
    Why are your "signed 8-bit values" stored in an `unsigned short`? – Kerrek SB Nov 24 '11 at 14:21

7 Answers7

25

Your lsb in this case contains 0xfff3. When you OR it with 1 << 8 nothing changes because there is already a 1 in that bit position.

Try short combined = (msb << 8 ) | (lsb & 0xff);

Retired Ninja
  • 4,785
  • 3
  • 25
  • 35
8

Or using a union:

#include <iostream>

union Combine
{
    short target;
    char dest[ sizeof( short ) ];
};

int main()
{
    Combine cc;
    cc.dest[0] = -13, cc.dest[1] = 1;
    std::cout << cc.target << std::endl;
}
Ylisar
  • 4,293
  • 21
  • 27
2

Raisonanse C complier for STM8 (and, possibly, many other compilers) generates ugly code for classic C code when writing 16-bit variables into 8-bit hardware registers. Note - STM8 is big-endian, for little-endian CPUs code must be slightly modified. Read/Write byte order is important too.

So, standard C code piece:

 unsigned int ch1Sum;
...
     TIM5_CCR1H = ch1Sum >> 8; 
     TIM5_CCR1L = ch1Sum; 

Is being compiled to:

;TIM5_CCR1H = ch1Sum >> 8; 
         LDW   X,ch1Sum 
         CLR   A 
         RRWA  X,A 
         LD    A,XL 
         LD    TIM5_CCR1,A 
;TIM5_CCR1L = ch1Sum; 
         MOV   TIM5_CCR1+1,ch1Sum+1 

Too long, too slow.

My version:

     unsigned int ch1Sum;
...
     TIM5_CCR1H = ((u8*)&ch1Sum)[0];
     TIM5_CCR1L = ch1Sum;

That is compiled into adequate two MOVes

;TIM5_CCR1H = ((u8*)&ch1Sum)[0]; 
       MOV   TIM5_CCR1,ch1Sum 
;TIM5_CCR1L = ch1Sum;
       MOV   TIM5_CCR1+1,ch1Sum+1 

Opposite direction:

    unsigned int uSonicRange;
...
      ((unsigned char *)&uSonicRange)[0] = TIM1_CCR2H;
      ((unsigned char *)&uSonicRange)[1] = TIM1_CCR2L;

instead of

    unsigned int uSonicRange;
...
      uSonicRange = TIM1_CCR2H << 8;
      uSonicRange |= TIM1_CCR2L;
elmot
  • 71
  • 1
  • 4
2

Some things you should know about the datatypes (un)signed short and char:

char is an 8-bit value, thats what you where looking for for lsb and msb. short is 16 bits in length.

You should also not store signed values in unsigned ones execpt you know what you are doing.

You can take a look at the two's complement. It describes the representation of negative values (for integers, not for floating-point values) in C/C++ and many other programming languages.

There are multiple versions of making your own two's complement:

int a;
// setting a
a = -a;     // Clean version. Easier to understand and read. Use this one.
a = (~a)+1; // The arithmetical version. Does the same, but takes more steps.
// Don't use the last one unless you need it!
// It can be 'optimized away' by the compiler.

stdint.h (with inttypes.h) is more for the purpose of having exact lengths for your variable. If you really need a variable to have a specific byte-length you should use that (here you need it).

You should everythime use datatypes which fit your needs the best. Your code should therefore look like this:

signed char  lsb; // signed 8-bit value
signed char  msb; // signed 8-bit value
signed short combined = msb << 8  |  (lsb & 0xFF); // signed 16-bit value

or like this:

#include <stdint.h>
int8_t lsb; // signed 8-bit value
int8_t msb; // signed 8-bit value
int_16_t combined = msb << 8  |  (lsb & 0xFF); // signed 16-bit value

For the last one the compiler will use signed 8/16-bit values everytime regardless what length int has on your platform. Wikipedia got some nice explanation of the int8_t and int16_t datatypes (and all the other datatypes).

btw: cppreference.com is useful for looking up the ANSI C standards and other things that are worth to know about C/C++.

tDwtp
  • 480
  • 5
  • 12
2

It is possible that lsb is being automatically sign-extended to 16 bits. I notice you only have a problem when it is negative and msb is positive, and that is what you would expect to happen given the way you're using the or operator. Although, you're clearly doing something very strange here. What are you actually trying to do here?

Dan
  • 10,531
  • 2
  • 36
  • 55
  • Thanks Dan! The data in the file I read in are stored as 8Bit signed integers with always the LSB first followed by the MSB. To use these data I need to combine them into a 16 Bit signed short. I read in the file as char and convert to _int8. How to do this correctly? – Natalie Nov 24 '11 at 14:27
  • @Natalie: I'm afraid I don't think what you say makes sense. How do individual bytes of a multi-byte integer representation even *have* a notion of "sign"? Why don't you just read the two bytes into a signed short an be done with it? – Kerrek SB Nov 24 '11 at 14:30
  • Sorry, I have not invented this file format. I have to read in the whole file as char because the file contains also ASCII parts and not all data in the file have to be combined into shorts. I need first to separate the data to get those stored as 8 Bit signed int. – Natalie Nov 24 '11 at 15:07
1

You wrote, that you need to combine two 8-bit values. Why you're using unsigned short then? As Dan already said, lsb automatically extended to 16 bits. Try the following code:

uint8_t lsb = -13;
uint8_t msb = 1;
int16_t combined = (msb << 8) | lsb;

This gives you the expected result: 499.

maverik
  • 5,508
  • 3
  • 35
  • 55
  • I tried to use uint8_t but my compiler does not know this type and #include results in no such file or directory error. Maybe a VS2008 problem? Thanks! – Natalie Nov 24 '11 at 14:47
  • For VS use `unsigned __int8` and `__int16` types. – maverik Nov 24 '11 at 15:01
1

If this is what you want:

msb: 1, lsb: -13, combined: 499
msb: -6, lsb: -1, combined: -1281
msb: 1, lsb: 89, combined: 345
msb: -1, lsb: 13, combined: -243
msb: 1, lsb: -84, combined: 428

Use this:

short combine(unsigned char msb, unsigned char lsb) {
    return (msb<<8u)|lsb;
}

I don't understand why you would want msb -6 and lsb -1 to generate -6 though.

Jan
  • 1,807
  • 13
  • 26
  • You are right! Sorry, I mixed up msb and lsb for this example and corrected it in my post. Thanks! – Natalie Nov 24 '11 at 15:45