How to convert 2 bytes into a signed short in C

Question

I have 2 bytes that I need to convert to a signed short number. For example, I have separate bytes (0000 0001) and (0000 0002) in binary. How can I convert these to a signed short value?

What is the result you want to achieve with the two numbers 0000 0001 and 0000 0002? — Ignatius, Jun 22 '19 at 14:54
Given the tags you have applied, you at least know what an answer involves - in which case show us your attempt and explain how it does not work. The fact that you don't know the answer but know it requires bit/byte shifting suggests that this is a homework question. I have deleted my answer for the time-being - this is not a homework cheating site. The industry does not need any more graduates who cannot really code for themselves. Moreover it is almost certainly a duplicate. — Clifford, Jun 22 '19 at 16:17
@clifford: It's actually not a very good homework question because the naïve solution is incorrect; it involves casting (usually implicitly) an out-of-range value to a signed `short`. It's a bit depressing how much of what passes for "teaching C programming" actually transmits bad habits and misconceptions. I suspect that my answer, while I insist that it is correct, is not what the professor is looking for in this case. — rici, Jun 22 '19 at 19:28
@rici : It was probably not a good homework question made worse perhaps by the attempt to hide the fact that it was a homework question. — Clifford, Jun 22 '19 at 21:25
@clifford: SO does not require homework to be identified as such (see https://meta.stackoverflow.com/a/277881/1566221; the homework tag was removed a long time ago). I get annoyed by homework questions, too, particularly raw homework dumps, but I can also relate to the SO policy that a good question is a good question and a bad question should be improved or disappeared. Anyway, students who try to avoid thinking about their homework problems are really only harming themselves by not taking advantage of the best way to actually learn what they have set out to learn. — rici, Jun 22 '19 at 22:10

rici · Answer 1 · 2019-06-22T18:01:40.107

If the bytes are held in a signed datatype, such as signed char or int8_t, then it is pretty straightforward:

#include <stdint.h>  // for uint8_t

signed short combine_signed(signed char byte1, signed char byte2) {
  return byte1 * 256 + (uint8_t)byte2;
}

Multiplication is used here rather than a shift operation, on the expectation that the compiler will emit an appropriate shift instruction anyway. The C standard does not define the result of left-shifting a negative number, so a left shift cannot be used in portable code.
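
To see why the low byte goes through `(uint8_t)` before the addition, here is a minimal sketch that puts the technique from the function above next to a deliberately faulty variant. The names `combine_with_cast` and `combine_without_cast` are made up for this illustration.

```c
#include <stdint.h>

/* Technique from the answer: signed high byte times 256, plus the low byte
   reinterpreted as an unsigned value in 0..255. */
short combine_with_cast(signed char hi, signed char lo) {
    return (short)(hi * 256 + (uint8_t)lo);
}

/* Deliberately faulty variant: a negative low byte is promoted to a
   negative int, so it is subtracted from the high part instead of
   filling the low eight bits. */
short combine_without_cast(signed char hi, signed char lo) {
    return (short)(hi * 256 + lo);
}
```

With `hi = 1` and `lo = -1` (bit pattern 0xFF), the first function returns 511 (0x01FF), while the second returns 255.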

If the bytes are in an unsigned type or a type wider than 8 bits, then the simplest approach is to first convert the high-order byte to a signed value and then proceed as above. Converting to a signed value cannot be done portably with a simple cast, because converting an out-of-range value to a signed type gives an implementation-defined result (or raises an implementation-defined signal). So a portable program must explicitly test the high-order bit:

signed short combine(int byte1, int byte2) {
  // This code assumes that byte1 is in range, but allows for the possibility
  // that it was originally a signed char and so is already negative.
  if (byte1 >= 128) byte1 -= 256;
  return byte1 * 256 + (uint8_t)byte2;
}

(Both gcc and clang for x86, compiled with -O2 or better, manage to reduce that to a simple three-instruction sequence without multiply or conditional.)
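
When the bytes arrive as unsigned values, for instance from a `uint8_t` buffer, the same high-bit test applies. The following is a sketch only; `combine_unsigned` is a hypothetical name, and it assumes the platform provides `uint8_t`.

```c
#include <stdint.h>

short combine_unsigned(uint8_t hi, uint8_t lo) {
    int high = hi;                    /* 0..255 after promotion to int */
    if (high >= 128) high -= 256;     /* map 128..255 onto -128..-1 */
    return (short)(high * 256 + lo);  /* always within -32768..32767 */
}
```

Because the arithmetic result always fits in a `short`, the final conversion never involves an out-of-range value. For example, bytes 0x80 and 0x02 give -32766, and 0xFF and 0xFF give -1.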

A common scenario for this is, for example, an 8-bit micro with a 12-bit ADC with "left-justification" (i.e. the least-significant four bits of the LSB are always zero), such that you read two 8-bit values and interpret them as a 16-bit two's complement value. In such a scenario, you would read the "bytes" as two `uint8_t`, so the rule cited would not apply, and the implicit promotion would be safe. The answer is correct but has unlikely typing; the question lacks the necessary information for a likely answer. — Clifford, Jun 22 '19 at 21:20
@clifford: I provided answers for two typing scenarios. (`uint8_t` is semantically identical to `int` because of arithmetic promotions.) Surely one of those is likely, no? With `uint8_t` the shift would be safe but not the (implicit) cast to a signed return value. — rici, Jun 22 '19 at 21:30
@rici : Perhaps not how an embedded systems engineer would write it though - I appreciate this is not tagged "embedded" but it has that smell. I'm undeleting my answer to show my take on it. It covers all the issues you've raised, but in a manner I would suggest is more idiomatic - comments invited! — Clifford, Jun 22 '19 at 21:36

Clifford · Answer 2 · 2019-06-22T22:59:29.040

Given:

char msb = 0x01 ;
char lsb = 0x02 ;

Then:

short word = (msb << 8) | (lsb & 0xff) ;

will result in word having the value 0x0102 (or 258₁₀).

Since you asked for a signed short, however, that is not a very interesting example. For:

char msb = 0x80 ;
char lsb = 0x02 ;

word would have 0x8002, which for a 16 bit short would be -32766.

However on an implementation where short were longer than 16 bits (as is allowed), the result will be interpreted as +32770. It is far safer in this circumstance to use the fixed sized int16_t type defined in stdint.h to avoid any potential implementation dependency.

 int16_t word = (msb << 8) | (lsb & 0xff) ;
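
The effect of a wider destination type can be sketched by using `long` (at least 32 bits) as a stand-in for a hypothetical `short` that is wider than 16 bits. The function names here are illustrative and not part of the answer.

```c
#include <stdint.h>

/* No sign extension happens when the destination is wider than 16 bits:
   the 16-bit pattern 0x8002 is simply the positive number 32770. */
long combine_into_wider(uint8_t msb, uint8_t lsb) {
    return ((long)msb << 8) | lsb;
}

/* Explicitly reinterpret the low 16 bits as a two's complement value. */
long combine_sign_extended(uint8_t msb, uint8_t lsb) {
    long value = ((long)msb << 8) | lsb;    /* 0..65535 */
    if (value >= 32768L) value -= 65536L;   /* 32768..65535 -> -32768..-1 */
    return value;
}
```

For bytes 0x80 and 0x02, the first function returns 32770 and the second returns -32766, which is the gap that a fixed-width 16-bit type is meant to close.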

This can be simplified somewhat by using uint8_t instead of char which may be either signed or unsigned:

uint8_t msb = 0x80u ;
uint8_t lsb = 0xFFu ;
int16_t word = (msb << 8) | lsb ;

Will result in word = -32513, whereas if lsb and msb were char and char were signed in the implementation, then the result would be -1 due to implicit type promotion and sign extension of lsb.

This remains not strictly well defined because the right-hand expression promotes to int and can result in a value not representable as an int16_t, and in that case the result of the conversion is implementation-defined. That said, it would be an unusual implementation that did anything other than simply copy the bits verbatim, which is why it works, and the above is idiomatic.
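
One way to sidestep the implementation-defined conversion is to assemble the bits in an unsigned 16-bit value and then map that value onto the signed range arithmetically. This is a sketch of that idea rather than something the answer itself proposes; `u16_to_i16` and `combine_portable` are hypothetical names.

```c
#include <stdint.h>

int16_t u16_to_i16(uint16_t u) {
    if (u <= INT16_MAX) {
        return (int16_t)u;                 /* value already fits */
    }
    /* u is in 32768..65535, so u - 65536 is in -32768..-1 and the
       conversion to int16_t never sees an out-of-range value. */
    return (int16_t)((int32_t)u - 65536);
}

int16_t combine_portable(uint8_t msb, uint8_t lsb) {
    /* Shift in unsigned arithmetic so no signed overflow can occur. */
    uint16_t raw = (uint16_t)(((unsigned)msb << 8) | lsb);
    return u16_to_i16(raw);
}
```

With `msb = 0x80` and `lsb = 0xFF` this yields -32513, the same value as the idiomatic expression produces on common implementations.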

If short is explicitly required, to guarantee a correctly signed result regardless of the length of short, you can explicitly cast to int16_t and assign to a short (or even an int):

 short word = (int16_t)((msb << 8) | (lsb & 0xFF));

A solution is also possible using a union, but given the tags on this question, it seems unlikely that it is an acceptable solution in this case. It has the merit of avoiding any implementation defined behaviour and arcane type promotion and implicit conversion rules, but you do have to deal with endian-ness:

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  #define LSB 0
  #define MSB 1
#else
  #define LSB 1
  #define MSB 0
#endif

union
{
    int16_t word ;
    uint8_t byte[2] ;
} reinterpret ; 

reinterpret.byte[MSB] = 0x80u ;
reinterpret.byte[LSB] = 0xFFu ;

short word = reinterpret.word ;

https://onlinegdb.com/Byth1N3yr
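
For compilers that do not define `__BYTE_ORDER__`, the byte order can instead be probed at run time. The sketch below swaps that in for the preprocessor test; `combine_union` is a hypothetical name, and the code assumes the exact-width types `uint8_t`, `uint16_t` and `int16_t` exist on the platform.

```c
#include <stdint.h>

int16_t combine_union(uint8_t msb, uint8_t lsb) {
    /* Probe the byte order: on a little-endian machine the least
       significant byte of the value 1 is stored first. */
    union { uint16_t word; uint8_t byte[2]; } probe = { .word = 1 };
    int lsb_index = (probe.byte[0] == 1) ? 0 : 1;

    /* Write the two bytes into place and read them back as int16_t. */
    union { int16_t word; uint8_t byte[2]; } reinterpret;
    reinterpret.byte[lsb_index] = lsb;
    reinterpret.byte[1 - lsb_index] = msb;
    return reinterpret.word;
}
```

Since `int16_t` is required to be a two's complement type without padding bits, `combine_union(0x80, 0xFF)` gives -32513 regardless of the byte order detected.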

I appreciate that on most implementations, this will work flawlessly, and it probably is idiomatic. But I insist: `int16_t word = (msb << 8) | lsb ;` where `msb` contains a value greater than or equal to 0x80 is unspecified because the right-hand side value is not representable as an `int16_t`, leading to 6.3.1.3/3: "Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised." If your implementation defines the result, fine. GCC does, for example. But it's not fully portable. — rici, Jun 22 '19 at 21:46
In C99+ it is possible to do the union reinterpretation in a single expression: `(union reinterpret){.byte = {[LSB] = 0x80u, [MSB] = 0xFFu}}.word;` — Antti Haapala -- Слава Україні, Jun 23 '19 at 03:13
@AnttiHaapala In this case, reinterpret is a variable name not a union tag, but I see your point. It is not (yet) valid C++, which is what I generally use, so I would not have come up with that. — Clifford, Jun 23 '19 at 07:07

Sebastian · Answer 3 · Jun 22 '19 at 15:04

Assuming 0x01 is the MSB and 0x02 the LSB, then unsigned short foo = 0x01 << 8 | 0x02; would be enough. However, this would imply that unsigned short is at least 16-bit (it depends on the implementation, search for stdint.h for fixed-size)

"How can I convert these to a **signed** short value?" – Weather Vane Jun 22 '19 at 15:14
1) The question asks for signed short, not unsigned short. 2) The language _guarantees_ `short` to be _at least_ 16 bits, so it is not implementation dependent. For signed it would be a problem if `short` were longer than 16 bits. – Clifford Jun 22 '19 at 15:51
@Clifford difference between signed and unsigned is irrelevant. The only thing that matters is how you represent it. – Irelia Jun 22 '19 at 16:15
@Nina : How is it not relevant if that is what the question specifies? Moreover it is entirely relevant in any event. If you had `msb=0xFF` and `lsb = 0xFF` and did `unsigned short x = msb << 8 | lsb ;` then later assigned `x` to `int32_t y = x ;` then `y` would contain 65535 rather than -1 as it would contain if `x` were a signed short. Thinking that type agreement is irrelevant will cause some interesting bugs in your code. – Clifford Jun 22 '19 at 16:44
@Clifford I see. – Irelia Jun 22 '19 at 16:50
@Nina : Also the literal values have implicit type `int`, if the variables have `char` type and the implementation dependent type of `char` is signed, then `short x = msb << 8 | lsb ;` will fail if `lsb` > 127 due to implicit promotion to `int` and sign-extension. – Clifford Jun 22 '19 at 16:57
`short` is always at least 16 bits – CoffeeTableEspresso Jun 22 '19 at 20:10
@Clifford Thank you very much for both your comments and your answer. Now, as a newcomer here I am not sure if I should delete my answer since the others are better (and mine is misleading regarding size of short and the sign) or leave it here since the flaws are pointed out in the comments so it might be useful for some readers. – Sebastian Jun 22 '19 at 21:49
Well it seems to have avoided down-votes so far, so you might let it stand, but if it does attract down-votes, it will hit your rep if that is important to you. Deleting or not is your choice, but it would protect your rep. – Clifford Jun 22 '19 at 21:58
@Clifford I'll let it stand here. Wish you a great day, thanks again. – Sebastian Jun 22 '19 at 22:01
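
The signed-versus-unsigned difference raised in the comments above can be checked with a small sketch. The helper names are illustrative only.

```c
#include <stdint.h>

/* An unsigned short holding 0xFFFF keeps its value when widened: 65535. */
int32_t widen_unsigned(unsigned short x) {
    return x;
}

/* A signed short holding -1 is sign-extended when widened and stays -1. */
int32_t widen_signed(short x) {
    return x;
}
```

Calling `widen_unsigned(0xFFFF)` gives 65535, whereas `widen_signed(-1)` gives -1, even though on a 16-bit `short` both arguments have the same bit pattern.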
