
I am implementing an image processing filter within OTB (a C++ library).

I am seeing strange behaviour in some very basic testing. When I read my image as "char", the following code always outputs the original pixel value (in the 0-200 range), even when it is larger than 150. However, it works fine when I use "short int". Is there special behaviour with "char" (such as a letter comparison instead of a numerical one), or could there be another reason?

My image pixels are stored as bytes, so I would rather handle "char" than "int" because the image is quite large (> 10 GB).

    if (pixelvalue > 150)
    {
        out = 255;
    }
    else
    {
        out = pixelvalue;
    }

radouxju
  • `char` values range from -128 to 127, so comparing to 150 is like comparing to -106 – Garf365 Mar 03 '16 at 11:32
  • @Garf365: No, possible values are implementation-specific, and you can't count on negative values. For that, use `signed char`. Also, both sides are promoted to int, so the 150 stays unchanged. – MSalters Mar 03 '16 at 11:33
  • `should I investigate a possible bug in the library` Too self-assured. You should probably take a refresher on C++ programming: [The Definitive C++ Book Guide and List](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list) (before bothering library implementers) – Ivan Aksamentov - Drop Mar 03 '16 at 11:39
  • @Drop I must admit that I feel stupid for assuming that my char would be unsigned by default, but my output was unsigned, so it didn't ring a bell. And I did find a casting issue with this library in the past, so even a beginner can make useful findings for library implementers. – radouxju Mar 03 '16 at 11:53
  • @radouxju: `char` gets **promoted** to `int` in most expressions, and `int` has a minimum range of -32767, +32767. This is even the case in something as simple as `'a' + 0`. You may think +0 does nothing, but it triggers the promotion. – MSalters Mar 03 '16 at 11:59
  • @MSalters OK, so my original byte image is converted to signed char because I did not specify uint8_t or unsigned char. Then, contrary to most expressions, the char is not promoted when I use ">", and it is finally written as an unsigned char when I assign it to the image pixel (also declared as char). – radouxju Mar 03 '16 at 18:20
  • @radouxju: `>` will promote your `char`. However, it appears you had a negative `char` value to start with, which will be promoted to a negative `int` value. – MSalters Mar 03 '16 at 19:23
  • @MSalters OK, so 150 is interpreted as -106 by the reader, therefore my '>' does not provide what I need, but -106 is re-interpreted as 150 by the writer (because my tif file does not support signed char). – radouxju Mar 04 '16 at 06:22

2 Answers


unsigned char runs to (at least) 255, but char may be signed and limited to 127.

MSalters
  • I thought that my "char" was unsigned by default. But how come my code's output was in the 0-255 range if a signed char was used in the test? – radouxju Mar 03 '16 at 11:38
  • @radouxju: It's possible, but the quickest way to figure this out is to check `CHAR_MIN` and `CHAR_MAX`. – MSalters Mar 03 '16 at 11:40

The type char is either signed or unsigned (depending on what the compiler "likes" - and sometimes there are options to select signed or unsigned char types), and the guaranteed size is 8 bits minimum (could be 9, 16, 18, 32 or 36 bits, even if such machines are relatively rare).

Your value of 150 is higher than the largest value a signed 8-bit type can hold (127), which implies that the system you are using has a signed char.

If the purpose of the type is to be an integer value of a particular size, use [u]intN_t; in this case, since you want an unsigned value, uint8_t. That indicates much better that the data you are working on is not a "string" (even if, behind the scenes, the compiler translates it to unsigned char). It will also fail to compile if you ever encounter a machine where a char isn't 8 bits, which is a safeguard against having to debug weird problems.

Mats Petersson
  • unsigned char solved my problem, but thanks for the tip with uintN_t – radouxju Mar 03 '16 at 11:56
  • My point is that if it's not actually a character/string, then `uint8_t` is the type to use, since you are actually using it as a "small integer". Even if C and C++ don't make that much of a distinction and mixing the two will probably work just fine, using the "right name" for the type helps others understand the meaning. – Mats Petersson Mar 03 '16 at 11:59
  • If the requirement is for a type that's **exactly** 8 bits wide, then `uint8_t` can be appropriate. In most cases, though, that's an unnecessary limit on portability. `uint_least8_t` or `uint_fast8_t` would be a better choice, since, unlike `uint8_t`, they exist on all systems. – Pete Becker Mar 03 '16 at 16:12
  • @PeteBecker: And if you are reading a file that contains "bytes", such as pixels in a bitmap, you probably want the EXACT right size [and if the architecture is weird such that it can't deal with 8-bit values directly, you probably want to do something other than read every 4th one as 32-bit values...] – Mats Petersson Mar 03 '16 at 16:54