How to specify char literal conforming to MISRA C++?

Question

I'm organizing Klocwork rules and clearing out any issues found by static analysis. There are multiple rules applied and currently I'm having an issue with specifying character literals. Let's consider this example:

for (const char* p = str; *p != '\0'; ++p)

As you can see, this is a loop iterating over a C string. It's used with an unordered_map keyed by constexpr string literals. Performance measurements showed that storing the keys as std::string increases memory usage and hurts performance because of the overhead. As the map content is constant, I'm using a custom hash functor for C strings (again, to avoid converting and copying each string to std::string just to generate a hash). The simple answer would be std::string_view, but it's not available in this environment. So the problem comes from the condition itself. The condition should check whether the character is the terminating null.
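The setup described above could look roughly like this. It is only a sketch under stated assumptions: the post does not say which hash is used, so 32-bit FNV-1a is an arbitrary pick, and the names CStrHash, CStrEqual and CStrMap as well as the int mapped type are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>

// Hypothetical hash functor for null-terminated strings.
// Assumption: 32-bit FNV-1a; the original post does not name the hash.
struct CStrHash {
    std::size_t operator()(const char* str) const noexcept {
        std::uint32_t hash = 2166136261U;  // FNV-1a 32-bit offset basis
        for (const char* p = str; *p != '\0'; ++p) {
            // Go through unsigned char so the result does not depend on
            // whether plain char is signed on this platform.
            hash ^= static_cast<std::uint32_t>(static_cast<unsigned char>(*p));
            hash *= 16777619U;             // FNV-1a 32-bit prime
        }
        return static_cast<std::size_t>(hash);
    }
};

// Hypothetical equality functor: compares contents, not addresses.
struct CStrEqual {
    bool operator()(const char* lhs, const char* rhs) const noexcept {
        return std::strcmp(lhs, rhs) == 0;
    }
};

// int is a placeholder mapped type for illustration only.
using CStrMap = std::unordered_map<const char*, int, CStrHash, CStrEqual>;
```

The equality functor matters as much as the hash: with the default std::equal_to, two const char* keys compare by address, so a lookup through a different buffer holding the same text would miss.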

Obviously at first I used !p as it's guaranteed by standard that terminating null resolves to false (no matter what is the real type of char). It results in AUTOSAR C++14 (18-03) error MISRA.STMT.COND.NOT_BOOLEAN which translates to "Condition of if or loop statement has type 'char' instead of 'boolean'".

OK then, I changed that to the explicit comparison p != 0, which turned out to be a MISRA.CHAR.NOT_CHARACTER violation: "'char' is used for non-character value".

Again, that's a valid point, as I'm comparing char to int, and char is neither int nor unsigned int. Therefore I changed it to *p != '\0' which should directly translate to null character. This in turn gives MISRA.LITERAL.UNSIGNED.SUFFIX violation which is "Unsigned integer literal ''\0'' without the 'U' suffix". Now I'm surprised. Even if char is unsigned with one compiler, it is not guaranteed to be either signed or unsigned, so I can't hardcode any sign. Not to mention that there seems to be no way of specifying a suffix for character literals. In my opinion this is already a false positive, as '\0' IS of type char and should not require any further conversion or cast. The issue is even more visible with syntax like uri.find_last_of('/') (uri is a std::string), where I'm looking for a specific character, not a particular value. This case generates the same error, complaining that I did not specify a suffix.

My guess is that this is a false positive from a buggy filter implementation. It also seems the static analysis might be misconfigured, as character literals have an integer type (int) only in C, not in C++.

As a side note, using *p != char(0) in the first example resolved the issue, but that's far from the preferred solution. It can only be used with a known integer value of the character, which is far less flexible and more error-prone than using literals, so I'm not going to use this workaround.

What are your thoughts about this issue? Maybe someone else has already hit this Klocwork error and found a solution other than disabling the rule or suppressing it for every character literal. I already have my list of common false positives, which quite often come from C++11 and newer code being checked by rules based on MISRA C++:2008.

score 2 · Answer 1 · answered Apr 02 '20 at 12:23

Obviously at first I used !p as it's guaranteed by standard that terminating null resolves to false

Yes, but the MISRA rules go beyond the standard. Consider something like char* ptr = 0; if(ptr). It is easy to make a slip between if(ptr) and if(*ptr) and this is a common source for bugs. Whether the code is correct or bugged, the reader can't tell the programmer's intention from that line alone.

Similarly, what's the intention with ptr != 0? To check if the pointer is NULL or if the pointed-at data is zero, or if the data specifically is a null terminator at the end of a string?

Therefore MISRA enforces an explicit check. Code like if(ptr != NULL) or if(*ptr != '\0') is what MISRA recommends, and here the programmer's intention is perfectly clear.
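To make the two explicit checks concrete, here is a small sketch. The function name countChars is hypothetical, and nullptr stands in for NULL on the assumption that the code is C++11 or later (the question mentions AUTOSAR C++14).

```cpp
#include <cstddef>

// Hypothetical helper: counts the characters before the terminator and
// returns 0 for a null pointer.
std::size_t countChars(const char* str) {
    std::size_t count = 0U;
    if (str != nullptr) {                             // is the pointer itself null?
        for (const char* p = str; *p != '\0'; ++p) {  // is the pointed-at char the terminator?
            ++count;
        }
    }
    return count;
}
```

Each condition names what it compares against, so a reader can tell the pointer check from the terminator check without looking up the type of the operand.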

Your question has this problem all over it! You type *p at some places and p in some places. const char* p = str; ... p != '\0' is obviously a bug and if that's your actual code, then MISRA just saved you from it.

Therefore I changed it to p != '\0' which should directly translate to null character.

Indeed, this is MISRA compliant code. Again, assuming p is char and not char*.

This in turn gives MISRA.LITERAL.UNSIGNED.SUFFIX violation which is "Unsigned integer literal ''\0'' without the 'U' suffix".

That's nonsense. '\0' is a character constant and of type char in C++. Your tool must confuse this with regular integer constants (decimal, octal or hex) where U suffix is required if the intention is to use them in unsigned arithmetic.
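The distinction can be checked at compile time. This is a sketch using only standard C++11 facilities; the constant name kFlagMask is hypothetical and is there only to show where a U suffix does apply.

```cpp
#include <cstdint>
#include <type_traits>

// In C++, a character literal has type char; no integer suffix applies to it.
static_assert(std::is_same<decltype('\0'), char>::value, "'\\0' has type char");
static_assert(std::is_same<decltype('/'), char>::value, "'/' has type char");
static_assert(sizeof('\0') == 1U, "a char literal has the size of char in C++");

// Integer literals are a separate category: the U suffix selects an unsigned type.
static_assert(std::is_same<decltype(0), int>::value, "0 is int");
static_assert(std::is_same<decltype(0U), unsigned int>::value, "0U is unsigned int");

// Hypothetical example of unsigned (bitwise) arithmetic where the suffix matters.
constexpr std::uint32_t kFlagMask = (1U << 3U) | 1U;
static_assert(kFlagMask == 9U, "bits 0 and 3 are set");
```

In C the first three assertions would not hold, since a character constant there has type int, which is consistent with the suspicion that the checker applies integer-literal logic to character literals.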

Now MISRA frowns at octal escape sequences in general, but back in MISRA-C:2004 lots of people (yours sincerely included) pointed out to the committee that \0 must be made a valid exception. This was fixed and \0 was made valid in MISRA-C:2004 TC1 published in July 2007. I'm not sure if that fix made it into the original MISRA-C++:2008 or if there is a TC for MISRA-C++ as well.

At any rate, using '\0' for the null terminator is fine and MISRA compliant, as long as you only use it to compare against other char type operands.

Well, that was a great example of why the comparison should be explicit in such cases. Of course it was `*p != '\0'`, but I missed the asterisk when writing this post. I understand the MISRA and AUTOSAR rules (well, I don't always agree), but this question is probably more related to the Klocwork filter implementation, as that's what is giving false positives. Essentially, using char literals like `*p != '\0'` or `myString.find_last_of('/')` causes this `MISRA.LITERAL.UNSIGNED.SUFFIX` error. So basically I'm asking about this nonsense :) — Maciej Załucki, Apr 02 '20 at 19:36
@MaciejZałucki Putting a suffix to character constants obviously makes little sense. The MISRA rule is to put suffix on integer constants, such as `1U` instead of `1`, in cases where the purpose is to use unsigned arithmetic (bitwise operations in particular). — Lundin, Apr 03 '20 at 06:29
