2

I encountered a surprising False Negative in our C++ Static Analysis tool.

We use Klocwork (Currently 2021.1),
and several colleages reported finding issues KW should have found.

I got example down to as simple as:

int theIndex = 40;

int main()
{
  int arr[10] = {0,1,2,3,4,5,6,7,8,9};
  return arr[theIndex];
}

Any amateur can see I am definitely accessing out of bound array member [40] of the array [0..9].
But KW does not report that clear defect!

TBH, I used CppCheck and SonarQube too, and those failed too!

Testing an more direct flow like:

int main()
{
  int theIndex = 40;
  int arr[10] = {0,1,2,3,4,5,6,7,8,9};
  return arr[theIndex];
}

does find the abundant issue.

My guess was that KW does not see main() as the entrypoint, therefore assume theIndex might be changed before it's called.

I also tired a version that 'might work' (if there is another task that synchronizes perfectly)

int theIndex;

int foo() {
  const int arr[10] = {0,1,2,3,4,5,6,7,8,9};
  return arr[theIndex];
}

int main()
{
  theIndex = 40;
  return foo();
}

Which CppCheck found as "bug free".

My Question is:

  • Am I mis-configuring the tools?
    • what should I do?
  • Should KW catch this issue or is it a limitation of SA tools?
  • Is there a good tool that is capable of catching such issues ?

Edit:

as @RichardCritten assume SA Tools realize other Compilation Units can change the value of theIndex therefore does not indicate the problem.
which holds true as declaring static int theIndex = 40 Does indicate the issue.

Now I wonder:
KW is fed with the full build-spec,
so theoretically, the tool could trace all branching of the software and track possible values of theIndex (might be a computational limitation).

  • Is there a way to instruct the tool to do so?
    • somewhat as a 'link' stage?
Tomer W
  • 3,395
  • 2
  • 29
  • 44
  • You should also consider tags for Coverity and Parasoft as well as other static analysis tools. – Thomas Matthews Feb 01 '23 at 22:10
  • 1
    In case 1 there is no guarantee that `theIndex` is still `40` when `main` is entered. `theIndex` is an externally visible symbol and another translation unit may change it's value when initialising another global object. – Richard Critten Feb 01 '23 at 22:10
  • I would suggest that this is a question for the tool vendor given that you will have paid for support. I am not familiar with that specific tool but does it have data flow analysis? – Clifford Feb 01 '23 at 22:12
  • @Clifford we already sent a support ticket, I got too curious maybe someone knows :) – Tomer W Feb 01 '23 at 22:13
  • @RichardCritten why you say theIndex is not 40 when entering main(), this is the entry-point of the program, so no other CU can interfere. but you are right, with `static int theIndex = 40` CppCheck` got it right. – Tomer W Feb 01 '23 at 22:17
  • @TomerW Even if any of those tools detected the error in your toy program, the problem is just that -- your program is a toy program. Imagine if `theIndex` was set with more complex code -- how would static analysis tools figure out that the value is out-of-bounds? – PaulMcKenzie Feb 01 '23 at 22:18
  • _"... why you say theIndex is not 40 when entering main()..."_ I did not say this. Globals variables are initialised before `main` is entered. There is nothing stopping the initialisation of a global in another translation unit changing the value of `theIndex` before `main` is entered. – Richard Critten Feb 01 '23 at 22:19
  • The various sanitizers are much better than these tools. Google for address sanitizer. Even a compiler cant always grok c++ code it's supposed to compile and crashes, just check out gcc bugzilla. – user1095108 Feb 01 '23 at 22:28
  • 1
    Unless the analysis tool instruments the code by building guard bytes around the array, and at runtime, checks to see if you've changed those bytes, I don't see a static analysis tool being able to detect this (easily). – PaulMcKenzie Feb 01 '23 at 22:30
  • 1
    For your last question: I think this only really works if you have no dynamic libraries at all and the tool can see _all_ of the translation units (including e.g. the standard library) or if you can tell the tool to make assumptions similar to e.g. those made by [`-fwhole-program`](https://gcc.gnu.org/onlinedocs/gcc-6.2.0/gccint/WHOPR.html). I don't know whether the tools allow that. Any shared library that will be linked can also (depending on visibility) change the value of the variable before `main` is entered. – user17732522 Feb 01 '23 at 22:56
  • seems to me as an odd design decision they made, as it is so abundantly clear and the False-Positive is way-way more unlikely to happen. anyway, I think I'd have to wait and hear from support, to be sure, for the time be this pretty much explains it. – Tomer W Feb 02 '23 at 06:40

1 Answers1

4

My guess was that KW does not see main() as the entrypoint, therefore assume theIndex might be changed before it's called.

theIndex can in fact be changed before main is entered. Every initializer of a global variable anywhere in the program can execute arbitrary code and access all global variables. So the tool would potential produce a lot of false positives if it assumed that all initial values of global variables remain unchanged until main is entered.

Of course this doesn't mean that the tool couldn't decide to warn anyway, risking false positives. I don't know whether the mentioned tools are configurable to do so.

If this is intended to be a constant mark it as constexpr. I then expect tools to recognize the issue.

If it is not supposed to be a constant, try to get rid of it. Global variables that aren't constants cause many issues. Because they are potentially modified by any call to a function whose body isn't known (and before entry to main or a thread), they are difficult to keep track of for humans, static analyzers and optimizers alike.

Giving the variable internal linkage may simplify the analysis, because the tool may be able to prove that nothing in the given translation unit could be accessed from another translation unit to set the value of the variable. If there was anything like that, then a global initializer in another unit may still modify it before main is entered. If that is not the case and there is also no global initializer in the variable's translation unit that modifies it, then the tool can be sure that the value remains unchanged before main.

With external linkage that doesn't work, because any translation unit can gain access to the variable simply by declaring it.

Technically I suppose a sufficiently sophisticated tool could do whole-program analysis to verify whether or not the global variable is modified before main. However, this is already problematic in theory if dynamic libraries are involved and I don't think that is a typical approach taken by static analyzers. (I could be wrong on this.)

user17732522
  • 53,019
  • 2
  • 56
  • 105
  • Thats seems like the angle here, `static int theIndex` doesn't cut it, didn't try constexpr. and in my coleage work, it wans't a global variable, but a getter from another CU. but same as the global, it Could change. – Tomer W Feb 01 '23 at 22:48