3

I ran the Fortify Static Code Analyzer on the ossec-hids repo and it reported the following Buffer Overflow: Format String finding for src/analysisd/stats.c:415:

The format string argument to fscanf() at stats.c line 415 does not properly limit the amount of data the function can write, which allows the program to write outside the bounds of allocated memory. This behavior could corrupt data, crash the program, or lead to the execution of malicious code

The line of code in question is like this:

if (fscanf(fp, "%d", &_RWHour[i][j]) <= 0) {

_RWHour is declared as

static int _RWHour[7][25];

on line 33 of the same file. I believe there is no shadowing of _RWHour between this declaration on line 33 and its use on line 415, as when I select the declaration on line 33, my IDE (Visual Studio 2019) highlights _RWHour in 415.

When I look at the cppreference documentation for fscanf, it says the following:

d matches a decimal integer. The format of the number is the same as expected by strtol with the value 10 for the base argument

The table I took the quote above from also shows that when no length modifier is used for %d (as is the case for the fscanf call in question), the argument type should be signed int* or unsigned int*.

My question is this:

Is it possible for the Fortify finding to be a false positive in this instance? Or is it possible to write to memory outside an int when you pass the address of an int to fscanf?

If it is possible to write outside the memory of an int when using %d with fscanf, how can this be safely avoided?

chqrlie
  • 131,814
  • 10
  • 121
  • 189
Shane Bishop
  • 3,905
  • 4
  • 17
  • 47
  • 2
    It should be safe, provided the int pointer is valid. In your case, this means that `i` and `j` must be in the proper range. – Tom Karzes Jun 23 '23 at 18:17
  • 1
    The code can write outside the array if `i` and `j` are not properly limited. You could change to code to read the value into a plain `int`, and then store the value into the array. Just to see if that changes the error message. – user3386109 Jun 23 '23 at 18:20
  • 2
    There is no buffer overflow, but using `fscanf` to read integers has other possible (and unavoidable) UB when the value read cannot be represented by the object's type. – Ian Abbott Jun 23 '23 at 18:22
  • 2
    Also note that an identifier that starts with an underscore followed by an uppercase letter is technically illegal. Renaming the array will also confirm that the array is not being shadowed. – user3386109 Jun 23 '23 at 18:23
  • 2
    @Shane Bishop, Please try `if (fscanf(fp, "%4d", &_RWHour[i][j]) <= 0) {` and let us know if report still exists. – chux - Reinstate Monica Jun 23 '23 at 18:58
  • Can you make Fortify give you more information? Tools like pc-lint will tell you exactly why they think a buffer overflow can happen. – pmacfarlane Jun 23 '23 at 20:14

3 Answers3

5

Prevent conversion overflow

fscanf(fp, "%d", &_RWHour[i][j]) is undefined behavior (UB)*1 if the numeric text attempts to convert to a value outside the int range.

A quick fix to prevent UB is to limit the number of characters read with a width:

//fscanf(fp, "%d", &_RWHour[i][j])
fscanf(fp, "%4d", &_RWHour[i][j])  // Limit [-999 ... 9999].

A more robust solution reads in the text and uses strtol() to convert.

I recommend creating a helper function to handle reading in the int.

This level of checking kinda stinks, doesn't it?

i, j in range

Review of OP's referenced code looks OK, yet analysis tool may be whining about that.


*1

... or if the result of the conversion cannot be represented in the object, the behavior is undefined. C23dr § 7.23.6.2 10

With UB, it is "possible to write to memory outside an int".

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    Do you have a reference for your first sentence? This seems really weird if true, but nothing would surprise me. Also, if this is the reason the static analyser is complaining, it should complain about just about any use of `fscanf("%d"...)`. – pmacfarlane Jun 23 '23 at 19:08
  • 3
    @pmacfarlane Cite added. – chux - Reinstate Monica Jun 23 '23 at 19:11
  • @pmacfarlane What result when code uses `fscanf(fp, "%4d", &_RWHour[i][j])`? – chux - Reinstate Monica Jun 23 '23 at 19:12
  • I don't know, I've not tried any of it. I'm not the OP. – pmacfarlane Jun 23 '23 at 19:13
  • @pmacfarlane Fair point, non-OP. I'd expect a complaint for all `fscanf("%d", ...)` usages. – chux - Reinstate Monica Jun 23 '23 at 19:15
  • 1
    Wow, learn something new every day. I knew `fscanf()` was not great, but I didn't know it was _this_ broken. – pmacfarlane Jun 23 '23 at 19:29
  • 1
    If this is what Fortify is trying to convey, then its messaging is very poor here. I don't disagree about the potential for UB, but I'm more inclined to think that Fortify is just wrong than that this is the nature of its complaint. – John Bollinger Jun 23 '23 at 19:29
  • 4
    I would say there is 0% chance of corrupting memory as a *direct* consquence of the integer overflow, despite the "you'll get nasal demons Har har mantra". The OP suggests there is a way to prevent overflow of the array in the `fscanf` format string: but there isn't. As commented, does the OP get this error flagged from `int n; fscanf(fp, "%d", &n);`? The UB refers to the *value* being undefined, not about the compiler throwing tomatoes out of the pram. Otherwise it would be impossible to write code that survives user input error. – Weather Vane Jun 23 '23 at 19:34
  • @WeatherVane The repo OP linked also has this in it: `fscanf(fp, "%d", &_RHour[i])` and OP did not mention getting a message for that. While I think chux is probably right about possible UB here, I think Fortify has something else in mind. – pmacfarlane Jun 23 '23 at 19:37
  • @Shane Bishop, a practical question: what functionality do you desire if input is outside `int` range? – chux - Reinstate Monica Jun 23 '23 at 19:38
  • @chux if the input may be over the whole `int` range, then `fscanf(fp, "%9d", &_RWHour[i][j])` is preventing input at the top end, but `fscanf(fp, "%10d", &_RWHour[i][j])` would allow overflow. No-win. – Weather Vane Jun 23 '23 at 19:42
  • @WeatherVane True about 9/10, yet I started with 4 as `int` may be 16-bit. Another approach, read into a wider type like `fscanf(fp, "%18jd", &some_intmax)`, test result and then convert to `int`. – chux - Reinstate Monica Jun 23 '23 at 19:48
4

Is it possible for the Fortify finding to be a false positive?

Of course it's possible, in general, that Fortify reports a false positive. And @chux's answer about possible undefined behavior notwithstanding, I think this is indeed best characterized as a false positive. At minimum, it is an inaccurate diagnosis. Formally, a program that exercises undefined behavior can do anything, but in practice, overrunning the bounds of the specified int object is a highly unlikely outcome in this case, and diagnosing that discounts all the other possible results.

Supposing that the other arguments are correctly type matched with respect to the format, the main risk for the scanf family functions to overflow a target object is with a %s or %[ conversion specification without a width, or a %s, %[, or %c conversion specification with too large a width. I think Fortify is inappropriately generalizing that.

Or is it possible to write to memory outside an int when you pass the address of an int to fscanf?

Not without the involvement of undefined behavior, whether from the input being overlong or from some other source outside the fscanf() call.

If it is possible to write outside the memory of an int when using %d with fscanf, how can this be safely avoided?

Realistically, I don't think this is something to be concerned about.

Nevertheless, you can probably satisfy Fortify by adding an appropriate field width to the conversion specifier, though determining what "appropriate" means in this case could be tricky. This will also rule out any possibility of UB arising from the call, which is a worthy objective.

Alternatively, you can parse the number by some means other than fscanf().

John Bollinger
  • 160,171
  • 8
  • 81
  • 157
0

The code does not seem to warrant the problem reported by the tool.

Furthermore, there is another call to fscanf in the same file that should report the same problem except for the fact that the destination int is an entry in a 1D array instead of a 2D array:

381                if (fscanf(fp, "%d", &_RHour[i]) <= 0) {

But in both cases, it is rather obvious and should be inferred by the tool that the index variable stay within the proper boundaries.

You can try and change the loop in lines 404..426 with this code to investigate if the tool is to blame:

       for (j = 0; j <= 24; j++) {
            _CWHour[i][j] = 0;
            _RWHour[i][j] = 0;
            snprintf(_weekly, 128, "%s/%d/%d", STATWQUEUE, i, j);
            if (File_DateofChange(_weekly) >= 0) {
                FILE *fp = fopen(_weekly, "r");
                if (fp != NULL) {
                    int hour;
                    if (fscanf(fp, "%d", &hour) == 1 && hour >= 0) {
                        _RWHour[i][j] = hour;
                    }
                    fclose(fp);
                }
            }
        }
chqrlie
  • 131,814
  • 10
  • 121
  • 189