0

Why I can easily detect this trivial case of out of bounds array accessing (marked as OK), using clang-tidy, but I can't detect the one marked as KO?

I understand pointer decay in C, but I would suppose that if I'm passing the array as a fixed size type, it could be considered inside the function.

clang-tidy -checks=clang-analyzer main.c -- -std=c99
/* main.c code snipet */

#define CAPACITY 3

typedef float f32_t;

typedef f32_t f32v_t [ CAPACITY ];

static void out_of_bounds_not_detected ( f32v_t v );

int main ( void )
{
  f32v_t vec01 = { 0.0f, 0.0f, 0.0f };
  f32_t out_of_bounds = vec01 [ CAPACITY ]; /* detected OK */

  out_of_bounds_not_detected ( vec01 );

  return (int) out_of_bounds;
}

static void out_of_bounds_not_detected ( f32v_t v )
{
  v [ CAPACITY ] = 0.0f; /* non-detected KO */
}

I would expect to clang-tidy emit a warning like this inside the function out_of_bounds_not_detected:

warning: array index 3 is past the end of the array
Toby Speight
  • 27,591
  • 48
  • 66
  • 103
Sam
  • 27
  • 1
  • 8
  • 2
    `typedef float f32_t;` oh, dear – Support Ukraine Jul 08 '23 at 10:41
  • 2
    `typedef f32_t f32v_t [ CAPACITY ];` oh, dear, oh dear – Support Ukraine Jul 08 '23 at 10:41
  • "I would expect to clang-tidy emit a warning...." Why do you expect that? What makes you think that clang-tidy can detect every possible error in C code? – Support Ukraine Jul 08 '23 at 10:44
  • Off-topic: @SupportUkraine do you see a problem with this type definitions? – Sam Jul 08 '23 at 15:44
  • @Sam Those `typedef` statements are beyond worthless - they hide standard types for no purpose. All they do is make the code harder to understand. – Andrew Henle Jul 08 '23 at 15:49
  • @AndrewHenle I agree they are an entry barrier. I shouldn't use them here. – Sam Jul 08 '23 at 15:56
  • @Sam You shouldn't use them at all. First, there is no guarantee in C that a `float` is 32 bits. Also, all identifiers that end in `_t` are reserved by POSIX, so there's a potential conflict there, too. Hiding an array behind a `typedef` is even worse - arrays are fundamentally different from scalars, and the `typedef` hides that. – Andrew Henle Jul 08 '23 at 16:07
  • Sam, I guess @AndrewHenle have already answered... but yes, I do see a problem with those defines. They make your code harder to read and understand. Don't make defines for basic types. – Support Ukraine Jul 09 '23 at 07:18

2 Answers2

1

As noted in the answer by alagner, the C language standard (e.g., C99 6.7.5.3p7) specifies that a parameter specified with array type is "adjusted" to a pointer type. However, the standard only requires that the argument supply "at least as many elements as specified by the size expression" (emphasis mine).

Consequently, the code:

void not_detected(int arr[3])
{
  arr[3] = 0;
}

does not necessarily violate the language requirements, since a caller could pass an array with four elements. That said, I think it would still be nice for a static analysis tool to report this case, since doing that deliberately would be very poor form at best.

A clear violation would be something like:

void what_about(void)
{
  int arr[2];
  not_detected(arr);   // same 'not_detected' as above
}

but clang-tidy also does not report this case.

The reason is that clang does not record the array parameter size information in its AST, so clang-tidy cannot see it. To observe this, we can dump the AST like so:

$ cat test.c
// test.c
// https://stackoverflow.com/questions/76642174/out-of-bounds-check-in-c-with-clang-tidy

void detected(void)
{
  int arr[3];
  arr[3] = 0;
}

void not_detected(int arr[3])
{
  arr[3] = 0;
}

void what_about(void)
{
  int arr[2];
  not_detected(arr);
}

// EOF

$ clang -Xclang -ast-dump -fsyntax-only test.c
test.c:7:3: warning: array index 3 is past the end of the array (that has type 'int[3]') [-Warray-bounds]
  arr[3] = 0;
  ^   ~
test.c:6:3: note: array 'arr' declared here
  int arr[3];
  ^
TranslationUnitDecl 0x555ab1e132b8 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x555ab1e13ae0 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
| `-BuiltinType 0x555ab1e13880 '__int128'
|-TypedefDecl 0x555ab1e13b50 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
| `-BuiltinType 0x555ab1e138a0 'unsigned __int128'
|-TypedefDecl 0x555ab1e13e58 <<invalid sloc>> <invalid sloc> implicit __NSConstantString 'struct __NSConstantString_tag'
| `-RecordType 0x555ab1e13c30 'struct __NSConstantString_tag'
|   `-Record 0x555ab1e13ba8 '__NSConstantString_tag'
|-TypedefDecl 0x555ab1e13ef0 <<invalid sloc>> <invalid sloc> implicit __builtin_ms_va_list 'char *'
| `-PointerType 0x555ab1e13eb0 'char *'
|   `-BuiltinType 0x555ab1e13360 'char'
|-TypedefDecl 0x555ab1e141e8 <<invalid sloc>> <invalid sloc> implicit __builtin_va_list 'struct __va_list_tag[1]'
| `-ConstantArrayType 0x555ab1e14190 'struct __va_list_tag[1]' 1 
|   `-RecordType 0x555ab1e13fd0 'struct __va_list_tag'
|     `-Record 0x555ab1e13f48 '__va_list_tag'
|-FunctionDecl 0x555ab1e6f668 <test.c:4:1, line:8:1> line:4:6 detected 'void (void)'
| `-CompoundStmt 0x555ab1e6f960 <line:5:1, line:8:1>
|   |-DeclStmt 0x555ab1e6f868 <line:6:3, col:13>
|   | `-VarDecl 0x555ab1e6f800 <col:3, col:12> col:7 used arr 'int[3]'
|   `-BinaryOperator 0x555ab1e6f940 <line:7:3, col:12> 'int' '='
|     |-ArraySubscriptExpr 0x555ab1e6f900 <col:3, col:8> 'int' lvalue
|     | |-ImplicitCastExpr 0x555ab1e6f8e8 <col:3> 'int *' <ArrayToPointerDecay>
|     | | `-DeclRefExpr 0x555ab1e6f880 <col:3> 'int[3]' lvalue Var 0x555ab1e6f800 'arr' 'int[3]'
|     | `-IntegerLiteral 0x555ab1e6f8a0 <col:7> 'int' 3
|     `-IntegerLiteral 0x555ab1e6f920 <col:12> 'int' 0
|-FunctionDecl 0x555ab1e6ff08 <line:10:1, line:13:1> line:10:6 used not_detected 'void (int *)'
| |-ParmVarDecl 0x555ab1e6fe10 <col:19, col:28> col:23 used arr 'int *':'int *'
| `-CompoundStmt 0x555ab1e70070 <line:11:1, line:13:1>
|   `-BinaryOperator 0x555ab1e70050 <line:12:3, col:12> 'int' '='
|     |-ArraySubscriptExpr 0x555ab1e70010 <col:3, col:8> 'int' lvalue
|     | |-ImplicitCastExpr 0x555ab1e6fff8 <col:3> 'int *':'int *' <LValueToRValue>
|     | | `-DeclRefExpr 0x555ab1e6ffb8 <col:3> 'int *':'int *' lvalue ParmVar 0x555ab1e6fe10 'arr' 'int *':'int *'
|     | `-IntegerLiteral 0x555ab1e6ffd8 <col:7> 'int' 3
|     `-IntegerLiteral 0x555ab1e70030 <col:12> 'int' 0
`-FunctionDecl 0x555ab1e70120 <line:15:1, line:19:1> line:15:6 what_about 'void (void)'
  `-CompoundStmt 0x555ab1e703e0 <line:16:1, line:19:1>
    |-DeclStmt 0x555ab1e702d8 <line:17:3, col:13>
    | `-VarDecl 0x555ab1e70270 <col:3, col:12> col:7 used arr 'int[2]'
    `-CallExpr 0x555ab1e703a0 <line:18:3, col:19> 'void'
      |-ImplicitCastExpr 0x555ab1e70388 <col:3> 'void (*)(int *)' <FunctionToPointerDecay>
      | `-DeclRefExpr 0x555ab1e702f0 <col:3> 'void (int *)' Function 0x555ab1e6ff08 'not_detected' 'void (int *)'
      `-ImplicitCastExpr 0x555ab1e703c8 <col:16> 'int *' <ArrayToPointerDecay>
        `-DeclRefExpr 0x555ab1e70310 <col:16> 'int[2]' lvalue Var 0x555ab1e70270 'arr' 'int[2]'
1 warning generated.

The key line of the dump is:

|-FunctionDecl 0x555ab1e6ff08 <line:10:1, line:13:1> line:10:6 used not_detected 'void (int *)'

which shows the recorded function type as void (int *). This is all clang-tidy can see, so it cannot report either case.

(Well, in this case it can see the definitions of what_about and not_detected in the same file, which together comprise a clear violation, but evidently its interprocedural analysis is insufficient to notice the problem that way, which is a little disappointing.)

Out of curiosity, and to confirm it isn't stored but omitted in the dump, I decided to look for where in clang this information is dropped. The parser itself, in Parser::ParseBracketDeclarator(), keeps the array and its size. But the semantic analyzer, in Sema::CheckParameter(), adjusts the type (changing the array to a pointer), retaining only the adjusted type in the resulting ParmVarDecl AST object:

  ParmVarDecl *New = ParmVarDecl::Create(Context, DC, StartLoc, NameLoc, Name,
                                         Context.getAdjustedParameterType(T),
                                         TSInfo, SC, nullptr);

The variable T holds the type as computed by the parser (an array, for the case of an array parameter), but getAdjustedParameterType() returns a pointer type, and the original T and its associated size are not retained anywhere.

Consequently, for clang-tidy to report the original case, some changes would be required to the clang parser.

Scott McPeak
  • 8,803
  • 2
  • 40
  • 79
0

From the standard:

A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to type’’, where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the keyword static also appears within the [ and ] of the array type derivation, then for each call to the function, the value of the corresponding actual argument shall provide access to the first element of an array with at least as many elements as specified by the size expression.

void f(int x[]); and void f(int* x); are totally the same from language standard's standpoint. Therefore the function taking array is just taking the pointer. Why would a linter care about pointer bounds, if technically there are none?

alagner
  • 3,448
  • 1
  • 13
  • 25
  • I understand your point, but rather I would expect a different behavior if I'm doing ```void f(int x[3])```. Is this effectively the same as ```void f(int x[])``` in C standard? – Sam Jul 08 '23 at 09:51
  • 2
    This answer seems to miss the point. A static analyser is not limited by type system laid out in the standard to detect errors like this. I’ve myself been involved in the creation of tools that can detect errors like this. – Cubic Jul 08 '23 at 10:05
  • This is effectively the same as `void f(int* x);`. Or to be clearer: `void f(int* x);` `void f(int x[3]);` and `void f(int x[]);` are the same. – alagner Jul 08 '23 at 10:05
  • @Cubic it's not limited, true. My assumption is that due to the C is treats pointer and array arguments such check would be likely to generate false positives. – alagner Jul 08 '23 at 10:08