This is a fairly open-ended question that doesn't have a right answer. But as someone who has worked on several static analyzers, I'll bite and briefly explain what's hard about this example.
Problem 1: False positives
You and I know that argv[1] could be an arbitrarily long string. (In part, that knowledge comes from the context of a Stack Overflow question, where code is usually presented without any surrounding context or a priori assumptions.) But if a static analyzer reported every strcpy where the source string was not known (by the static analyzer!) to have a bounded length, it would in practice report a very large number of what most developers would consider false positives (FPs): incorrect reports that they don't want to see.
That's because there are a very large number of cases where strcpy is used correctly, but the reason it is correct is beyond the reasoning capacity of the analyzer, and might not even be present in the source code at all (e.g., "program X is only ever invoked by program Y, which never passes arguments longer than 80 characters"). In comparison, only a very small fraction of strcpy calls (with arguments the analyzer can't bound) are wrong. Let's be generous and say 10% of them are wrong. That's still a 90% FP rate if we report them all, far beyond what most developers will tolerate.
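To make that concrete, here is a minimal sketch of a strcpy that is correct but hard for a tool to prove correct. Everything in it (struct record, MAX_NAME_LEN, name_is_valid, record_set_name) is a hypothetical name I made up for illustration; none of it comes from the question.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 15  /* hypothetical limit, chosen for illustration */

struct record {
    char name[MAX_NAME_LEN + 1];  /* room for the name plus the NUL */
};

/* Validation lives here, possibly in a different file or module.
 * Returns 1 if s fits in record.name, including the terminating NUL. */
int name_is_valid(const char *s)
{
    return s != NULL && strlen(s) <= MAX_NAME_LEN;
}

/* Precondition (documented, not enforced here): name_is_valid(s) is true.
 * The strcpy is safe for every caller that honors the precondition, but
 * an analyzer looking at this function alone sees an unbounded source. */
void record_set_name(struct record *r, const char *s)
{
    strcpy(r->name, s);
}
```

If every caller checks name_is_valid first, a report on this strcpy is exactly the kind of FP described above. The analyzer would have to connect the check to the copy across function (and maybe file) boundaries to know that.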
When a tool reports too many FPs, most developers quickly stop using it, at which point it provides no value. Consequently, most static analyzers choose, at least by default, to limit what they report to cases where the tool is fairly confident.
Problem 2: Interprocedural analysis
Even for a tool that wants to report this (as you say Cppcheck does, when the strcpy is directly in main), the second problem is understanding what buffer_overflow does. It is impractical for a static analyzer to simply "inline" the contents of callees into callers arbitrarily deeply because the resulting AST would be enormous and the number of paths astronomical. Consequently, analyzers generally summarize callee behavior. The exact form of those summaries, and the algorithms that compute them, are both the subject of active academic research and closely guarded trade secrets.
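As a rough illustration of what "summarize callee behavior" can mean, here is a toy sketch. It is an assumption-laden simplification, not how any particular tool works: the types write_summary and arg_fact and the function call_overflows are hypothetical, and the only thing this summary can express is "the callee writes a constant number of bytes through one pointer parameter".

```c
#include <stddef.h>

/* Hypothetical, much-simplified callee summary: "this function writes
 * bytes_written bytes through pointer parameter number param_index".
 * Real analyzers use far richer representations. */
struct write_summary {
    int    param_index;    /* which pointer parameter is written through */
    size_t bytes_written;  /* constant number of bytes written */
};

/* What the analyzer knows about one argument at a particular call site. */
struct arg_fact {
    size_t capacity;       /* bytes available behind the pointer */
};

/* Apply the summary at a call site instead of re-analyzing the callee body.
 * Returns 1 if the call would write past the end of the argument's buffer. */
int call_overflows(const struct write_summary *s,
                   const struct arg_fact *args, int nargs)
{
    if (s->param_index < 0 || s->param_index >= nargs)
        return 0;  /* summary does not apply here; stay silent */
    return s->bytes_written > args[s->param_index].capacity;
}
```

The appeal is that the callee is analyzed once and each call site only pays for a cheap check. The cost is that anything the summary format can't express is lost to the caller.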
The behavior of strcpy is actually fairly complex compared to what a typical function summary can express. There is a size involved, but that size is derived by examining the contents of the data one of the pointers points at, specifically the location of the first NUL byte. Then that size affects what happens to the things pointed at by both arguments. That's a lot to encode in a summary in a general and scalable way, so most tools don't. As a result, the tool has only a very crude understanding of what buffer_overflow does, typically too crude to allow it to confidently report a defect here.
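One way to see the three pieces a strcpy summary would need is to write them out as an explicit, checked model. This is only a sketch for illustration: modeled_strcpy and its dst_capacity parameter are hypothetical (the real strcpy has no capacity argument, which is the whole problem).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical checked model of strcpy, spelling out what a summary
 * would have to capture. Returns 0 on success, or -1 (copying nothing)
 * if the copy would overflow dst. */
int modeled_strcpy(char *dst, size_t dst_capacity, const char *src)
{
    /* 1. The size is not a parameter: it is derived from the data that
     *    src points at, namely the position of the first NUL byte. */
    size_t n = strlen(src) + 1;

    /* 2. That data-dependent size must fit in whatever dst points at. */
    if (n > dst_capacity)
        return -1;

    /* 3. n bytes are read through src and written through dst. */
    memcpy(dst, src, n);
    return 0;
}
```

For example, with an 8-byte destination, a 7-character source succeeds (7 characters plus the NUL fill the buffer exactly) and an 8-character source is rejected. An analyzer has to reproduce step 1 symbolically, for a string whose contents it usually doesn't know, before steps 2 and 3 tell it anything.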