4

Let's say I have some legacy code which was written using unsafe calls to C STL functions like strcpy. We all know that strcpy is unsafe because it leaves a program vulnerable to buffer overflow issues. Let's say I want to replace all calls to strcpy with calls to strncpy. A technique for replacing all calls to strcpy(dest, src) would involve calling strncpy with the parameters (dest, src, length of dest - 1) and then terminating dest with \0. I know the issue with this is that we don't always know the length of dest because it may be a pointer to memory allocated on the heap.

Let's assume I can figure out the length of dest at each of these call sites. I could replace all calls to strcpy with calls to strncpy which will guarantee that my program is immune to buffer overflow attacks (at least from improper use of strcpy). However, this approach may silently truncate data and alter program behavior in an undesirable way. Is this a better approach than detecting truncation and aborting the program? Or is it better to allow the truncation but also log it?

I'm asking from the perspective of someone who is interested in developing an automated method for patching legacy code. Does anyone have any thoughts on how to best approach this issue?

RouteMapper
  • 2,484
  • 1
  • 26
  • 45
  • strncpy() is a lot worse. Look at your CRT implementation for alternatives. Or just write your own. It is not a simple substitution, you do need to know the buffer size and that may well require altering function signatures so the buffer size is passed as an argument. – Hans Passant Nov 14 '13 at 17:42
  • I wouldn't be altering the source code to implement these changes. I would be modifying the program in an intermediate language which allows access to these buffer sizes. I would like the return type of these calls to remain the same, so the CRT alternatives will not suffice. – RouteMapper Nov 14 '13 at 17:45
  • I can't imagine that silently truncating data is ever a good thing. I agree with @HansPassant that implementing your own strncpy may be the proper path. You can then put (at minimum) logging when truncation occurs right inside that function. – Jud Nov 14 '13 at 17:45
  • I was thinking of logging rather than aborting because such behavior is equivalent to catching an `ArrayIndexOutOfBoundsException` in Java and logging the exception somewhere. Aborting a program seems a bit over-the-top. I was just wondering what other people think of the logging approach. – RouteMapper Nov 14 '13 at 17:47
  • If the program previously had undefined behavior due to a buffer overrun then aborting (with a suitable message) seems entirely reasonable to me. There is no particular reason to expect that code which was expecting the string to fit in the buffer, will behave correctly when the string is truncated to the buffer size. So in the absence of specific information that truncation is desired (which you don't have if you're replacing a call to `strcpy` without reviewing the context it occurs in), you should not truncate. – Steve Jessop Nov 14 '13 at 18:01
  • @SteveJessop - Would such a change to a program reasonably qualify as retrofitting a security feature onto an existing codebase? I think it does because it is preventing a program from performing in an insecure operation due to buffer overflow by making sure it doesn't perform at all. It follows that some human intervention could examine the aborting condition more closely and fix the root of the problem later. Just curious what your thoughts are :) – RouteMapper Nov 14 '13 at 18:09
  • @RouteMapper: my thoughts are that I would abort. It's better than replacing one undefined behaviour with another, which is what will happen if the dodgy program doesn't handle truncation. Basically, no "security feature" should put a program into a state that its author was not expecting and would never have encountered during development and testing (truncated data). Killing dodgy programs, on the other hand, sounds like a security feature to me :-) – Steve Jessop Nov 14 '13 at 18:11
  • @SteveJessop - Got it, but I'm wondering if I can make the argument that aborting still qualifies as a patch. Like I said before, this is a research topic dealing with retrofitting security features, so I need to make sure that everything I suggest qualifies as a security feature people may actually be interested in having. – RouteMapper Nov 14 '13 at 18:13
  • @RouteMapper: then I suppose it depends who it is that you have to persuade that this is an improvement over letting the code continue after a buffer overrun. I think it obviously is, but I don't really know what the qualifications are for something to "be a patch". But it's not so much that aborting is good behaviour, it's that letting an incorrect program run is worse :-) Of course, if you find a way to run the whole program in a sandbox that somehow ensures it cannot do anything harmful, even if data is unexpectedly truncated, then letting it run would be "safe". I'm assuming you haven't. – Steve Jessop Nov 14 '13 at 18:15
  • @SteveJessop - As this is just for a Master's thesis, I suppose it would suffice to just explain the pros/cons of logging vs. aborting. I don't have to make a religious devotion to either approach; my main point is showing that there are automated approaches for function replacement. I'm not selling this to anyone or guaranteeing it is a magic fix-all. It's the approach that I'm selling. – RouteMapper Nov 14 '13 at 18:17
  • 1
    @RouteMapper: OK, in that case maybe the way to present it is that you have an alternative implementation of `strcpy` that (a) is a correct implementation per the standard, hence aside from performance it does not affect any program that calls `strcpy` correctly (although for this to be true you must *not* just call `strncpy` because of the zero-fill behaviour of `strncpy`) (b) prevents buffer overrun, (c) replaces the buffer overrun UB with some other behaviour, which (whether it's to abort or to log) aids in diagnosing the problem. – Steve Jessop Nov 14 '13 at 18:20

2 Answers2

6

We all know that strcpy is unsafe because it leaves a program vulnerable to buffer overflow issues.

This is not a fault of strcpy in the slightest: it is up to the programmers to ensure that the string is going to fit into their buffer, for example, by calling strlen before copying, or ensuring that the string that comes in cannot possibly be longer than their buffer.

Let's say I want to replace all calls to strcpy with calls to strncpy

You shouldn't do that, unless you are working with fixed-size strings: remember that strncpy not only copies up to the terminating null, but also fills the rest of the string with null bytes. If you are looking for a "modern replacement" of strcpy, consider using strlcpy instead.

This approach may silently truncate data and alter program behavior in an undesirable way. Is this a better approach than detecting truncation and aborting the program? Or is it better to allow the truncation but also log it?

This is entirely up to you. It depends on the place in your design where such truncation would happen: if it happens in the code that sends authentication information to a web service, you would be better off stopping the process right then; if it happens in the code that writes a trace message into a log, it is probably OK to ignore the issue, or log it and go on. Sadly, you cannot decide it automatically, because a certain level of program understanding is required.

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • I understand `strcpy` does exactly what it promises and is not "at fault" for anything a programmer does. Like a loaded gun in the hand of an inexperienced operator, it is considered "unsafe." The reason I'm asking about this is because I'm researching ways to retrofit security features into existing codebases. I suppose from a research standpoint, I could just explain the pros/cons of logging vs. aborting. I'm not tied to choosing a particular approach. I just want to make sure I hear other opinions on the issue and deliver appropriate consideration to all sides. – RouteMapper Nov 14 '13 at 17:51
  • This statement "strncpy not only copies up to the terminating null, but also fills the rest of the string with null bytes " is wrong. From the C standard: "characters that follow a null character are not copied" – Vlad from Moscow Nov 14 '13 at 17:55
  • @RouteMapper Continuing with the gun analogy, you have at least two ways to fix the problem: train the operator, or take away the gun. – Sergey Kalinichenko Nov 14 '13 at 17:55
  • 1
    @VladfromMoscow Of course they are not copied! That's because null characters are written in their place ([proof link](http://pubs.opengroup.org/onlinepubs/7990989775/xsh/strncpy.html)). – Sergey Kalinichenko Nov 14 '13 at 17:56
  • 1
    @VladfromMoscow "If the end of the source C string (which is signaled by a null-character) is found before num characters have been copied, destination is padded with zeros until a total of num characters have been written to it." – RouteMapper Nov 14 '13 at 17:56
  • From a research perspective, it might then suffice to say that there are automated approaches available to secure programs against buffer overflows from improper use of `strcpy` and other such unsafe functions, but some program understanding is required in order to ensure that no undesired behavior is introduced within the program. Does this sound reasonable? – RouteMapper Nov 14 '13 at 18:04
  • @RouteMapper You cannot automate the "securing" part without automating the "understanding" part as well. That's the biggest problem. For example, if your automated code fixer is looking at an isolated function that takes a pointer to destination buffer, but does not take the size of the buffer as well, fixing this would require finding all call sites of the function in question, and adding code to pass the size of the buffer; this may require walking up the call chain until you see an allocation - not an easy task. – Sergey Kalinichenko Nov 14 '13 at 18:11
  • @dasblinkenlight - Good point! I have an approach for solving this problem already, though. That's why I said in my OP that we can assume I know the size of every buffer :) – RouteMapper Nov 14 '13 at 18:15
1

strncpy is not an STL function.:) C has no STL library. I do not think it is a good idea to replace all occurences of strcpy to strncpy. The approach should be different depending on a situation. In most cases is is enough to use strcpy.

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335