
Is there a way I can measure how much stack memory a function uses?

This question isn't specific to recursive functions; however, I was interested in knowing how much stack memory a function called recursively would take.

I wanted to optimize the function for stack memory usage; however, without knowing which optimizations the compiler already makes, it's guesswork whether a change is a real improvement or not.

To be clear, this is not a question about how to optimize for better stack usage.

So is there some reliable way to find out how much stack memory a function uses in C?


Note: Assuming it's not using alloca or variable-length arrays, it should be possible to find this at compile time.

ideasman42
  • You can. You need to find the document that describes the ABI for the platform you are using and the language mappings for a given language type. After that you need to dig into your compiler's documentation and find implementation details on organizing stack frames and optimizing out automatic variables. After reading all that stuff you will simply generate assembly output and see how stack pointer is actually used, because otherwise it is tedious and inaccurate... – Valeri Atamaniouk Feb 12 '15 at 22:00
  • I haven't tried this, but one idea that comes to mind if you want to discover this dynamically, say for a recursive call hierarchy, is to call a function before the one you are interested in, which allocates a very large stack buffer and initializes it to a known pattern, like [0,1,2,3,4,5,6...,0,1,2,3,4,5...] and then call a companion function afterwards, which checks how much of the known pattern is still intact. This would not be accurate down to the byte, of course, but could give a ballpark idea about the stack usage. – 500 - Internal Server Error Feb 12 '15 at 22:02
  • *"simply generate assembly output and see how stack pointer is actually used"* If you're compiling with gcc, you can use the `-S` option to generate an assembly file from your .c file, which you can examine with any text editor. The other option is to use a debugger that shows you the assembly code. That way, you can step through the code and see how the stack pointer and base pointer are used. – user3386109 Feb 12 '15 at 22:05
  • Note, this is GCC specific, but I was thinking of selectively using `-Wframe-larger-than=###` to find the limit of the stack... The problem with this is that I want to apply it to a single function, and it looks like `#pragma GCC diagnostic` doesn't support `-Wframe-larger-than` – ideasman42 Feb 12 '15 at 22:05
  • How about using inline assembly to get the value of `%ebp` inside of your function, and inside of a function that is called by your function? – Aasmund Eldhuset Feb 12 '15 at 22:05
  • Why do you want to optimize stack usage? It's odd, since the C standard does not require a stack implementation at all. Even if there is one, how the stack is used depends entirely on the compiler and OS. – Jason Hu Feb 12 '15 at 22:06
  • You can call your function, registering a pointer to two local variables on the caller's and callee's stack, respectively, then subtracting their value (after conversion to integer in order to avoid UB). – The Paramagnetic Croissant Feb 12 '15 at 22:07
  • @HuStmpHrrr - while stack usage may depend on many factors, I would at least like to know if a change to my code causes significant differences in stack usage. Of course it's possible a different configuration would react differently to any change. – ideasman42 Feb 12 '15 at 22:10
  • @ideasman42 if you can get a result with `-Wframe-larger-than`, move your function into a separate module. – Valeri Atamaniouk Feb 12 '15 at 22:10
  • @Valeri Atamaniouk, while this can work, in practice it's quite a hassle and could be tripped up by inline functions in headers too. – ideasman42 Feb 12 '15 at 22:13
  • @ideasman42 So why is your goal to minimize stack usage? Assuming you are on Linux, check `ulimit -s` to show your maximum stack size; it will normally be <= 10MB, which is not big at all. However, `ulimit -d` shows the maximum heap size, which may be unlimited. It doesn't quite make sense to me to optimize stack usage, since even if you spend time on it, the payoff is too small to matter. – Jason Hu Feb 12 '15 at 22:15
  • @ideasman42 I misread the question and thought you needed it at build time. At run time you need to fill the stack with a pattern before calling the function, and check the pattern after. – Valeri Atamaniouk Feb 12 '15 at 22:16
  • @HuStmpHrrr, the system I am running is beside the point; writing C code for embedded systems, for example, is one reason you may want to use stack sparingly. – ideasman42 Feb 12 '15 at 22:18
  • @ideasman42 Oh, got it. That's a different story when you talk about embedded systems. Analyzing it is truly difficult. – Jason Hu Feb 12 '15 at 22:22
  • Well, you could calculate the size roughly. if you know how the stack is constructed, you can calculate the size of your local variables, the parameters and the frame pointer and return pointer. Now you've got that size and you multiply it by calls. It's only a rough estimate though as there might be paddings and saved registers you can't account for. – rfreytag Feb 12 '15 at 22:27
  • @pfannkuchen_gesicht - right, you can make a fairly accurate guess if you simply add up all the sizes (and account for alignment), but it's not so simple to know which variables might be optimized out. – ideasman42 Feb 12 '15 at 22:30
  • @ideasman42 You could look through your compiler manual for compiler-specific features. For example, if you use gcc, you can have it tell you the stack usage of each of your functions with the `-fstack-usage` flag. You'll have to calculate the usage of the call graph yourself though (for example, if the function is recursive, multiply it by the number of recursions). – nos Feb 12 '15 at 22:43
  • @nos, while GCC specific, this is the best answer so far. It's a bit awkward to use on a single file/function since it means building with different CFLAGS, but I can make some helper utility to run this on any file with the correct includes and defines. – ideasman42 Feb 12 '15 at 23:05

3 Answers


Using warnings

This is GCC specific (tested with gcc 4.9):

Add this above the function:

#pragma GCC diagnostic error "-Wframe-larger-than="

Which reports errors such as:

error: the frame size of 272 bytes is larger than 1 bytes [-Werror=frame-larger-than=]

While this is a slightly odd method, you can at least do it quickly while editing the file.
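If the pragma is not honored by your GCC version (a comment on the question reports that it was not), the same diagnostic can be requested from the command line for a whole file. The following is only a sketch: the file name `frame_demo.c` and the function `checksum_of_pattern` are made up for illustration.

```c
/* frame_demo.c (hypothetical file name)
 *
 * Compile with a deliberately tiny limit so that GCC reports the frame
 * size of every function that exceeds it:
 *
 *     gcc -c -Wframe-larger-than=1 frame_demo.c
 *
 * The diagnostic uses the same wording as the error quoted above, but it
 * is a warning unless -Werror is also given.
 */
#include <assert.h>
#include <stddef.h>

/* Fills a local buffer and returns a checksum of it, so the buffer is
 * actually used by the function. */
unsigned int checksum_of_pattern(size_t count)
{
    unsigned char buf[256];
    unsigned int sum = 0;
    size_t i;

    assert(count <= sizeof buf);
    for (i = 0; i < count; i++)
        buf[i] = (unsigned char)i;
    for (i = 0; i < count; i++)
        sum += buf[i];
    return sum;
}
```

The reported size depends on the target and the optimization level, so compare numbers only between builds that use the same flags.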

Using CFLAGS

You can add -fstack-usage to your CFLAGS, which then writes out text files alongside the object files. See: https://gcc.gnu.org/onlinedocs/gnat_ugn/Static-Stack-Usage-Analysis.html While this works very well, it may be a little inconvenient, depending on your build system/configuration, to build a single file with a different CFLAG, though this can of course be automated. (Thanks to @nos's comment.)
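As a rough illustration of the flag in use, here is a minimal sketch. The file name `stack_demo.c` and both functions are hypothetical; only the `-fstack-usage` flag itself comes from the answer.

```c
/* stack_demo.c (hypothetical file name)
 *
 * Build the object file with stack-usage reporting enabled:
 *
 *     gcc -c -fstack-usage stack_demo.c
 *
 * GCC then writes stack_demo.su next to stack_demo.o, with one line per
 * function: its location and name, its frame size in bytes, and a
 * qualifier (static, dynamic, or bounded). The sizes depend on the target
 * and the optimization level, so none are shown here.
 */
#include <assert.h>
#include <string.h>

/* A recursive function: unless the compiler optimizes the recursion
 * away, total stack use is roughly the reported frame size multiplied
 * by the recursion depth. */
unsigned long sum_to(unsigned int n)
{
    if (n == 0)
        return 0;
    return n + sum_to(n - 1);
}

/* A function with a sizeable local buffer, so its frame stands out in
 * the .su report. */
size_t padded_length(const char *s)
{
    char buf[256];

    assert(s != NULL);
    strncpy(buf, s, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';
    return strlen(buf);
}
```

The report covers each function on its own; as noted in the comments on the question, the usage of a whole call graph still has to be added up by hand.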


Note:

It seems most or all of the compiler-neutral methods rely on guessing, which isn't guaranteed to remain accurate after optimizations, so this at least gives a definitive answer using a free compiler.

ideasman42
  • I tried to use the -fstack-usage flag but I get a compiler error. Can you provide an example of how to use this flag? – Prabhakaran M Sep 09 '20 at 06:16
  • @Karan2020 please post a link to your reference – vlad_tepesch Sep 16 '20 at 08:03
  • @vlad_tepesch Reference link https://gcc.gnu.org/onlinedocs/gnat_ugn/Static-Stack-Usage-Analysis.html is already posted in the answer. I have passed the option to GCC compiler. For Example: gcc -c file_name.c -fstack-usage . – Prabhakaran M Sep 18 '20 at 15:22

You can very easily find out how much stack space is taken by a call to a function which has just one word of local variables in the following way:

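#include <stdint.h>

static unsigned char *p1;
static unsigned char *p2;

void f2(void);  /* forward declaration: f1 calls f2 before it is defined */

void f1(void)
{
    unsigned char b;
    p1 = &b;    /* address of a local in the caller's frame */
    f2();
}

void f2(void)
{
    unsigned char b;
    p2 = &b;    /* address of a local in the callee's frame */
}

void calculate(void)
{
    f1();
    /* Convert to integers first: subtracting pointers to unrelated
       objects is undefined behavior. This assumes the stack grows
       downward, so p1 holds the higher address. */
    int stack_space_used = (int)((intptr_t)p1 - (intptr_t)p2);
    (void)stack_space_used;
}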

(Note: the function declares a local variable which is only a byte, but the compiler will generally allocate an entire machine word for it on the stack.)

So, this will tell you how much stack space is taken by a function call. The more local variables you add to a function, the more stack space it will take. Variables defined in different scopes within the function usually don't complicate things, as the compiler will generally allocate a distinct area on the stack for every local variable without any attempt to optimize based on the fact that some of these variables might never coexist.
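The same idea can be extended to a recursive call chain by sampling the address of a local at every level and keeping the one farthest from a base address. This is only a sketch under stated assumptions: all names are hypothetical, `noinline` is a GCC/Clang attribute, and the result is approximate because the compiler is free to lay out frames as it likes.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

static uintptr_t stack_base;     /* address of a local in the measuring function */
static uintptr_t stack_extreme;  /* sampled address farthest from stack_base */

/* Absolute difference, so the stack growth direction does not matter. */
static uintptr_t distance(uintptr_t a, uintptr_t b)
{
    return a > b ? a - b : b - a;
}

/* Call this wherever the current stack depth should be sampled. */
NOINLINE void probe_stack(void)
{
    volatile unsigned char marker = 0;
    uintptr_t here = (uintptr_t)&marker;

    if (distance(here, stack_base) > distance(stack_extreme, stack_base))
        stack_extreme = here;
}

/* Example recursive function (hypothetical): probes at every level. */
NOINLINE unsigned int count_down(unsigned int depth)
{
    volatile unsigned int keep = depth;  /* a live local in every frame */

    probe_stack();
    if (depth > 0)
        count_down(depth - 1);
    return keep;  /* read after the call, so the call is not a tail call */
}

/* Returns the approximate number of stack bytes used by count_down(depth). */
unsigned long measure_count_down(unsigned int depth)
{
    volatile unsigned char base = 0;
    unsigned int result;

    stack_base = (uintptr_t)&base;
    stack_extreme = stack_base;
    result = count_down(depth);
    assert(result == depth);
    (void)result;
    return (unsigned long)distance(stack_extreme, stack_base);
}
```

Comparing `measure_count_down(0)` with `measure_count_down(50)` gives a rough per-level cost. The figure includes the small frame of `probe_stack` itself and will change with compiler flags.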

Mike Nakis
  • I was considering doing something like this, but your example is a bit simplistic, in that the function may have loops, multiple vars defined in different branches, calls to inline functions... It's not always as simple as adding a single variable at the end of a block and getting its address. Also, it's possible the compiler re-orders variables - http://stackoverflow.com/questions/238441/can-a-c-compiler-rearrange-stack-variables – ideasman42 Feb 12 '15 at 22:41
  • @ideasman42 I amended my answer. I insist that this mechanism is a very good approach, mainly due to its simplicity. – Mike Nakis Feb 12 '15 at 22:48
  • If you have 3+ code branches each with their own nested variables, I don't see how this can work well. Said differently, AFAICS this only works well for functions which define all their variables in one block. – ideasman42 Feb 12 '15 at 22:58
  • No, I repeat, most compilers don't care whether you define them all in one block, or each in its own block. Try it. – Mike Nakis Feb 12 '15 at 22:59
  • @ddriver branches are completely irrelevant. Most compilers will allocate stack space for locals as if they were all declared in the root scope of the function. Don't believe me? Try it. I posted the code. It is so simple. Try it. – Mike Nakis Feb 12 '15 at 23:02
  • @ideasman42 Yeah, if you have nested branches and variables etc., all of that crap has to have space created for it in the preamble. There are things that could make this messy, but inlined functions aren't one of them; that space would also be included... – Grady Player Feb 12 '15 at 23:15
  • It will not be a dynamic number, including the entire call stack, just this function's usage... – Grady Player Feb 12 '15 at 23:17
  • @GradyPlayer If you want a "dynamic number" you have to use the principle I showed in my answer to build something more complex which will be calculating a dynamic number. – Mike Nakis Feb 12 '15 at 23:19
  • It is really simple: think of `f1()` as being your `main()`, and of `f2()` as being a function which you invoke whenever you need an answer to the question "how much stack space have we consumed yet?" – Mike Nakis Feb 12 '15 at 23:21
  • In addition to other problems with this method, already mentioned, this method may have issues if compiler optimizations are turned on, which is common with release builds and will affect stack usage. The `volatile` keyword should be used when declaring `byte b`, so that the compiler won't optimize these variables away, which could result in getting a register address, or some other unusable value. – Jim Fell Feb 12 '15 at 23:50
  • @JimFell No, wrong. The compiler cannot optimize away `b` if you ever take the address of `b`. It might still optimize the value of `b`, but not `b` itself. And we do not care about the value of `b`. – Mike Nakis Feb 12 '15 at 23:51
  • @MikeNakis Not all compilers optimize equally. – Jim Fell Feb 12 '15 at 23:54
  • @JimFell correct, but they tend to not break our programs with their optimizations. Most of the time. – Mike Nakis Feb 12 '15 at 23:57
  • @MikeNakis **Most** of the time is not **all** of the time. – Jim Fell Feb 13 '15 at 00:07
  • @JimFell "don't do it this way because the compiler might have a bug" is not a valid argument. – Mike Nakis Feb 13 '15 at 00:08
  • @MikeNakis Validity is in the eye of the beholder...as is elegance. – Jim Fell Feb 13 '15 at 00:10
  • @Mike Nakis, but what makes you think that `f1` contains a `p1=&b` assignment instruction? Suppose `f1` is an arbitrary function which we are not allowed to modify. How to measure the amount of stack `f1` will use if called? – mercury0114 Nov 17 '20 at 22:05
  • @mercury0114 This is an entirely different question. The OP says he wants to measure the stack usage of his function so as to optimize it, so he clearly has control over his function. – Mike Nakis Nov 18 '20 at 08:29

To calculate the stack usage for the current function you can do something like this:

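#include <stdint.h>

void MyFunc( void );

void *pFnBottom;   /* stack pointer in the caller, just before the call */
void *pFnTop;      /* stack pointer inside MyFunc */
unsigned int uiStackUsage;

void MyFunc( void )
{
    /* GCC extended asm, 32-bit x86 only */
    __asm__ volatile ( "movl %%esp, %0" : "=r" (pFnTop) );
    /* the x86 stack grows downward, so the caller's value is the larger one */
    uiStackUsage = (unsigned int)((uintptr_t)pFnBottom - (uintptr_t)pFnTop);
}

void CallMyFunc( void )   /* hypothetical caller */
{
    /* a function's code address is not a stack address, so record esp here */
    __asm__ volatile ( "movl %%esp, %0" : "=r" (pFnBottom) );
    MyFunc();
}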
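Inline assembly ties the code to one architecture. As a hedged alternative that is not part of this answer, GCC and Clang provide the `__builtin_frame_address(0)` builtin, which returns the current function's frame address. The sketch below follows the suggestion in the comments on the question: take one reading inside the function of interest and another inside a function it calls. All function and variable names are hypothetical, and what exactly lies between the two addresses depends on the ABI.

```c
#include <assert.h>
#include <stdint.h>

static uintptr_t outer_frame;  /* frame address of the function under test */
static uintptr_t inner_frame;  /* frame address of a function it calls */

/* noinline keeps each function in its own stack frame. */
__attribute__((noinline)) static void record_inner_frame(void)
{
    inner_frame = (uintptr_t)__builtin_frame_address(0);
}

__attribute__((noinline)) static void my_func(void)
{
    volatile char scratch[64];  /* stand-in for the function's real locals */

    scratch[0] = 0;
    outer_frame = (uintptr_t)__builtin_frame_address(0);
    record_inner_frame();
    scratch[1] = 1;  /* work after the call, so it is not a tail call */
}

/* Returns the distance in bytes between the two frame addresses. */
unsigned long my_func_frame_bytes(void)
{
    my_func();
    assert(outer_frame != 0 && inner_frame != 0);
    return (unsigned long)(outer_frame > inner_frame
                               ? outer_frame - inner_frame
                               : inner_frame - outer_frame);
}
```

Treat the result as an estimate to compare between versions of the same function built with the same flags, not as an exact frame size.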
Jim Fell
  • Can you also define `pFnBottom` and `pFnTop` **inside** `myFunc`? – étale-cohomology Jul 20 '21 at 21:53
  • @étale-cohomology Possibly, but that could affect your function's stack usage. Even using the `register` keyword doesn't guarantee that your variables will be stored in registers. The most reliable way is to use the implementation shown with global variables. You could declare them as static to limit their scope. – Jim Fell Jul 20 '21 at 22:23