
How are functions stored in memory: in the data segment, like ordinary variables and expressions, or in the code segment? In either case, what is pushed onto the stack: a pointer to the function, or the whole function? And why do we need to push the local variables and arguments too if they are already stored in the data segment?

Peter Cordes
Susano
  • Diagram: [Layout of a C program](https://aticleworld.com/memory-layout-of-c-program/) with comments. – ryyker Feb 17 '20 at 12:58
  • Think about (1) a function that calls itself, directly, or indirectly, (2) a function that is called from multiple threads, (3) a function that is called from both user code and interrupt handlers. How can a function work correctly for those cases? – Martin James Feb 17 '20 at 13:14
  • By storing its old register values (arguments and locals) on the stack. I know that; I just wondered, if those values are already initialized in the code (in memory), why do we need to push them onto the stack in the first place? – Susano Feb 17 '20 at 13:30

3 Answers

2

While the details differ between platforms and executable file formats and calling conventions, it's common for programs to be divided into multiple segments. Here's the general layout of a program for x86 (taken from here):

              +------------------------+
high address  | Command line arguments |   
              | and environment vars   |  
              +------------------------+
              |         stack          |
              | - - - - - - - - - - -  |
              |           |            |
              |           V            |
              |                        |
              |           ^            |
              |           |            |
              | - - - - - - - - - - -  |
              |          heap          |
              +------------------------+
              |    global and read-    |
              |       only data        |
              +------------------------+
              |     program text       |
 low address  |    (machine code)      |
              +------------------------+   

The function itself is stored in the program text segment, while the data the function operates on is stored in other segments. Note that this is the virtual memory layout, not physical memory.

On most systems, when you call a function a stack frame is allocated from the stack. The stack frame will have space for the function arguments and local variables, along with the address of the previous stack frame and the address of the next instruction to execute after the function returns (again, specific details will differ based on calling conventions):

              +----------------+
high address: | argument N     |
              +----------------+
              | argument N-1   |
              +----------------+
                     ...
              +----------------+
              | argument 1     |
              +----------------+
              | return addr    |
              +----------------+
              | prv frame addr | <---- %ebp
              +----------------+
              | local 1        |
              +----------------+
              | local 2        |
              +----------------+
                     ...
              +----------------+
 low address: | local N        | <---- %esp
              +----------------+

Along with the stack pointer register (%esp on x86, %rsp on x86_64), there is a base pointer register (%ebp on x86, %rbp on x86_64). This register stores the address of the stack frame, and the function refers to locals and arguments via offsets from that address. Quick-n-dirty example:

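#include <stdio.h>

/* Assumed body for illustration: the original listing does not show foo. */
int foo( int a, int b ) { return a + b; }

int main( void )
{
  int x = 1;   /* local variable, stored in main's stack frame */
  int y = 2;   /* local variable, stored in main's stack frame */

  printf( "foo(1,2) = %d\n", foo( x, y ) );
  return 0;
}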

Here's a snippet of the compiled code where we assign x and y (listing obtained with objdump -d on the executable):

 55d:   c7 45 f0 01 00 00 00    movl   $0x1,-0x10(%ebp)
 564:   c7 45 f4 02 00 00 00    movl   $0x2,-0xc(%ebp)

In this code, we're writing the value 1 to the location 16 bytes "below" the address stored in %ebp, and the value 2 to the location 12 bytes "below".

John Bode
  • What a great explanation, really. But is this code ```55d: c7 45 f0 01 00 00 00 movl $0x1,-0x10(%ebp) 564: c7 45 f4 02 00 00 00 movl $0x2,-0xc(%ebp)``` the alternative to the push instruction? – Susano Feb 21 '20 at 16:04
  • @MahmoudSalah: No, that's just assigning values to those data locations - nothing's being pushed at that point. The stack frame was already created before then. I'll update that later tonight showing an example of how the stack frame is actually built. – John Bode Feb 21 '20 at 17:58
  • @JohnBode Can you describe what the arguments in the calling stack frame represent, please? How come we don't see that in the frame of the callee? – olu Dec 16 '22 at 00:02
1

For example, in C/C++, whenever a function is called, memory for that call is allocated on the stack (a stack frame). The function's local variables and its arguments live only in that stack frame and are not accessible outside it. If you pass an argument by address/reference instead of by value, the function receives a pointer through which it can access a variable that lives outside its own frame.

Hope it answers your question. If not, please provide more details.

rootkonda
  • That was a bit helpful, but I still don't get what exactly is pushed onto the stack: the function itself, or an address (pointer) to where it actually is? – Susano Feb 17 '20 at 13:10
  • Not the function itself. The stack is mainly used to scope the execution of a function, i.e. to make sure the variables, arguments, etc. are confined to a given stack frame. For example, when you make recursive calls, the code is not copied that many times; that would be very wasteful. For every recursive call, the system creates a new stack frame with the context specific to that call. Recursion is an excellent example of how the stack is used. – rootkonda Feb 17 '20 at 13:48
1

Functions are stored in memory. Often a separate segment is used to store the executable code. The memory manager can protect this memory against alteration. With virtual memory, any code not used can be swapped out and will be restored when the code (function) must execute.

Global data is stored in global memory.

Local data is allocated on the stack. These data are called "automatic variables".

When a function is called, the return address is stored on the stack, preceded by the parameters of the function.

So, no, the function itself (the executable code) is not stored on the stack. Only its parameters, local data and the return address.

Note: this simplified description is based on Intel. Other architectures can have different concepts for "stack".

Paul Ogilvie
  • So local variables don't actually exist in RAM until the moment of execution, and what is pushed to the stack is just the address of where the function is stored in memory. Am I getting it right? – Susano Feb 17 '20 at 13:07
  • But the address of the function _to be called_ is not placed on the stack. That address is contained in the `call` assembler instruction, e.g. `call 123456`. But the address where the calling function must be resumed is placed on the stack, the _return address_. – Paul Ogilvie Feb 17 '20 at 13:17