The tutorial "64-bit Linux stack smashing tutorial: Part 1" uses the "Get environment variable address" gist to find the address of an environment variable. As a prerequisite, ASLR must first be disabled with echo 0 > /proc/sys/kernel/randomize_va_space.

The content of the gist is:

/*
 * I'm not the author of this code, and I'm not sure who is.
 * There are several variants floating around on the Internet, 
 * but this is the one I use. 
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    char *ptr;

    if(argc < 3) {
        printf("Usage: %s <environment variable> <target program name>\n", argv[0]);
        exit(0);
    }
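    ptr = getenv(argv[1]); /* get env var location */
    if(ptr == NULL) { /* getenv returns NULL when the variable is unset */
        printf("%s is not set\n", argv[1]);
        exit(1);
    }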
    ptr += (strlen(argv[0]) - strlen(argv[2]))*2; /* adjust for program name */
    printf("%s will be at %p\n", argv[1], ptr);
}

Why is *2 used to adjust for the program name?

My guess is that the program name is saved twice above the stack.

The following diagram from https://lwn.net/Articles/631631/ gives more details:

------------------------------------------------------------- 0x7fff6c845000
 0x7fff6c844ff8: 0x0000000000000000
        _  4fec: './stackdump\0'                      <------+
  env  /   4fe2: 'ENVVAR2=2\0'                               |    <----+
       \_  4fd8: 'ENVVAR1=1\0'                               |   <---+ |
       /   4fd4: 'two\0'                                     |       | |     <----+
 args |    4fd0: 'one\0'                                     |       | |    <---+ |
       \_  4fcb: 'zero\0'                                    |       | |   <--+ | |
           3020: random gap padded to 16B boundary           |       | |      | | |

In this diagram, the program is executed as ./stackdump. So I can see that the program name ./stackdump is saved once above the environment strings. And if ./stackdump is launched from a Bash shell, Bash also saves it in the environment strings under the key _:

_

(An underscore.) At shell startup, set to the absolute pathname used to invoke the shell or shell script being executed as passed in the environment or argument list. Subsequently, expands to the last argument to the previous command, after expansion. Also set to the full pathname used to invoke each command executed and placed in the environment exported to that command. When checking mail, this parameter holds the name of the mail file.

Environment strings are above the stack. So the program name is saved another time above the stack.

txk2048
Jingguo Yao
  • What exactly are you asking? The code works because getenv gets the address of an environment variable, and the call to your program takes up space on the stack as well, so you adjust the pointer accordingly. It is in the comments of the code. – Jacob H Nov 08 '16 at 14:19
  • To my knowledge, there is usually about 2 bytes per character in the program name allocated on the stack. The first place I saw this piece of code was in *Hacking: The Art of Exploitation* by Jon Erickson. I suggest reading more there, or researching the linux kernel to understand how the stack looks in memory. – Jacob H Nov 08 '16 at 14:34
  • @JacobH yes, the code originates from Page 147 and 148 of *Hacking: The Art of Exploitation, 2nd Edition* by Jon Erickson. But the book does not explain why it works. – Jingguo Yao Nov 08 '16 at 15:08
  • It's basically because the program name is stored twice, once at the very top of the stack, and again as argv[0]. (Of course, argv[0] might not be the program name, depending on how the program was invoked, which is why the program name needs to be on the stack separately.) See, for example, https://lwn.net/Articles/631631/ – rici Nov 08 '16 at 16:02

2 Answers

In case anyone is still wondering why: the program name is also stored in an environment variable named "_", besides being pushed on the stack before all the environment variables.

You can check this by attaching gdb to the process and examining the stack contents starting at the last environment variable. Suppose 0x7fffffffabcd is the address of the last environment variable:

$ gdb -p <pid>

(gdb) x/20s 0x7fffffffabcd

The program name stored in argv[0] doesn't affect the environment variables' addresses because it is placed on top of the last environment variable on the stack.

neiht

Save the following code as stackdump.c:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/auxv.h>

int main(int argc, char *argv[]) {
  char *ptr;
  int i;

  for (i = 0; i < argc; i++) {
    printf("  argv[%d]: %p, %p, %s\n", i, argv + i, argv[i], argv[i]);
  }

  char * program = (char *)getauxval(AT_EXECFN);
  printf("AT_EXECFN:               , %p, %s\n", program, program);
  char* path = getenv("PATH");
  printf("     PATH:               , %p, %s\n", path, path);
  char* underscore = getenv("_");
  printf("        _:               , %p, %s\n", underscore, underscore);
}

First, run gcc -o stackdump stackdump.c to compile the code. Second, disable ASLR with echo 0 > /proc/sys/kernel/randomize_va_space. Third, run ./stackdump zero one two, which prints:

  argv[0]: 0x7fffffffe4a8, 0x7fffffffe6e5, ./stackdump
  argv[1]: 0x7fffffffe4b0, 0x7fffffffe6f1, zero
  argv[2]: 0x7fffffffe4b8, 0x7fffffffe6f6, one
  argv[3]: 0x7fffffffe4c0, 0x7fffffffe6fa, two
AT_EXECFN:               , 0x7fffffffefec, ./stackdump
     PATH:               , 0x7fffffffee89, /usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/cloud-user/.local/bin:/home/cloud-user/bin
        _:               , 0x7fffffffefe0, ./stackdump

Three copies of ./stackdump are in the program's address space as shown above. Two of them have a higher address than PATH's as shown below:

AT_EXECFN: 0x7fffffffefec, ./stackdump
        _: 0x7fffffffefe0, ./stackdump
     PATH: 0x7fffffffee89, /usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/cloud-user/.local/bin:/home/cloud-user/bin

So the reason for *2 is the _ environment variable and the AT_EXECFN auxiliary vector value, each of which holds a copy of the program name.

Jingguo Yao