As the title suggests, I would like to know whether there is a way to map the data segment of my executable to a second memory region, so that any change made through the second region is immediately visible in the first. My first thought was mmap, but mmap requires a file descriptor, and I do not know of a way to open a file descriptor on my running process's memory. I also tried shmget/shmat to create a shared memory object over the process's data segment (&__data_start), but that failed as well (possibly through a mistake of mine, as I am unfamiliar with the shm API). I found a similar question, "Linux mapping virtual memory range to existing virtual memory range?", but the replies there were not helpful. Any thoughts are welcome.

Thank you in advance.

Some pseudocode would look like this:

extern char __data_start, _end;

char test = 'A';

int main(int argc, char *argv[]){
  size_t size = &_end - &__data_start;
  char *mirror = malloc(size);
  magic_map(&__data_start, mirror, size); // this is the part I need
  printf("%c\n", test); // prints A

  ptrdiff_t offset = &test - &__data_start;
  *(mirror + offset) = 'B';
  printf("%c\n", test); // prints B
  free(mirror);
  return 0;
}


Gecal
  • Pretty sure you cannot do this in Linux (assuming you are referring to Linux). – Marco Bonelli Oct 11 '20 at 19:55
  • I am indeed referring to linux. Thank you for your answer. – Gecal Oct 11 '20 at 20:02
  • I thought of trying to mmap `/proc/self/mem` but it seems that this is not supported. – Nate Eldredge Oct 11 '20 at 20:11
  • Hi Nate, I tried that as well, but it didn't work. I could open the mem "file" but I couldn't mmap it. That would have been a great hack-ish way of solving my issue. – Gecal Oct 11 '20 at 20:36
  • Overall, this seems like an [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem); whatever it is you're actually trying to accomplish by wanting to do this, there's almost certainly a better way to do it. – Nate Eldredge Oct 12 '20 at 14:30
  • In any kind of "mirroring" situation, I'd prepare for lots of subtle bugs if you ever try to use the original and mirrored regions together in C code. Compilers often need to know at runtime whether two pointers alias, and they test this by seeing whether the virtual address regions overlap. Mirroring will break this. As a simple example, even though `memmove` is supposed to correctly handle overlapping src and dest regions, it will fail (half the time) if one is the original and the other is the mirror. – Nate Eldredge Oct 12 '20 at 15:12
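The `/proc/self/mem` idea from the comments above can be checked with a few lines. This is only a sketch, assuming Linux with `/proc` mounted; the helper name `try_map_proc_self_mem` is made up for illustration. It opens the file, attempts a shared mapping of one page, and reports the errno of whichever call fails.

```c
#define _GNU_SOURCE 1   /* must come before any #include */
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: try to mmap one page of /proc/self/mem.
 * Returns 0 if the mapping unexpectedly succeeds, otherwise the errno
 * of the call that failed (open or mmap). */
int try_map_proc_self_mem(void)
{
    int fd = open("/proc/self/mem", O_RDWR);
    if (fd == -1)
        return errno;

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int err = (p == MAP_FAILED) ? errno : 0;

    if (p != MAP_FAILED)
        munmap(p, page);
    close(fd);
    return err;
}
```

Calling it and passing a non-zero result to `strerror` shows why the mapping was refused, which matches the experience described in the comments: the file can be opened, but not mapped.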

1 Answer

It appears I managed to solve this. To be honest, I don't know whether it will cause problems in the future or what side effects it might have, but here it is. (If any issues arise I will try to log them here for future reference.)

Solution:

Basically, I used the mmap flags MAP_ANONYMOUS, MAP_FIXED and MAP_SHARED:

  • MAP_ANONYMOUS: the mapping is not backed by a file, so no file descriptor is required (hence the -1 in the call).
  • MAP_FIXED: the addr argument is no longer a hint; the mapping is placed exactly at the address you specify, replacing whatever was mapped there.
  • MAP_SHARED: the mapping is shared, so that any changes are visible to the original mapping.
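To see what this flag combination does to memory that is already in use, here is a minimal sketch that tries it on a scratch page instead of the data segment. It assumes Linux; the helper name `first_byte_after_fixed_remap` is hypothetical. The page is filled with 'A', then remapped in place with the same three flags, and the first byte is read back.

```c
#define _GNU_SOURCE 1   /* must come before any #include */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: fill a scratch page with 'A', remap the same
 * address with MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED, and return the
 * first byte of the page afterwards (or -1 if a call fails). */
int first_byte_after_fixed_remap(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    unsigned char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    memset(p, 'A', page);

    /* Same flags as in this answer, applied to the scratch page. */
    unsigned char *q = mmap(p, page, PROT_READ | PROT_WRITE,
                            MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (q == MAP_FAILED) {
        munmap(p, page);
        return -1;
    }

    int first = q[0];   /* q is the same address as p */
    munmap(q, page);
    return first;
}
```

The helper returns 0 rather than 'A': a new anonymous mapping starts out zero-filled, and MAP_FIXED discards what was at that address before. So the old contents of the page are not carried over.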

I have left the munmap call commented out. If munmap executes, it frees the data segment (pointed to by &__data_start), and as a result the global and static variables are corrupted. When the atexit handlers then run after main returns, the program crashes with a segmentation fault (because it tries to double free the data segment).

Code:

#define _GNU_SOURCE 1   /* must come before any #include */
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <unistd.h>
#include <sys/mman.h>

extern char __data_start;
extern char _end;

int test = 10;

int main(void)
{
    size_t size = 4096;

    /* MAP_FIXED replaces whatever is mapped at &__data_start,
     * which has to be page-aligned for the call to succeed. */
    char *shared = mmap(&__data_start, size, PROT_READ | PROT_WRITE,
                        MAP_FIXED | MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (shared == MAP_FAILED) {
        perror("mmap");
        exit(EXIT_FAILURE);
    }
    printf("original: %p, shared: %p\n", (void *)&__data_start, (void *)shared);

    /* Byte offset of test inside the data segment. */
    size_t offset = (size_t)((char *)&test - &__data_start);
    int *alias = (int *)(shared + offset);

    *alias = 50;
    msync(shared, size, MS_SYNC);
    printf("test: %d :: %d\n", test, *alias);
    test = 25;
    printf("test: %d :: %d\n", test, *alias);

    /* munmap(shared, size); */   /* left out on purpose, see above */
    return 0;
}

Output:

original: 0x55c4066eb000, shared: 0x55c4066eb000
test: 50 :: 50
test: 25 :: 25
Gecal
  • This mmap won't preserve the original contents of that page of memory; rather, the new mapping will be filled with zeros. So it will have the effect of `memset(&__data_start, 0, 4096)`. There may very well be important data on that page (perhaps used by the standard library, if not by you) so this seems like a very bad idea. – Nate Eldredge Oct 12 '20 at 14:14
  • Additionally, note that `shared` and `original` are the same. You haven't made a second mapping at all. – Nate Eldredge Oct 12 '20 at 14:15
  • You are right. Wouldn't a simple copy before the mmap to preserve the old contents, followed by another copy back after the mmap, solve this? – Gecal Oct 12 '20 at 14:17
  • You'd have to be very careful that nothing accesses that page asynchronously while its contents are wrong (signals, helper threads, etc). And you can't rule out the possibility that the `mmap` library function would access that page after returning from the system call. It *might* work if you write the system call and copy routine in assembly, making sure to block all signals beforehand. But it still seems pretty fragile. – Nate Eldredge Oct 12 '20 at 14:21
  • Also, just as a note, you'll want to make sure to use `volatile` carefully in your tests and your real code if/when you get there. The compiler has no knowledge that writing to one mapping might cause data in the other one to change, so it might cache that data in registers. You'll need to use `volatile` to tell it to reread from memory every time. This will of course be terrible for optimization and performance - that's the price you'll pay. – Nate Eldredge Oct 12 '20 at 14:27
  • Thank you very much for all the helpful comments and tips. There are lots in there that I hadn't even dreamed of. Maybe this is indeed not the correct way to do what I am trying to do. I will have to carefully consider the pros/cons and all of the points you have made. – Gecal Oct 12 '20 at 15:47
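For comparison with the remarks above that no second mapping was created, here is a minimal sketch of what a real mirror looks like for a fresh buffer. It does not mirror the data segment. It assumes Linux or another POSIX system with a writable /tmp; the helper name `make_mirror` is hypothetical, and `memfd_create` or `shm_open` could supply the file descriptor instead of `mkstemp`. The same file is mapped twice with MAP_SHARED, so both addresses refer to the same pages. The pointers are `volatile`, following the advice in the comments.

```c
#define _GNU_SOURCE 1   /* must come before any #include */
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map one unlinked temporary file twice.
 * On success returns 0 and stores two different addresses that refer
 * to the same memory; on failure returns -1. */
int make_mirror(size_t size, volatile char **a, volatile char **b)
{
    char path[] = "/tmp/mirror-XXXXXX";
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);   /* the file lives on until the mappings are gone */

    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void *q = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);      /* the mappings stay valid after close */

    if (p == MAP_FAILED || q == MAP_FAILED) {
        if (p != MAP_FAILED) munmap(p, size);
        if (q != MAP_FAILED) munmap(q, size);
        return -1;
    }
    *a = p;
    *b = q;
    return 0;
}
```

After a successful call, a store through `a[i]` can be read back through `b[i]` and vice versa, while `a` and `b` themselves differ. Both regions should be released with `munmap` when no longer needed.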