
I'm building an operating systems project that captures the current thread's context, serializes it, sends it over the network to another machine, and resumes the thread there from where it left off (a so-called thread migrator).

I got it working on Linux, but when I run it normally, outside the debugger, I always get a segmentation fault, even though it works under gdb.

It may be caused by address space randomization: when I turn randomization on in gdb, it reports a segmentation fault in setcontext().

But I don't understand why, because the text section should not be randomized (I checked the REG_RIP value in uc_mcontext.gregs, and it is the same on every run). So why would address randomization make it crash? The stack is set directly from the ucontext, so I don't think randomization should be a problem there.
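One way to test the randomization hypothesis is to ask the process itself whether ASLR is switched off for it, since gdb disables randomization for the programs it starts by default. The following is a minimal sketch, assuming Linux with glibc; the helper name `aslr_disabled_for_this_process` is hypothetical and not part of the project.

```c
#include <sys/personality.h>

/* Hypothetical helper: reports whether ASLR is disabled for this process.
 * Returns 1 if the ADDR_NO_RANDOMIZE personality flag is set,
 * 0 if it is not set, and -1 if the query itself failed. */
int aslr_disabled_for_this_process(void)
{
    /* Passing 0xffffffff queries the current persona without changing it. */
    int persona = personality(0xffffffff);
    if (persona == -1)
        return -1;
    return (persona & ADDR_NO_RANDOMIZE) ? 1 : 0;
}
```

Calling this at startup on both machines, inside and outside gdb, should show whether the working and crashing runs differ in this flag. Running the program through `setarch -R` is another way to start it with randomization disabled, which may help reproduce the gdb behaviour without a debugger attached.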

My server code looks like this.

bool_t migrate_1_svc(rpc_ucontext *context, void *res, struct svc_req *req)
{
  printf("Server received.\n");
  ucontext_t cont;

  // initialize the context
  getcontext(&cont);

  cont.uc_flags = context->uc_flags;

  // ucontext.stack_t
  cont.uc_stack.ss_flags = context->uc_stack.ss_flags;
  cont.uc_stack.ss_sp = context->uc_stack.ss_sp.ss_sp_val;
  cont.uc_stack.ss_size = context->uc_stack.ss_sp.ss_sp_len;

  // ucontext.mcontext_t
  memcpy(cont.uc_mcontext.__reserved1, context->uc_mcontext.__reserved1, sizeof(cont.uc_mcontext.__reserved1));
  cont.uc_mcontext.fpregs = (struct _libc_fpstate *)malloc(sizeof(struct _libc_fpstate));
  memcpy(cont.uc_mcontext.fpregs, &context->uc_mcontext.fpregs, sizeof(struct _libc_fpstate));
  memcpy(cont.uc_mcontext.gregs, context->uc_mcontext.gregs, sizeof(gregset_t));

  memcpy(&cont.uc_sigmask, &context->uc_sigmask, sizeof(__sigset_t));

  memcpy(&cont.__fpregs_mem, &context->__fpregs_mem, sizeof(struct _libc_fpstate));

  printf("Setting the context.\n");

  ucontext_t my_context;
  getcontext(&my_context);

  cont.uc_link = &my_context;

  setcontext(&cont);
  return true;
}
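
For comparison, here is a minimal sketch of the case where setcontext() is expected to work: the context is captured and restored inside the same process, so every address stored in it is still valid. The function name `resume_once` is hypothetical. This is only a baseline to test against, not the migration code itself.

```c
#include <ucontext.h>

/* Hypothetical baseline: save a context and jump back to it once,
 * all within the same address space.
 * Returns 1 after a successful resume, -1 on failure. */
int resume_once(void)
{
    ucontext_t saved;
    /* volatile so the updated value is re-read after the jump */
    volatile int resumed = 0;

    if (getcontext(&saved) == -1)
        return -1;

    /* Execution continues here twice: once after getcontext() returns,
     * and once more after setcontext() restores the saved state. */
    if (!resumed) {
        resumed = 1;
        setcontext(&saved);
        return -1; /* reached only if setcontext() failed */
    }
    return 1;
}
```

If this baseline works on the target machine while the migrated context crashes, the difference is likely in what the saved context refers to (stack contents, saved return addresses, library addresses) rather than in setcontext() itself.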
Ryan
  • In gdb, `la pre` and watch where it faults. Compare with the source process. – arsv Nov 16 '17 at 17:38
  • 'thread-migrator'? Are you going to copy the entire stack etc. to the other machine? IIRC, the context has only address/size info? – Martin James Nov 16 '17 at 18:43
  • @MartinJames Yes and since it is pthread-based, I can set the stack to a heap-allocated memory space for the thread using `pthread_attr_setstack()`. And I thought maybe it's the RSP and RBP registers problem and tried to change that as well but it still gives me segfault. Still trying... – Ryan Nov 16 '17 at 18:45
  • You are very 'brave and courageous'. What about file handles, locks, heap allocations etc. held by the thread? – Martin James Nov 16 '17 at 19:20
  • @MartinJames Yeah I'd say so. Actually it's a course project so I just have to implement a really naive and simple one. File handles are not considered and heap allocations are taken care of by a distributed memory system which is also part of the project. – Ryan Nov 16 '17 at 19:22
  • After you call setcontext, what instruction gets the segfault? If ASLR is enabled, the executable's text segment may or may not be loaded in a random location (depending on compiler options like `pie`), but things like the text (and data) segments of libc likely *will* be loaded at random locations. There may be some saved instruction pointer or data segment address on the stack that's only valid for the first system's libc. – Mark Plotnick Nov 16 '17 at 22:15
  • When calling any of the heap allocation functions (malloc, calloc, realloc): 1) always check the returned value (!= NULL) to make sure the operation was successful; 2) the returned type is `void*`, so it can be assigned to any pointer. Casting just clutters the code, making it more difficult to understand, debug, etc. – user3629249 Nov 18 '17 at 18:04
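
The point in the comments about shared libraries being loaded at different addresses can be checked directly. The following is a rough sketch, assuming Linux with glibc, that lists the load base of every object mapped into the process; the names `print_object` and `print_load_bases` are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <link.h>
#include <stdio.h>

/* Callback for dl_iterate_phdr(): prints one loaded object and counts it. */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size; /* unused */

    /* The main executable is reported with an empty name. */
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(no name, e.g. main executable)";
    printf("%-45s base=0x%llx\n", name,
           (unsigned long long)info->dlpi_addr);
    (*count)++;
    return 0; /* returning 0 continues the iteration */
}

/* Hypothetical helper: prints all load bases, returns how many were found. */
int print_load_bases(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

Running this on the source and destination machines, with and without randomization, shows whether libc sits at the same base in both processes. If the bases differ, any libc address saved on the migrated stack would point to the wrong place after the move.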
