
When creating a thread with pthread_create, the virtual address space reported (via top and ps) grows significantly, per the info below:

The stack size for the thread is explicitly set, so that is fine, and I can see it show up in pmap.

But what I cannot explain is the 65404 KB hit. Is this a Linux kernel mapping, or what exactly?

The detachstate attribute is also set for the thread, and even though the thread finished in <1s, the memory mapping is still present in pmap.

Is this simply part of Linux memory management in general, where a region, once mapped, is kept so it can be reused? Can the 65M hit be tuned? This is the single-thread case; when multiple threads are created simultaneously, the reported VSZ ramps up very fast: 10 threads give a 650M swell in the reported process address space.

...shared libs
...shared libs
2adf40000000 (132 KB)  rw-p (00:00 0)        <--- stack size for the thread.
2adf40021000 (65404 KB)  ---p (00:00 0)      <--- what is this? 
7ffcb8bed000 (128 KB)  rwxp (00:00 0)        [stack]
7ffcb8c0d000 (4 KB)    rw-p (00:00 0)        
7ffcb8dc6000 (8 KB)    r--p (00:00 0)        [vvar]
7ffcb8dc8000 (8 KB)    r-xp (00:00 0)        [vdso]
ffffffffff600000 (4 KB)  r-xp (00:00 0)      [vsyscall]
mapped:   116172 KB writable/private: 1140 KB shared: 0 KB

thank you.

EDIT:

So I added a second thread and pmap now shows:

2adf40000000 (132 KB)  rw-p (00:00 0)        
2adf40021000 (65404 KB)  ---p (00:00 0)      
2adf44000000 (132 KB)  rw-p (00:00 0)        
2adf44021000 (65404 KB)  ---p (00:00 0)      
7ffcb8bed000 (128 KB)  rwxp (00:00 0)        [stack]
7ffcb8c0d000 (4 KB)    rw-p (00:00 0)        
7ffcb8dc6000 (8 KB)    r--p (00:00 0)        [vvar]
7ffcb8dc8000 (8 KB)    r-xp (00:00 0)        [vdso]   
ffffffffff600000 (4 KB)  r-xp (00:00 0)      [vsyscall]
mapped:   181840 KB writable/private: 1400 KB shared: 0 KB

So now there are 2 stacks and 2 of the 65M regions, both of which have also pumped up the reported virtual address space.

EDIT: environment: glibc : ldd (Ubuntu EGLIBC 2.19-0ubuntu6.6) 2.19 and kernel is 4.4.103

moore
  • My educated guess is that it is padding in order to fit to the page size. I however can’t confirm just yet... – Fabien Bouleau Nov 30 '17 at 14:53
  • Look at the permissions, there's nothing mapped in there, no actual memory is used. It's just reserved address space. If I had to speculate I would guess that this is reserved space to be able to grow the stack in the future. Or just a red zone so that buffer overflows are less likely to accidentally spill into other thread stacks. – Art Nov 30 '17 at 15:01
  • I just added a second thread and put the details in the original post. so the regions are there for both threads now..... – moore Nov 30 '17 at 15:17
  • @Art: It seems odd that they would be *above* the stack rather than below it if that's the case. – R.. GitHub STOP HELPING ICE Nov 30 '17 at 16:55
  • FWIW `strace` may be a useful tool in determining how these maps come to be. You should also specify the exact kernel and glibc versions you're using (and if you're using a hardened distro that may have patched them), since they're almost certainly relevant to what's happening. – R.. GitHub STOP HELPING ICE Nov 30 '17 at 16:56
  • Thanks for the info. glibc: ldd (Ubuntu EGLIBC 2.19-0ubuntu6.6) 2.19, and the kernel is 4.4.103. @R, when you say "hardened distro", can you draw anything from that info? I will play with strace to see. – moore Nov 30 '17 at 21:31
  • As the stack grows downward (toward lower addresses) and this region is located above (at higher addresses than) the stack, it should not be reserved for the stack. I wonder if it is reserved for TLS (thread-local storage). As dynamic libraries may allocate new TLS variables, memory space must be reserved for libraries loaded later. – Zang MingJie Nov 30 '17 at 21:36

1 Answer


Found the answer here: it is a per-thread arena, mainly used to reduce lock contention in malloc.

Threading: During early days of linux, dlmalloc was used as the default memory allocator. But later due to ptmalloc2’s threading support, it became the default memory allocator for linux. Threading support helps in improving memory allocator performance and hence application performance. In dlmalloc when two threads call malloc at the same time ONLY one thread can enter the critical section, since freelist data structure is shared among all the available threads. Hence memory allocation takes time in multi threaded applications, resulting in performance degradation. While in ptmalloc2, when two threads call malloc at the same time memory is allocated immediately since each thread maintains a separate heap segment and hence freelist data structures maintaining those heaps are also separate. This act of maintaining separate heap and freelist data structures for each thread is called per thread arena.

Zang MingJie