Questions tagged [tcmalloc]

TCMalloc is a malloc library developed by Google. It is faster than the glibc 2.3 malloc (ptmalloc2), which takes approximately 300ns to execute a malloc/free pair on a 2.8GHz P4 (for small objects). TCMalloc takes approximately 50ns for the same operation pair. It also reduces lock contention for multi-threaded programs. For small objects, there is virtually zero contention. Another benefit is space-efficient representation of small objects.

Introduction

TCMalloc (Thread-Caching malloc) is a memory allocation library developed by Google. It is part of the gperftools (Google Performance Tools) project. Other tools in the same project include a heap checker (for detecting memory leaks), a heap profiler (for memory usage statistics) and a CPU profiler (for CPU usage statistics).

Official Introduction by Sanjay Ghemawat

TCMalloc is faster than the glibc 2.3 malloc (available as a separate library called ptmalloc2) and other mallocs that I have tested. ptmalloc2 takes approximately 300 nanoseconds to execute a malloc/free pair on a 2.8 GHz P4 (for small objects). The TCMalloc implementation takes approximately 50 nanoseconds for the same operation pair. Speed is important for a malloc implementation because if malloc is not fast enough, application writers are inclined to write their own custom free lists on top of malloc. This can lead to extra complexity, and more memory usage unless the application writer is very careful to appropriately size the free lists and scavenge idle objects out of the free list.

TCMalloc also reduces lock contention for multi-threaded programs. For small objects, there is virtually zero contention. For large objects, TCMalloc tries to use fine-grained and efficient spinlocks. ptmalloc2 also reduces lock contention by using per-thread arenas, but there is a big problem with ptmalloc2's use of per-thread arenas. In ptmalloc2, memory can never move from one arena to another. This can lead to huge amounts of wasted space. For example, in one Google application, the first phase would allocate approximately 300MB of memory for its URL canonicalization data structures. When the first phase finished, a second phase would be started in the same address space. If this second phase was assigned a different arena than the one used by the first phase, this phase would not reuse any of the memory left after the first phase and would add another 300MB to the address space. Similar memory blowup problems were also noticed in other applications.

Another benefit of TCMalloc is space-efficient representation of small objects. For example, N 8-byte objects can be allocated while using space approximately 8N * 1.01 bytes. I.e., a one-percent space overhead. ptmalloc2 uses a four-byte header for each object and (I think) rounds up the size to a multiple of 8 bytes and ends up using 16N bytes.

98 questions
0 votes, 1 answer

nix-info can't find libstdc++.so.6

I used nix for some stuff earlier and it worked. But now I keep getting this: $ nix-shell -p nix-info --run "nix-info" bash: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory If I…
unhammer
0 votes, 1 answer

C++ importing libraries instead of linking?

I'm new to C++. When I write a program I expect it to compile into a standalone executable, but with C++ there's a lot of talk about dynamic and static linking. From what I gather this means the separate libraries used are compiled separately and…
Alasdair
0 votes, 1 answer

How do jemalloc and tcmalloc track threads?

Now I am actively studying the code of memory managers jemalloc and tcmalloc. But I can't understand how these two managers track threads. If I understand correctly, a new thread can be detected during memory allocation, after which a new thread…
0 votes, 1 answer

TCMALLOC memory leak

On Windows, when I statically link tcmalloc with my code, I see continuous memory growth, but there is no growth if I do not use tcmalloc. The issue is not present on Linux. I have tried the flags below: TCMALLOC_RELEASE_RATE =…
0 votes, 1 answer

what is the difference when linking against tcmalloc or not

This is a linkage question rather than a uwsgi question, but I will explain the story. I am using uwsgi to host my flask app. After running for some weeks in production, I found that my app has a slight memory leak; its RSS size reported by uwsgitop…
Alex
0 votes, 1 answer

TCMalloc - get size of allocation for a pointer

Using TCMalloc: given a heap-allocated object, is there any way to get the allocated size of the object (meaning only the size passed in the malloc call)? I'm asking for a "reliable" method (i.e., not going a word size back assuming the allocation size is…
Daniel Heilper
0 votes, 1 answer

How close does tcmalloc come to pure stack allocation performance?

I was reasoning that if tcmalloc maintains a per-thread free list underneath, from which dynamic allocations are satisfied, then the performance of tcmalloc in the average case should be very close to stack allocation (the cost of…
Nathan Doromal
0 votes, 1 answer

How to use TCMalloc on Google Cloud ML Engine

How to use TCMalloc on Google Cloud ML Engine? Or, apart from TCMalloc, is there any other way to solve memory leak issues on ML Engine? Finalizing the graph doesn't seem to help. Memory utilization graph: I got an out of memory error after training 73…
Fei
0 votes, 2 answers

Safest Way to Link Google's TCMalloc lib

After some days of testing, I figured out that the runtime patching mechanism in patch_functions.cc is not safe to use in a production environment. It seems to work well in a VS2010 project except for HeapAlloc() and HeapFree() but cannot be used in a…
fitzbutz
0 votes, 0 answers

MongoDB FATAL ERROR: Out of memory trying to allocate internal tcmalloc data

I am running mongodb 3.2 on a 64-bit Ubuntu 14.04 server. The mongodb server keeps crashing. Whenever I restart the server I see this: stop: Unknown instance: mongod start/running, process 25687 Also, on running the mongo shell after this I get the…
Manish Gupta
0 votes, 1 answer

undefined reference to tcmalloc public API

I've cloned google-perf git tree. > ./autogen.sh > ./configure --enable-frame-pointers --prefix=/usr/ > make > sudo make install All steps above were successful. I can see the header files in /usr/include/gperftools/tcmalloc.h etc My program …
eswaat
0 votes, 2 answers

What's the difference between malloc and tc_malloc?

for a code main.c: #include #include int main() { void* p = malloc(1000); free(p); return(0); } 1st compile: gcc main.c -o a.out 2nd compile: gcc main.c -ltcmalloc -o a.out 1st use glibc stdlib,2nd use…
linrongbin
0 votes, 1 answer

tcmalloc huge performance variance

Our multi-threaded server has hundreds of connection threads that are responsible for I/O handling and replying to the incoming requests. There is another asynchronous thread that runs relatively heavy tasks with many allocations from time to time (say…
Roman
0 votes, 1 answer

how to get tcmalloc static of all class

I am using the tcmalloc library for my application and I want to get information for every class, such as how many objects of that class exist, their total size, etc. There is one function, DumpStats, that gives us all information (class information, page heap information, total…
eswaat
0 votes, 1 answer

Why does tcmalloc fail when I compile and run this program with a shared library?

Code similar to the code here: Why tcmalloc don't print function name, which provided via dlopen makefile: all: g++ -fPIC -g -c shared.cpp -ltcmalloc g++ -shared -o shared_libs/libshared.so -g shared.o -ltcmalloc g++ -L shared_libs/ -g main.cpp -ldl…