Questions tagged [tcmalloc]

TCMalloc is a malloc library developed by Google. It is faster than the glibc 2.3 malloc (ptmalloc2), which takes approximately 300ns to execute a malloc/free pair on a 2.8GHz P4 (for small objects). TCMalloc takes approximately 50ns for the same operation pair. It also reduces lock contention for multi-threaded programs. For small objects, there is virtually zero contention. Another benefit is space-efficient representation of small objects.

Introduction

TCMalloc (Thread-Caching malloc) is a memory allocation library developed by Google. It is part of the gperftools (Google Performance Tools) project. Other tools in the same project include a heap checker (for detecting memory leaks), a heap profiler (for memory usage statistics) and a CPU profiler (for CPU usage statistics).

Official Introduction by Sanjay Ghemawat

TCMalloc is faster than the glibc 2.3 malloc (available as a separate library called ptmalloc2) and other mallocs that I have tested. ptmalloc2 takes approximately 300 nanoseconds to execute a malloc/free pair on a 2.8 GHz P4 (for small objects). The TCMalloc implementation takes approximately 50 nanoseconds for the same operation pair. Speed is important for a malloc implementation because if malloc is not fast enough, application writers are inclined to write their own custom free lists on top of malloc. This can lead to extra complexity, and more memory usage unless the application writer is very careful to appropriately size the free lists and scavenge idle objects out of the free list.
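The malloc/free pair timing quoted above can be approximated with a small micro-benchmark. The following is only a rough sketch: the helper name ns_per_malloc_free_pair is hypothetical (it is not part of TCMalloc or gperftools), and the result depends on the machine, the compiler flags, and whichever allocator the binary is linked against, so it will not reproduce the 300 ns and 50 ns figures.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: average wall-clock nanoseconds per malloc/free pair
// for blocks of `size` bytes, using whichever malloc the binary is linked with.
double ns_per_malloc_free_pair(std::size_t size, std::size_t iterations) {
    assert(iterations > 0);
    // Storing the result in a volatile pointer discourages the compiler
    // from optimizing the allocation away.
    void* volatile sink = nullptr;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        sink = std::malloc(size);
        std::free(sink);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Calling ns_per_malloc_free_pair(32, 1000000) once with the default allocator and once with TCMalloc linked in or preloaded gives a rough comparison for small objects. A single-threaded loop like this says nothing about lock contention.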

TCMalloc also reduces lock contention for multi-threaded programs. For small objects, there is virtually zero contention. For large objects, TCMalloc tries to use fine grained and efficient spinlocks. ptmalloc2 also reduces lock contention by using per-thread arenas but there is a big problem with ptmalloc2's use of per-thread arenas. In ptmalloc2 memory can never move from one arena to another. This can lead to huge amounts of wasted space. For example, in one Google application, the first phase would allocate approximately 300MB of memory for its URL canonicalization data structures. When the first phase finished, a second phase would be started in the same address space. If this second phase was assigned a different arena than the one used by the first phase, this phase would not reuse any of the memory left after the first phase and would add another 300MB to the address space. Similar memory blowup problems were also noticed in other applications.
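The arena blowup described above can be illustrated with a toy model. This sketch does not reflect how ptmalloc2 or TCMalloc are actually implemented: ToyArena and both footprint functions are hypothetical names, and the only assumption modeled is the one from the text, namely that memory freed in one arena cannot be reused by another.

```cpp
#include <cassert>
#include <cstddef>

// Toy arena (hypothetical): tracks reserved address space and bytes in use, in MB.
struct ToyArena {
    std::size_t reserved_mb = 0;
    std::size_t in_use_mb = 0;

    void allocate(std::size_t mb) {
        in_use_mb += mb;
        // Reserve more address space only when the arena's own free space is insufficient.
        if (in_use_mb > reserved_mb) reserved_mb = in_use_mb;
    }

    void release(std::size_t mb) {
        assert(mb <= in_use_mb);
        in_use_mb -= mb;  // freed space stays reserved inside this arena
    }
};

// Phase 1 and phase 2 land in different arenas: phase 2 cannot reuse phase 1's memory.
std::size_t footprint_separate_arenas_mb(std::size_t phase_mb) {
    ToyArena first, second;
    first.allocate(phase_mb);
    first.release(phase_mb);
    second.allocate(phase_mb);
    return first.reserved_mb + second.reserved_mb;
}

// Both phases draw from one shared pool: phase 2 reuses what phase 1 freed.
std::size_t footprint_shared_pool_mb(std::size_t phase_mb) {
    ToyArena shared;
    shared.allocate(phase_mb);
    shared.release(phase_mb);
    shared.allocate(phase_mb);
    return shared.reserved_mb;
}
```

With phase_mb = 300, the separate-arena model ends at 600 MB of reserved address space, while the shared model stays at 300 MB, matching the example in the text.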

Another benefit of TCMalloc is space-efficient representation of small objects. For example, N 8-byte objects can be allocated while using space approximately 8N * 1.01 bytes. I.e., a one-percent space overhead. ptmalloc2 uses a four-byte header for each object and (I think) rounds up the size to a multiple of 8 bytes and ends up using 16N bytes.
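The arithmetic behind these figures can be written out as a short sketch. The function names are hypothetical, and the model uses only the numbers given in the text: a four-byte header rounded up to a multiple of 8 bytes for ptmalloc2 (which the original author marks as uncertain), and roughly one percent overhead for TCMalloc. Applying the one-percent figure to sizes other than 8 bytes is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Round `value` up to the next multiple of `multiple` (multiple must be > 0).
std::size_t round_up(std::size_t value, std::size_t multiple) {
    assert(multiple > 0);
    return (value + multiple - 1) / multiple * multiple;
}

// ptmalloc2 estimate as described in the text: a 4-byte header per object,
// then the total rounded up to a multiple of 8 bytes.
std::size_t ptmalloc2_estimate_bytes(std::size_t count, std::size_t object_size) {
    return count * round_up(object_size + 4, 8);
}

// TCMalloc estimate as described in the text: payload plus about 1% overhead.
std::size_t tcmalloc_estimate_bytes(std::size_t count, std::size_t object_size) {
    return count * object_size * 101 / 100;
}
```

For 1000 objects of 8 bytes each, the ptmalloc2 estimate is 1000 * round_up(12, 8) = 16000 bytes, and the TCMalloc estimate is 8000 * 1.01 = 8080 bytes.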

98 questions
4 votes, 1 answer

Relationship between dwPageSize and dwAllocationGranularity

I’m reading Google’s TCMalloc source code (the Windows port). int getpagesize() { static int pagesize = 0; if (pagesize == 0) { SYSTEM_INFO system_info; GetSystemInfo(&system_info); pagesize =…
fitzbutz • 956 • 15 • 33

4 votes, 0 answers

Performance of tcmalloc/jemalloc over windows 7's system malloc

I have replaced Windows 7's system allocator with both tcmalloc and jemalloc, but I see that the system allocator performs better on a multithreaded app. In the case of tcmalloc, the system allocator seems to be about 10% faster in malloc and free operations…
user1447647 • 49 • 1 • 2

3 votes, 1 answer

Using tcmalloc in a shared library

I have many executables that are linked with tcmalloc (.a). I usually do it at the executable level, so that any shared library loaded by the executable benefits from tcmalloc. However, I have a scenario where I need to provide a .so library to an…
Uraza • 556 • 3 • 17

3 votes, 1 answer

What does "TCMalloc currently does not return any memory to the system." mean?

At http://goog-perftools.sourceforge.net/doc/tcmalloc.html it is stated: "TCMalloc currently does not return any memory to the system." I presume this means that if I allocate 42 MB and free it, the system won't get it back, but next time I allocate…
NoSenseEtAl • 28,205 • 28 • 128 • 277

3 votes, 2 answers

TCMalloc Allocator for STL

I want to use TCMalloc with STL containers, so I need an allocator built with TCMalloc (like tbb_allocator with TBB malloc). I cannot find any TCMalloc documentation (if it can be called documentation). So I started to explore the header files…
ali_bahoo • 4,732 • 6 • 41 • 63

3 votes, 2 answers

Install tcmalloc on CentOS

I installed tcmalloc on CentOS using the command: sudo yum install google-perftools, and it completed without errors. But I cannot find any installed perftools libraries in /usr/lib/, so I cannot set the LD_PRELOAD variable. Then, when I tried to compile…
hiimdaosui • 361 • 2 • 12

3 votes, 1 answer

address sanitizer (-fsanitize=address) works with tcmalloc?

I would like to know whether the -fsanitize=address option of gcc works with tcmalloc. Or do we need to disable tcmalloc when running with it? Or is it better to run the sanitizer with tcmalloc enabled?
Nasir • 708 • 2 • 11 • 28

3 votes, 2 answers

Using tcmalloc - How to load the malloc extensions properly?

In file gperftools-2.2.1/src/gperftools/malloc_extension.h, it reads: // Extra extensions exported by some malloc implementations. These // extensions are accessed through a virtual base class so an // application can link against a malloc that…
ptrgreen • 103 • 1 • 7

3 votes, 1 answer

tcmalloc ReleaseFreeMemory() not releasing the memory properly

I'm using tcmalloc in one of my applications, in which the heap grows and shrinks by very large amounts. Unsurprisingly, I ran into the issue where tcmalloc does not release memory back to the OS. Now I have tried using the API to do that using…
sarath • 513 • 6 • 18

3 votes, 2 answers

tcmalloc's fragmentation

Our software implements an actor model system, and we allocate/deallocate small objects very often. I am very sure that each object is destroyed without a memory leak. (I have used valgrind and the tcmalloc tool to check for memory leaks in my software. No…
xiaoningyb • 41 • 1 • 3

3 votes, 0 answers

Should I use tcmalloc/jemalloc replace memory pool?

In my project, I use a memory pool (boost::pool), but it eats too much memory (measured 22G), so I want to use tcmalloc/jemalloc to replace the memory pool. I am considering the aspects below: 1. performance 2. memory use. I think if I use tcmalloc/jemalloc to replace…
superK • 3,932 • 6 • 30 • 54

3 votes, 1 answer

tcmalloc not generating stack traces

I am running a binary linked with tcmalloc and it is not generating a stack trace for leaks it is detecting. The output says: The 1 largest leaks: Leak of 1401231 bytes in 82093 objects allocated from: If the preceding stack traces are not enough…
ATemp • 319 • 3 • 10

2 votes, 0 answers

tcmalloc by default, but overridable

The common usage of tcmalloc vs glibc is "glibc malloc/free is the default; use LD_PRELOAD to use tcmalloc". For an application I'm working on, they want the reverse: tcmalloc by default, but glibc's malloc/free as an option. (Environment is RHEL7, gcc…
Underhill • 408 • 2 • 13

2 votes, 1 answer

Install tcmalloc from source to link without bazel?

I'd like to install tcmalloc from source. I'm on centos8. I'd install from yum but don't see any google-perf or gperf or anything of the sort available. (I did do yum check-update.) The instructions on the tcmalloc github sure are simple. Install…
user2183336 • 706 • 8 • 19

2 votes, 0 answers

tcmalloc: large alloc error using Google Colab

I'm using the following tutorial to build a neural language model on the Google Colab platform: https://machinelearningmastery.com/how-to-develop-a-word-level-neural-language-model-in-keras/. My dataset, which contains 2036456 sequences and a…