I am running into serious memory issues with my kdb process. Here is the architecture in brief.

The process runs in slave mode (4 slaves). It initially loads a large amount of data from a database into memory (the total size of all loaded variables, calculated with -22!, is approximately 11G). At first this matches .Q.w[] and is close to the unix process memory usage. The data set grows only slightly during incremental operations. However, after a long operation, the kdb internal memory stats (.Q.w[]) show the expected usage (both used and heap) of ~13G, while the process consumes close to 25G on the system (unix /proc, top) and eventually runs out of physical memory.

Now, when I run garbage collection manually (.Q.gc[]), it frees up memory and brings unix process usage close to heap number displayed by .Q.w[].

I am running q version 2.7 with the -g 1 option so that garbage collection runs in immediate mode.

Why is the unix process usage so significantly different from the kdb internal statistics, and where does the difference come from? Why is the "-g 1" option not working? When I run a simple example, it works fine. But in this case, it seems to leak a lot of memory.

I tried version 2.6, which is supposed to have automated garbage collection. Surprisingly, there is still a huge difference between the used and heap numbers from .Q.w when running version 2.6, in both single-threaded (each) and multi-threaded (peach) modes. Any ideas?

1 Answer

I am not sure of the definitive answer, but this is my deduction based on some practical experiments and on the following information from the wiki: http://code.kx.com/q/ref/control/#peach It says:

Memory Usage

Each slave thread has its own heap, a minimum of 64MB.

Since kdb 2.7 2011.09.21, .Q.gc[] in the main thread executes gc in the slave threads too.

Automatic garbage collection within each thread (triggered by a wsful, or hitting the artificial heap limit as specified with -w on the command line) is only executed for that particular thread, not across all threads.

Symbols are internalized from a single memory area common to all threads.

My observations:

  1. Thread Specific Memory:

.Q.w[] only shows the stats of the main thread, not the sum over all threads (the total process memory). This can be tested by starting 'q' with 2 threads. The total memory in that case should be at least 128MB as per point 1, but .Q.w[] still shows 64MB.

That's why, in your case, the memory stats at the start were close to the unix stats: all the data was in the main thread and nothing was on the other threads. After some operations, the slave threads may have taken some memory (used or garbage) which is not shown by .Q.w[].
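A minimal sketch of that experiment, assuming a session started with `q -s 2`. The workload on the third line is hypothetical and only serves to make the slave threads allocate temporary vectors; the numbers it produces are not taken from the question.

```q
/ assumes q was started with two slave threads:  q -s 2
\s                                  / number of slave threads in this session
.Q.w[]`used`heap                    / bytes used and heap size as reported by .Q.w[]
/ hypothetical workload: each call builds a temporary vector inside a slave thread
r:{count til x} peach 4#10000000
.Q.w[]`used`heap                    / compare with the resident size shown by top or /proc
```

If the observation above holds, the process size reported by the operating system grows by more than the change in the `heap` value.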

  2. Garbage collector call

As mentioned on the wiki, calling the garbage collector on the main thread runs GC on all threads. That probably collected the garbage memory held by the slave threads and reduced the total memory usage, which is what the lower unix memory stats reflected.
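One way to check this is to record the .Q.w[] figures around a manual collection. This is only a sketch: the variable names are illustrative, and it assumes that .Q.gc[] returns the number of bytes handed back to the operating system, as it does in recent q versions.

```q
before:.Q.w[]`used`heap             / used and heap before collecting
freed:.Q.gc[]                       / assumed to return the bytes given back to the OS
after:.Q.w[]`used`heap              / used and heap after collecting
before-after                        / change in used and heap as seen by .Q.w[]
freed                               / compare with the drop in process size shown by top
```

If `freed` is much larger than the change in `heap`, the released memory most likely came from the slave thread heaps, which .Q.w[] does not appear to report.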

Rahul
  • Are you aware of a method to run garbage collector for slaves in automatic mode? Is there a command line option. if I understand correctly, -g option only will do so for main thread. Alternatively, does -w limit apply to slaves as well? – user5637363 Dec 16 '15 at 02:10
  • '-w' applies to each thread. So each thread will have same memory limit. As per point 3 in Memory Usage description, automatic garbage collection for a specific thread will happen only for wsful case and when memory limit set by -w is reached. – Rahul Dec 16 '15 at 06:16
  • Ok. One issue with this approach is that I want the limit to differ between the main thread and the slaves. I cannot impose the limit on the main thread, only on the slaves. Is there a way to do this? How would you go about fixing this issue? – user5637363 Dec 16 '15 at 13:45
  • No, you can't do that unless you create the slave processes on your own. Some options to look into: a) Design your own load balancing framework. One good example of that is http://code.kx.com/wiki/Cookbook/LoadBalancing . You will have to tweak it a little to suit your requirements. b) Check whether the new multiprocess functionality added in 3.1 provides that feature: http://code.kx.com/wiki/Reference/peach#Peach_using_multiple_processes_.28Distributed_each.29 c) In the current code, include .Q.gc[] calls at the right places in your functions (one option is at function exit) to reduce the memory usage. – Rahul Dec 17 '15 at 12:34
  • I am sharing a lot of data between master and slaves as slaves read global variables, so the multiple process model won't work for me. I am assuming in multi threaded model, slave threads directly access global variables and functions from main thread's heap space. In multi process model, all that data will need to be serialized/deserialized. I am going to experiment with some options and discuss my results here. Thanks – user5637363 Dec 17 '15 at 21:41
  • Shouldn't running .Q.gc[] bring the heap space close to the used space as shown by .Q.w[]? My used is still showing 14G whereas heap is 20G after running .Q.gc[]. Why the difference? – user5637363 Dec 18 '15 at 15:09
  • I tried version 2.6, which is supposed to have automated garbage collection. Surprisingly, there is still a huge difference between the used and heap numbers from .Q.w when running version 2.6, in both single-threaded (each) and multi-threaded (peach) modes. Any ideas? – user5637363 Dec 23 '15 at 21:46