
I run a 3-node GlusterFS 3.10 cluster that uses Heketi to automatically provision and deprovision storage via Kubernetes. Currently there are 20 active volumes, most with the minimum allowed size of 10 GB, but each holding only a few hundred MB of data. Each volume is replicated on two nodes (the equivalent of RAID-1).

However, the gluster processes take up a huge amount of memory (~13 GB) on each node. I created a statedump for each volume, and according to the results the volumes each use only between roughly 2 and 34 MB of memory:

# for i in $(gluster volume list); do gluster volume statedump $i nfs; done
# grep mallinfo_uordblks -hn *.dump.*
11:mallinfo_uordblks=1959056
11:mallinfo_uordblks=20888896
11:mallinfo_uordblks=2793760
11:mallinfo_uordblks=23316944
11:mallinfo_uordblks=1917536
11:mallinfo_uordblks=29287872
11:mallinfo_uordblks=14807280
11:mallinfo_uordblks=2170592
11:mallinfo_uordblks=2077088
11:mallinfo_uordblks=15463760
11:mallinfo_uordblks=2030032
11:mallinfo_uordblks=2079856
11:mallinfo_uordblks=2079920
11:mallinfo_uordblks=2167808
11:mallinfo_uordblks=2396160
11:mallinfo_uordblks=34000240
11:mallinfo_uordblks=2649920
11:mallinfo_uordblks=1683776
11:mallinfo_uordblks=6316944
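
To total these figures instead of adding them up by hand, something like the following could be used. This is only a minimal sketch: `sum_uordblks` is a helper name I made up, and it assumes the dump files are in the current directory and contain `mallinfo_uordblks=<bytes>` lines as shown above.

```shell
# Sum the mallinfo_uordblks values (bytes currently allocated via malloc)
# found in the given statedump files and print the total in MB.
# sum_uordblks is a hypothetical helper name, not part of gluster.
sum_uordblks() {
    grep -h 'mallinfo_uordblks=' "$@" |
        awk -F= '{ total += $2 } END { printf "%.1f MB in %d dumps\n", total / 1048576, NR }'
}

# Usage (assumes the dumps are in the current directory):
#   sum_uordblks *.dump.*
```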

All volumes use the default performance settings. For some reason, cache-size is listed twice, once with 32 MB and once with 128 MB:

# gluster volume get <volumeId> all | grep performance | sort                      
performance.cache-capability-xattrs     true                                                                                            
performance.cache-ima-xattrs            true                                                                                            
performance.cache-invalidation          false                                                                                           
performance.cache-max-file-size         0                                                                                               
performance.cache-min-file-size         0                                                                                               
performance.cache-priority                                                                                                              
performance.cache-refresh-timeout       1                                                                                               
performance.cache-samba-metadata        false                                                                                           
performance.cache-size                  128MB                                                                                           
performance.cache-size                  32MB                                                                                            
performance.cache-swift-metadata        true                                                                                            
performance.client-io-threads           off                                                                                             
performance.enable-least-priority       on                                                                                              
performance.flush-behind                on                                                                                              
performance.force-readdirp              true                                                                                            
performance.high-prio-threads           16                                                                                              
performance.io-cache                    on                                                                                              
performance.io-thread-count             16                                                                                              
performance.lazy-open                   yes                                                                                             
performance.least-prio-threads          1                                                                                               
performance.low-prio-threads            16                                                                                              
performance.md-cache-timeout            1                                                                                               
performance.nfs.flush-behind            on                                                                                              
performance.nfs.io-cache                off                                                                                             
performance.nfs.io-threads              off                                                                                             
performance.nfs.quick-read              off                                                                                             
performance.nfs.read-ahead              off                                                                                             
performance.nfs.stat-prefetch           off                                                                                             
performance.nfs.strict-o-direct         off                                                                                             
performance.nfs.strict-write-ordering   off                                                                                             
performance.nfs.write-behind            on                                                                                              
performance.nfs.write-behind-window-size 1MB
performance.normal-prio-threads         16                                                                                              
performance.open-behind                 on                                                                                              
performance.parallel-readdir            off                                                                                             
performance.quick-read                  on                                                                                              
performance.rda-cache-limit             10MB                                                                                            
performance.rda-high-wmark              128KB                                                                                           
performance.rda-low-wmark               4096                                                                                            
performance.rda-request-size            131072                                                                                          
performance.read-after-open             no                                                                                              
performance.read-ahead                  on                                                                                              
performance.read-ahead-page-count       4                                                                                               
performance.readdir-ahead               on                                                                                              
performance.resync-failed-syncs-after-fsync off
performance.stat-prefetch               on                                                                                              
performance.strict-o-direct             off                                                                                             
performance.strict-write-ordering       off                                                                                             
performance.write-behind                on                                                                                              
performance.write-behind-window-size    1MB                                   

Even when I add up all caches and values, I can only account for about 2.5 GB of memory per node.
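
For comparison, the resident memory the kernel attributes to the gluster processes can be totalled from `ps` output. Again just a rough sketch: `sum_gluster_rss` is a made-up helper name, and it assumes all relevant processes have a command name starting with "gluster" (glusterd, glusterfsd, glusterfs).

```shell
# Total the resident set size (RSS, reported by ps in KiB) of all processes
# whose command name starts with "gluster". Reads "rss comm" pairs on stdin.
# sum_gluster_rss is a hypothetical helper name, not part of gluster.
sum_gluster_rss() {
    awk '$2 ~ /^gluster/ { kib += $1; n++ }
         END { printf "%d processes, %.1f MiB resident\n", n, kib / 1024 }'
}

# Usage:
#   ps -eo rss=,comm= | sum_gluster_rss
```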

Restarting the daemons does not reduce the memory usage, and I have not found any further information on how to reduce it. Having 750 MB of memory per volume simply seems excessive and will lead to serious problems very soon.

Any hints?

Lars
