
The excerpt below from The Definitive Guide gives only high-level details, which leaves me with two questions:

  1. What exactly does virtual memory refer to in this task counter?
  2. How should it be interpreted? How is it related to PHYSICAL_MEMORY_BYTES?

[image: excerpt from The Definitive Guide]

The following is an example extract from one of the jobs. Physical memory is approximately 214 GB and virtual memory is approximately 611 GB.

[image: task counters from the job]

Aravind Yarram

1 Answer


1. What exactly does virtual memory refer to in this task counter?

 Virtual memory is used here to prevent out-of-memory errors in a task when the data does not fit in RAM (physical memory). The portion that did not fit in RAM is held in virtual memory instead.

So, when setting up a Hadoop cluster, it is advisable to set vm.swappiness = 1 for better performance. On Linux systems, vm.swappiness is set to 60 by default. The higher the value, the more aggressively memory pages are swapped.

https://community.hortonworks.com/articles/33522/swappiness-setting-recommendation.html
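
To see what a node is currently using, the setting can be read directly from procfs. The snippet below is a minimal sketch, assuming a Linux host that exposes the value at /proc/sys/vm/swappiness. The helper names `read_swappiness` and `check_swappiness` are made up for illustration, and the target value of 1 is simply the recommendation quoted above.

```python
from pathlib import Path

# Standard procfs location of the setting on Linux (assumed to be present).
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the vm.swappiness value stored in the given file as an int."""
    return int(Path(path).read_text().strip())


def check_swappiness(value, recommended=1):
    """Return a short verdict comparing a value with the recommended one."""
    if value <= recommended:
        return f"vm.swappiness={value}: at or below the recommended {recommended}"
    return f"vm.swappiness={value}: higher than the recommended {recommended}"


if __name__ == "__main__":
    # Only report when the procfs entry exists (i.e. on a Linux host).
    if SWAPPINESS_PATH.exists():
        print(check_swappiness(read_swappiness()))
    else:
        print("vm.swappiness is not exposed on this system")
```

Changing the value is a separate step, typically done with `sysctl` and persisted in /etc/sysctl.conf; this sketch only reads it.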

2. How to interpret it? How is it related to PHYSICAL_MEMORY_BYTES?

Memory pages are swapped from physical memory to virtual memory on disk when there is not enough physical memory.

This is the relation between PHYSICAL_MEMORY_BYTES and VIRTUAL_MEMORY_BYTES.
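
For the figures in the question, the gap between the two counters can be worked out as follows. This is only a sketch of the arithmetic: the counters are reported in bytes, the approximate 214 GB and 611 GB values are taken from the question and treated as multiples of 1024^3 bytes (an assumption), and the function names are hypothetical. Reading the gap as swap usage follows the interpretation given in this answer.

```python
GIB = 1024 ** 3  # assumed unit behind the "GB" figures in the question


def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


def counter_gap(physical_bytes, virtual_bytes):
    """Return VIRTUAL_MEMORY_BYTES minus PHYSICAL_MEMORY_BYTES."""
    return virtual_bytes - physical_bytes


# Approximate counter values from the question, expressed in bytes.
physical_memory_bytes = 214 * GIB
virtual_memory_bytes = 611 * GIB

gap = counter_gap(physical_memory_bytes, virtual_memory_bytes)
print(f"gap: {bytes_to_gib(gap):.0f} GiB")  # gap: 397 GiB
```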

Taha Naqvi
  • How do you think I should interpret the values I have in the question? The VM bytes are way higher than the physical memory. – Aravind Yarram Nov 24 '18 at 15:24
  • VM bytes is the sum of RAM plus swap space used. The difference between the PM bytes and the VM bytes is the swap space: 611 - 214 = 397 GB. – Taha Naqvi Nov 24 '18 at 18:05
  • Dr. Elephant is a good option for analysing MR/Tez/Spark jobs for more clarity. It collects these counters and provides a report: https://github.com/linkedin/dr-elephant/wiki – Taha Naqvi Nov 24 '18 at 18:17