I am trying to tweak the following HAWQ settings at the session level for a query:

SET hawq_rm_stmt_nvseg = 40;
SET hawq_rm_stmt_vseg_memory = '4gb';

HAWQ is running on the YARN resource manager with:

Minimum HAWQ queue used capacity: 5%
hawq_rm_nvseg_perquery_perseg_limit = 6 
hawq_rm_min_resource_perseg = 4

When I run my query, I see only 30 containers being launched. Shouldn't it be 40 containers (1 core per virtual segment)? Please help me understand how memory and cores are allocated to virtual segments.

S. K

1 Answer

hawq_rm_stmt_nvseg is a quota limit. By default it is 0, so setting it to 40 won't increase the number of vsegs; it will only limit it.

hawq_rm_nvseg_perquery_perseg_limit controls how many vsegs can be created per segment node for a query, and you are using the default of 6. So the number of vsegs should be at most 6 * number of nodes. If you see 30, then you probably have 5 nodes.
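The arithmetic above can be written out as a small sketch. The function names here are hypothetical helpers for illustration, not part of HAWQ, and the calculation only reflects the "limit per node times number of nodes" reasoning in this answer; the resource manager may well allocate fewer vsegs than this ceiling.

```python
def vseg_ceiling(num_nodes: int, perseg_limit: int = 6) -> int:
    """Upper bound on vsegs for one query, assuming the bound is simply
    hawq_rm_nvseg_perquery_perseg_limit multiplied by the node count."""
    return num_nodes * perseg_limit


def nodes_implied(observed_vsegs: int, perseg_limit: int = 6) -> float:
    """Node count that would produce the observed vsegs if every node
    were filled up to the per-node limit (an assumption, not a given)."""
    return observed_vsegs / perseg_limit


if __name__ == "__main__":
    print(vseg_ceiling(5))    # ceiling for 5 nodes with the default limit
    print(vseg_ceiling(10))   # ceiling for 10 nodes with the default limit
    print(nodes_implied(30))  # nodes implied by 30 observed containers
```

With the default limit of 6, 30 vsegs corresponds to 5 fully used nodes, while a 10-node cluster would have a ceiling of 60, so seeing 30 there means the query was given fewer vsegs than the maximum.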

If you are using randomly distributed tables, you can increase hawq_rm_nvseg_perquery_perseg_limit to get more vsegs to work on your query.

If you are using hash distributed tables, you can recreate the table with a larger bucketnum value, which will give you more vsegs when you query it.

Jon Roberts
  • The number of nodes (physical HAWQ segments) is 10. So ideally there should be 40 vsegs (4 per physical segment), resulting in 40 YARN containers, but I am seeing 30 YARN containers. Am I missing anything here? – S. K May 05 '17 at 18:59
  • I noticed you have hawq_rm_min_resource_perseg set to 4 so 4*10 = 40. The number of vsegs is dynamic and based on the table being accessed. A small table won't need to use more vsegs. More detail on Yarn integration: http://hdb.docs.pivotal.io/220/hawq/resourcemgmt/YARNIntegration.html – Jon Roberts May 08 '17 at 14:48