I have an Azure Synapse Spark pool with 3 nodes of 4 vCores and 32 GB of memory each. I am trying to submit a Spark job using the Azure Synapse Livy batch API. The request looks like this:
curl --location --request POST 'https://<synapse-workspace>.dev.azuresynapse.net/livyApi/versions/2019-11-01-preview/sparkPools/<pool-name>/batches?detailed=true' `
--header 'cache-control: no-cache' `
--header 'Authorization: Bearer <Token>' `
--header 'Content-Type: application/json' `
--data-raw '{
    "name": "T1",
    "file": "folder/file.py",
    "driverMemory": "1g",
    "driverCores": 1,
    "executorMemory": "1g",
    "executorCores": 1,
    "numExecutors": 3
}'
The response I get is this:
{
    "TraceId": "<some-guid>",
    "Message": "Your Spark job requested 16 vcores. However, the pool has a 12 core limit. Try reducing the numbers of vcores requested or increasing your pool size."
}
The 12-core limit makes sense (3 nodes × 4 vCores = 12), but I cannot figure out why the job is requesting 16 vCores. Shouldn't it request 4 (3 × 1 executor cores + 1 driver core)?
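For reference, here is the arithmetic I expected versus the only pattern I can find that produces 16. The rounding rule in this sketch is my own guess from the numbers, not anything documented in the error:

# What I expected: cores allocated exactly as requested.
driver_cores = 1
executor_cores = 1
num_executors = 3

expected = num_executors * executor_cores + driver_cores  # 3 * 1 + 1 = 4

# Guess: each container (3 executors + 1 driver) is rounded up to the
# full 4-vCore node size, which would yield the 16 in the error message.
node_vcores = 4
guessed = (num_executors + 1) * node_vcores  # (3 + 1) * 4 = 16

print(expected, guessed)  # 4 16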
Update: I tried changing the pool to 3 nodes of 8 vCores and 64 GB of memory each. With this configuration,
{
    "name": "T1",
    "file": "folder/file.py",
    "driverMemory": "1g",
    "driverCores": 1,
    "executorMemory": "1g",
    "executorCores": 1,
    "numExecutors": 6
}
it requests 28 cores (even with executorCores set to 2, 3, or 4). If I change executorCores to 5, 6, 7, or 8, it requests 56 cores.
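These numbers fit a pattern where all 7 containers (6 executors + 1 driver) are counted at 4 vCores when executorCores is 1-4 and at 8 vCores when it is 5-8. The rounding in this sketch is purely my guess from the observed 28/56 values, not from any documentation:

def guessed_vcores(num_executors, executor_cores, sizes=(4, 8)):
    # Guess: executor cores are rounded up to the next size in `sizes`,
    # and the driver is counted at that same rounded size. Derived only
    # from the observed 28 and 56 values.
    rounded = next(s for s in sizes if executor_cores <= s)
    return (num_executors + 1) * rounded

for cores in range(1, 9):
    print(cores, guessed_vcores(6, cores))
# executorCores 1-4 -> 28, executorCores 5-8 -> 56, matching the API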