I am building an OpenStack cluster and am having some issues with what I think may be a quota problem. I can successfully build VMs on every host, but only one VM per host.
I deployed the system using Puppet, and the OpenStack version deployed is Ussuri. The OpenStack Puppet modules used are 17.4, with the exception of puppet-vswitch, which is 13.4.
Each compute host (hypervisor) has 64 cores and 512GB of RAM. Even if I spin up a 2-core VM, I can't spin up any more VMs on that hypervisor, and I get the following errors in the logs:
scheduler.log:
"status": 409, "title": "Conflict", "detail": "There was a conflict when trying to complete your request.\n\n Unable to allocate inventory: Unable to create allocation for 'VCPU' on resource provider
nova-conductor.log:
2021-05-24 15:31:21.770 31421 ERROR nova.conductor.manager [req-18e93e25-5cc2-43b6-a036-312ed064070b 9f72d8a0694146288eb09ac7fee38298 7016985dddfe4048b535ca7ff12a0c68 - default default] Failed to schedule instances: nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available.
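For reference, the "Unable to allocate inventory" conflict comes from the Placement service, which tracks per-host inventory separately from project quotas. The check it applies can be modeled roughly as below. This is a simplified sketch, not Placement's actual code: the field names mirror the inventory fields Placement exposes (total, reserved, min_unit, max_unit, step_size, allocation_ratio), the `Inventory` class and `can_allocate` function are hypothetical helpers, and the numbers are illustrative rather than read from this cluster.

```python
from dataclasses import dataclass


@dataclass
class Inventory:
    """One resource class (e.g. VCPU) on one resource provider."""
    total: int
    max_unit: int                  # largest amount a single allocation may take
    reserved: int = 0
    min_unit: int = 1
    step_size: int = 1
    allocation_ratio: float = 1.0

    @property
    def capacity(self) -> int:
        # Capacity as Placement defines it: (total - reserved) * allocation_ratio
        return int((self.total - self.reserved) * self.allocation_ratio)


def can_allocate(inv: Inventory, used: int, amount: int) -> bool:
    """Return True if `amount` more units fit on top of `used` units."""
    if amount < inv.min_unit or amount > inv.max_unit:
        return False
    if amount % inv.step_size != 0:
        return False
    return used + amount <= inv.capacity


# Hypothetical VCPU inventory for a 64-core host (assumed values).
vcpu = Inventory(total=64, max_unit=64, allocation_ratio=16.0)
print(vcpu.capacity)
print(can_allocate(vcpu, used=2, amount=2))
```

If the osc-placement CLI plugin is installed, the real values for a host should be visible with `openstack resource provider list` followed by `openstack resource provider inventory list <provider-uuid>`, which makes it possible to compare the VCPU row against the model above.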
I have checked and re-checked the quotas for this project, and the number of instances is set to 10000, so I'm not sure what I'm missing:
| fixed-ips | 10000
| floating_ips | None
| health_monitors | None
| injected-file-size | 10240
| injected-files | 5
| injected-path-size | 255
| instances | 10000
| key-pairs | 100
| project_name | admin
| properties | 128
| ram | 99999999
I'm not sure what else I can check, and from the searches I've done, no one else seems to have run into something like this, so I'm hoping it's a simple setting I'm missing.
EDIT 5-26-21: I ran some more tests and I have found an interesting pattern.
If I put a 1-core VM (flavor m1.nano) on a compute host, I can then build as many VMs as I want on that host, of any flavor, until the host physically runs out of resources.
If I create anything larger than a 1-core VM and it lands on a compute host that does not already have a 1-core VM, every subsequent VM build on that host fails, so only that single VM gets placed.
Other than telling me it can't allocate VCPUs when a build fails, the logs aren't helping at all.
Edited to add deployment method and OpenStack version.
Thanks in advance! -Jeff