I was reading about the deployment tools you get out of the box when using MLflow.
It seems you can serve a model from a run by doing something like

mlflow models serve -m runs:/<run_id>/model -h <some host> -p <some port>
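
(I assume the same works for a model in the registry using a models:/ URI, e.g.

mlflow models serve -m models:/<model name>/<version> -h <some host> -p <some port>

since models:/ is the URI scheme for registered models, but I haven't tried it.)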
Then you send some data and get back a prediction by doing
curl http://<some server>:<some port>/invocations \
  -H 'Content-Type: application/json' \
  -d '{"inputs": {<some data>}}'
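
For a concrete (hypothetical) case, say a model that takes rows of three numeric features, I'd expect the call to look something like

curl http://127.0.0.1:5000/invocations \
  -H 'Content-Type: application/json' \
  -d '{"inputs": [[0.1, 0.2, 0.3]]}'

where 127.0.0.1:5000 is just the default host and port.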
This is great, but does it mean a single server instance can only serve one model on one port? Is that a best practice, a technical limitation, or am I misreading it?
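
E.g., to serve two models, would the expected approach be to launch two separate instances on different ports, something like

mlflow models serve -m runs:/<run_id_a>/model -p 5001 &
mlflow models serve -m runs:/<run_id_b>/model -p 5002 &

(the run IDs and ports here are just placeholders), or is there a built-in way to route several models through one endpoint?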